greenelab / pancancer-evaluation

Evaluating genome-wide prediction of driver mutations using pan-cancer data
BSD 3-Clause "New" or "Revised" License

Run LASSO range experiments for Vogelstein gene set #66

Closed jjc2718 closed 1 year ago

jjc2718 commented 1 year ago

This PR expands #65 to all genes in the Vogelstein cancer gene set, and slightly modifies the analyses/visualizations from that PR to handle many more genes.

In general, most genes show either a positive or near-zero correlation between the number of features and generalization performance across cancer types. This suggests that, if anything, models that include more features tend to generalize better. We're still exploring this and thinking about ways to summarize the results.

image
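
For readers who want a concrete picture of the per-gene summary described above, here is a minimal sketch of one way to compute it. It is not the code from this PR: the record layout `(gene, n_features, score)`, the function names, and the example values are all hypothetical, and Pearson correlation is an assumed choice of statistic.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def correlation_by_gene(records):
    """Group (gene, n_features, score) records and correlate per gene."""
    grouped = defaultdict(lambda: ([], []))
    for gene, n_features, score in records:
        grouped[gene][0].append(n_features)
        grouped[gene][1].append(score)
    return {gene: pearson(xs, ys) for gene, (xs, ys) in grouped.items()}


# Made-up illustrative values, NOT results from this PR.
# Each record: (gene, number of nonzero features, held-out performance).
records = [
    ("GENE_A", 10, 0.55), ("GENE_A", 100, 0.60), ("GENE_A", 1000, 0.70),
    ("GENE_B", 10, 0.50), ("GENE_B", 100, 0.50), ("GENE_B", 1000, 0.50),
]
correlations = correlation_by_gene(records)
```

In this toy input, `GENE_A` has a strongly positive correlation and `GENE_B` has a correlation of zero, mirroring the two patterns described above. A rank-based statistic such as Spearman correlation could be swapped in if the relationship is not expected to be linear.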

Code changes:

02_cancer_type_classification/lasso_range_analysis/lasso_range_gene.ipynb doesn't really need to be reviewed; it's mostly the same as the previous script, which was moved.
