Cancer type holdout performance analysis

greenelab / pancancer-evaluation

Evaluating genome-wide prediction of driver mutations using pan-cancer data

BSD 3-Clause "New" or "Revised" License


Cancer type holdout performance analysis #78

Closed by jjc2718 1 year ago

jjc2718 commented 1 year ago

For the holdout cancer type analyses (e.g. #66), we wanted to look at which cancer types tend to perform well/poorly as the holdout set.

In the box plot, positive values mean the classifiers performed better on the training data than on the holdout cancer type, and negative values mean the opposite. The results largely make sense: TGCT and SARC are non-carcinomas, so it's not surprising that generalization to them was poor; THCA has classifiers for only 2 genes, one of which is very undersampled; etc.
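To make the sign convention concrete, here is a minimal sketch of how the per-cancer-type gap could be computed. It is not the code from this PR: the record layout, the gene names `GENE_A`/`GENE_B`, the helper names `performance_gaps` and `rank_by_median_gap`, and all scores are hypothetical, made-up values used only for illustration.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (holdout cancer type, gene, train metric, holdout metric).
# The values are made up for illustration and are not results from this analysis.
results = [
    ("TGCT", "GENE_A", 0.90, 0.55),
    ("TGCT", "GENE_B", 0.85, 0.60),
    ("SARC", "GENE_A", 0.88, 0.58),
    ("SARC", "GENE_B", 0.80, 0.62),
    ("BRCA", "GENE_A", 0.87, 0.84),
    ("BRCA", "GENE_B", 0.82, 0.85),
]


def performance_gaps(records):
    """Group (train - holdout) differences by holdout cancer type.

    A positive difference means the classifier did better on the training
    data than on the held-out cancer type.
    """
    gaps = defaultdict(list)
    for cancer_type, _gene, train_score, holdout_score in records:
        gaps[cancer_type].append(train_score - holdout_score)
    return dict(gaps)


def rank_by_median_gap(gaps):
    """Sort cancer types from largest to smallest median gap."""
    return sorted(gaps, key=lambda ct: median(gaps[ct]), reverse=True)


gaps = performance_gaps(results)
for cancer_type in rank_by_median_gap(gaps):
    print(f"{cancer_type}: median gap = {median(gaps[cancer_type]):+.3f}")
```

Each list in `gaps` holds one value per gene classifier, so the lists could be passed to a box plot routine (one box per cancer type) to get a figure of the kind described above.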
