greenelab / pancancer-evaluation

Evaluating genome-wide prediction of driver mutations using pan-cancer data
BSD 3-Clause "New" or "Revised" License

Drug response regression, stratified by cancer type #58

Closed: jjc2718 closed this 2 years ago

jjc2718 commented 2 years ago

The main goal of this PR is to add regression (predicting continuous values) alongside the existing classification (predicting binary values) for drug response. I've applied it here to cross-validation stratified by cancer type, as a sanity check that everything works and that we generally outperform our baselines, which we do seem to.
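For readers unfamiliar with the setup, here is a minimal sketch of what "stratified by cancer type" means for fold assignment. It is not the code in this PR: the function name `stratified_folds`, the toy cancer type labels, and the fold count are all made up for illustration, and in practice something like scikit-learn's `StratifiedKFold` would likely do this job.

```python
import random
from collections import defaultdict


def stratified_folds(cancer_types, n_folds=4, seed=42):
    """Split sample indices into folds, spreading each cancer type
    as evenly as possible across the folds (hypothetical helper)."""
    rng = random.Random(seed)

    # Group sample indices by their cancer type label.
    by_type = defaultdict(list)
    for idx, cancer_type in enumerate(cancer_types):
        by_type[cancer_type].append(idx)

    # Shuffle within each type, then deal indices out round-robin.
    folds = [[] for _ in range(n_folds)]
    for cancer_type in sorted(by_type):
        indices = by_type[cancer_type]
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % n_folds].append(idx)
    return folds


# Toy labels, invented for this example only.
cancer_types = ["BRCA"] * 8 + ["LUAD"] * 4 + ["SKCM"] * 4
folds = stratified_folds(cancer_types, n_folds=4)

for fold_number, test_idx in enumerate(folds):
    held_out = set(test_idx)
    train_idx = [i for i in range(len(cancer_types)) if i not in held_out]
    # A regressor would be fit on train_idx and scored on test_idx here.
    print(fold_number, sorted(cancer_types[i] for i in test_idx))
```

With these toy labels every test fold holds the same mix of cancer types, so no fold is scored on a cancer type the model never saw during training.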

Here's a plot comparing the true labels with shuffled labels across a few drugs, measuring performance using Spearman correlation between predictions and true labels:

[image: box plots of Spearman correlation per drug, true labels vs. shuffled labels]

So in most cases the blue boxes are considerably higher than the orange boxes, which is good! Next we'll try the same holdout experiments as before (liquid vs. solid, and single cancer type holdouts) using regression as well.
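To make the comparison concrete, here is a rough sketch of a true vs. shuffled label evaluation scored with Spearman correlation. Several things are assumptions rather than what this PR does: that the baseline is built by permuting the training labels, the synthetic one-feature data, and the tiny nearest-neighbor regressor standing in for the real model. Spearman is written out by hand to keep the example dependency-free; `scipy.stats.spearmanr` would normally be used instead.

```python
import random


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = average_rank
        start = end + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def knn_predict(train_x, train_y, query_x, k=5):
    """Stand-in regressor: mean label of the k nearest training points."""
    predictions = []
    for q in query_x:
        nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - q))[:k]
        predictions.append(sum(train_y[i] for i in nearest) / k)
    return predictions


# Synthetic "drug response" data, invented for this example only.
rng = random.Random(0)
features = [rng.random() for _ in range(200)]
responses = [3 * x + rng.gauss(0, 0.1) for x in features]
train_x, test_x = features[:100], features[100:]
train_y, test_y = responses[:100], responses[100:]

# Assumed baseline: same model, but trained on permuted labels.
shuffled_train_y = train_y[:]
rng.shuffle(shuffled_train_y)

true_score = spearman(knn_predict(train_x, train_y, test_x), test_y)
shuffled_score = spearman(knn_predict(train_x, shuffled_train_y, test_x), test_y)
print(f"true labels: {true_score:.3f}, shuffled labels: {shuffled_score:.3f}")
```

Both models are scored against the true held-out responses, so the gap between the two scores reflects how much real signal the model picked up from the features.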

Major code changes:
