greenelab / pancancer-evaluation

Evaluating genome-wide prediction of driver mutations using pan-cancer data
BSD 3-Clause "New" or "Revised" License

Domain adaptation across cancer types #43

Closed (jjc2718 closed 2 years ago)

jjc2718 commented 2 years ago

This PR adds code to test a few simple domain adaptation (DA) methods, CORAL (correlation alignment) and TCA (transfer component analysis), on mutation prediction across cancer types. The idea behind these methods is to apply an unsupervised DA algorithm that aligns the training data to the test data without using test labels, then train our models on the aligned training data and evaluate them on the test data.
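As a rough illustration of the align-then-train workflow described above, here is a minimal CORAL sketch in NumPy. It is not the code from this PR: the names `coral_align` and `_sym_matrix_power` and the `reg` parameter are hypothetical, and the sketch assumes samples are rows and features are columns. It whitens the training features with the (regularized) training covariance, then re-colors them with the (regularized) test covariance.

```python
import numpy as np


def _sym_matrix_power(mat, power):
    """Raise a symmetric positive-definite matrix to a real power.

    Uses the eigendecomposition mat = V diag(w) V^T, so that
    mat**power = V diag(w**power) V^T.
    """
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * eigvals ** power) @ eigvecs.T


def coral_align(X_train, X_test, reg=1.0):
    """Align training features to the test feature distribution (CORAL-style).

    X_train, X_test: arrays of shape (n_samples, n_features).
    reg: multiple of the identity added to each covariance matrix so
         that both are invertible (hypothetical parameter name).
    Only features are used; no labels from either domain are needed.
    """
    n_features = X_train.shape[1]
    identity = np.eye(n_features)
    cov_train = np.cov(X_train, rowvar=False) + reg * identity
    cov_test = np.cov(X_test, rowvar=False) + reg * identity

    whiten = _sym_matrix_power(cov_train, -0.5)   # remove train correlations
    recolor = _sym_matrix_power(cov_test, 0.5)    # impose test correlations
    return X_train @ whiten @ recolor
```

The aligned matrix would then stand in for the original training features when fitting the classifier, while the test features are left unchanged for evaluation.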

I can't remember whether WENDA transforms the data this way (I think it learns a set of feature weights, but I don't remember the details). Either way, you should be able to call your code with the same train/test data that we're using for CORAL and TCA.

Main code changes:

Most other changes are just boilerplate or things copied over from the mpmp repo.