jyaacoub / MutDTA

Improving the precision oncology pipeline by providing binding affinity perturbation predictions on a priori identified cancer driver genes.
https://drive.google.com/drive/folders/1mdiA1gf1IjPZNhk79I2cYUu6pwcH0OTD

H4H Directory structure and where to find things:

H4H:

Everything important (e.g.: model weights) should be accessible through the shared project directory.

I primarily used my home directory to store the code, since I could sync it with GitHub from there, whereas the shared project directory has no internet access for “git pull/push” operations.

Thus, for files/folders that were too large to store in HOME, I used symbolic links to folders located in the shared project directory. However, I forget exactly how I laid everything out, and I no longer have VPN access to connect and check.

Nonetheless, all the important stuff we would need, like model checkpoints, should be stored in the shared project directory.
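As a rough illustration of the symbolic-link layout described above, here is a minimal Python sketch. The directory names (`home/MutDTA`, `project`, `results`) are hypothetical stand-ins, since the actual layout on H4H is not recorded here; temporary directories are used so the example can run anywhere.

```python
import tempfile
from pathlib import Path


def link_large_dir(home_dir: Path, project_dir: Path, name: str) -> Path:
    """Keep `name` in the shared project directory and expose it in HOME via a symlink."""
    target = project_dir / name
    target.mkdir(parents=True, exist_ok=True)  # the real storage lives here
    link = home_dir / name
    # Only create the link if nothing (file, directory, or stale link) is already there.
    if not link.exists() and not link.is_symlink():
        link.symlink_to(target, target_is_directory=True)
    return link


# Demo: temporary directories stand in for HOME and the shared project directory.
# All paths below are hypothetical, not the actual H4H layout.
with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "home" / "MutDTA"
    project = Path(tmp) / "project"
    home.mkdir(parents=True)
    link = link_large_dir(home, project, "results")
    is_link = link.is_symlink()
    resolved_ok = link.resolve() == (project / "results").resolve()
    print(is_link, resolved_ok)
```

With a layout like this, the code in HOME can read and write `results/` as usual while the large files physically sit in the project directory.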

GitHub - https://github.com/jyaacoub/MutDTA/tree/main

Training splits, as well as all my most recent code, can be found on the GitHub page.

GitHub issues

All the issues we encountered with this project are tracked via GitHub. I list some of the more relevant issues below:

Summary of model checkpoints/issues (found in MutDTA/results/):

Basically, the only ones that matter are results/model_checkpoints and v103. The rest are just tests I ran to resolve/debug issues.

OncoKB distribution drift issue with splits - Issue #131

When we originally started looking into OncoKB, I selected highly targeted proteins from OncoKB to be excluded from the training sets.

Stats on the distribution differences between the manually curated OncoKB dataset split and a random split can be found on the issue page.
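To make the idea of a curated split concrete, here is a small sketch of holding out selected proteins so they never appear in training. It is not the repository's actual split code: the function name, the record fields, the choice to send held-out proteins to the test set, and the toy protein list are all assumptions for illustration.

```python
import random


def split_with_holdout(records, held_out_proteins, test_frac=0.2, seed=0):
    """Send every record of a held-out protein to test; split the rest randomly."""
    held_out = set(held_out_proteins)
    forced_test = [r for r in records if r["protein"] in held_out]
    rest = [r for r in records if r["protein"] not in held_out]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(rest)
    n_test = int(len(rest) * test_frac)
    return rest[n_test:], forced_test + rest[:n_test]


# Toy records: protein labels and affinity values are made up for illustration.
records = [
    {"protein": p, "ligand": f"lig{i}", "affinity": 5.0 + i}
    for i, p in enumerate(["EGFR", "BRAF", "ABL1", "EGFR", "KIT", "ABL1", "MET", "ALK"])
]
train, test = split_with_holdout(records, held_out_proteins={"EGFR", "BRAF"})
train_proteins = {r["protein"] for r in train}
print(sorted(train_proteins))
```

Because the held-out proteins are chosen by hand rather than at random, the train and test sets can end up with different distributions, which is the kind of drift the issue describes.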

Missing Amino Acids in PDBs for PDBbind - Issue #102

This means for the pocket versions of our models we can’t readily use existing scripts to get the pocket sequence graph based on the PDBs provided.
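One quick way to spot possibly missing residues is to look for jumps in residue numbering within a chain. The sketch below does this from the fixed-column ATOM records of a PDB file. It is only a heuristic and an assumption on my part, not the check used in the issue: numbering jumps can also be legitimate, and insertion codes and HETATM records are ignored. The helper `ca_line` and the toy chain are hypothetical.

```python
def residue_gaps(pdb_lines):
    """Return {chain: [(last_seen, next_seen), ...]} where residue numbers jump."""
    last = {}
    gaps = {}
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        chain = line[21]          # chain ID is column 22 of the PDB format
        resnum = int(line[22:26])  # residue sequence number is columns 23-26
        prev = last.get(chain)
        if prev is not None and resnum > prev + 1:
            gaps.setdefault(chain, []).append((prev, resnum))
        last[chain] = resnum
    return gaps


def ca_line(serial, res, chain, num):
    """Build a truncated ATOM record (C-alpha only) with PDB column alignment."""
    return f"ATOM  {serial:>5}  CA  {res:>3} {chain}{num:>4}"


# Toy chain A with residues 1, 2, 3, 7, 8: residues 4-6 are absent.
lines = [ca_line(i + 1, "ALA", "A", n) for i, n in enumerate([1, 2, 3, 7, 8])]
gaps = residue_gaps(lines)
print(gaps)
```

For a real file, the lines could come from `open(path)` instead of the toy list.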

Pocket representation version of our models - Issue #103

This tracks how the pocket representation of the Davis and Kiba models was built. Pull request #135 resolves this, with the results in the CSV files.