simonfqy / PADME

This is the repository containing the source code for my Master's thesis research, about predicting drug-target interaction using deep learning.
MIT License

Refactoring the code #10

Open simonfqy opened 5 years ago

simonfqy commented 5 years ago

The current code works, but some of the scripts are too complicated to understand, for example splits/splitters.py, metrics/__init__.py, and ./NCI60_data/preprocess.py, which often contain large chunks of duplicated code. Improvements are needed most in splits/splitters.py: it currently does not allow some parameter combinations and uses assert statements as a way to fail early. I will try to handle these cases more gracefully.
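As a rough illustration of what "more graceful" could mean here, the sketch below replaces assert-based checks with explicit validation that raises a descriptive ValueError. The function name validate_split_fractions and the parameters frac_train, frac_valid, and frac_test are hypothetical and are not taken from splits/splitters.py. One reason to prefer this is that assert statements are skipped entirely when Python runs with the -O flag, whereas a raised exception always fires.

```python
import math


def validate_split_fractions(frac_train, frac_valid, frac_test):
    """Check split fractions and raise a descriptive error on bad input.

    Hypothetical helper: the names are illustrative and do not come from
    the actual splitter code in this repository.
    """
    fractions = {
        "frac_train": frac_train,
        "frac_valid": frac_valid,
        "frac_test": frac_test,
    }
    # Each fraction must be a valid proportion.
    for name, value in fractions.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    # Together the fractions must cover the whole dataset.
    total = sum(fractions.values())
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"split fractions must sum to 1, got {total}")
    return frac_train, frac_valid, frac_test
```

A caller can then catch ValueError and report the offending argument instead of getting a bare AssertionError.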

Cleanups are also needed in some files.

Thresholding the continuous predictions to yield binary outputs is currently hard-coded, which is error-prone. I will refactor it if necessary. Also, range estimation is implemented in DeepChem using Bayesian statistics; I may need to incorporate this into the code as well.
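One possible shape for the thresholding refactor is to pass the cutoff and its direction as explicit arguments instead of embedding them in the code. This is only a sketch under assumptions: binarize_predictions, threshold, and positive_if_greater are hypothetical names, and the values in the usage line are made up for illustration. The direction flag is there because some affinity measures treat larger values as stronger binding while others treat smaller values as stronger.

```python
def binarize_predictions(predictions, threshold, positive_if_greater=True):
    """Convert continuous predictions into 0/1 labels.

    Hypothetical helper, not the repository's actual implementation.
    positive_if_greater=True labels values >= threshold as 1;
    positive_if_greater=False labels values <= threshold as 1.
    """
    if positive_if_greater:
        return [1 if p >= threshold else 0 for p in predictions]
    return [1 if p <= threshold else 0 for p in predictions]


# Illustrative values only: with a cutoff of 7.0, only the second
# prediction is labeled positive.
labels = binarize_predictions([5.2, 7.1, 6.9], threshold=7.0)
```

Keeping the threshold as a parameter means each dataset can supply its own cutoff from configuration rather than relying on a constant buried in the script.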

simonfqy commented 5 years ago

The code is now much more modularized, though some problems remain in a few of the scripts, where I was a bit lazy and simply chose a "quick and dirty" solution. These need to be fixed. Since DeepChem is actively maintained but this repository might not be, I also need to decouple the two repos so that this one is self-contained and functions correctly without any imports from DeepChem.