Training data creation

simonfqy / PADME

This is the repository containing the source code for my Master's thesis research, about predicting drug-target interaction using deep learning.

MIT License


The SMILES string is first converted to an RDKit Mol object and then to an ECFP (in this case a Morgan fingerprint, which is nearly identical to ECFP) in this line: https://github.com/simonfqy/PADME/blob/e01c592cc06c4de04b3ed6db35da5af5ff7f863f/dcCustom/feat/fingerprints.py#L23. As for the binding affinity scores, I obtained them from publicly available datasets. They are then processed in the preprocess.py file in each dataset folder, like here: https://github.com/simonfqy/PADME/blob/e01c592cc06c4de04b3ed6db35da5af5ff7f863f/davis_data/preprocess.py#L35. The log transformation is done in the same file.
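For readers who want to see the two steps in isolation, here is a rough sketch. It is not the code from the repository. The function names, the radius of 2, the 1024-bit size, and the pKd formula (Kd in nM converted to -log10 of molar Kd, a convention often used with the Davis dataset) are all assumptions chosen for illustration; check fingerprints.py and preprocess.py for the actual settings.

```python
import math


def smiles_to_morgan_bits(smiles, radius=2, n_bits=1024):
    """Hypothetical helper: SMILES -> RDKit Mol -> Morgan fingerprint bits.

    radius and n_bits are illustrative defaults, not the values used in PADME.
    RDKit is imported lazily so the rest of this sketch runs without it.
    """
    from rdkit import Chem
    from rdkit.Chem import AllChem

    mol = Chem.MolFromSmiles(smiles)  # returns None if the SMILES cannot be parsed
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)
    # Turn the bit vector into a plain list of 0/1 integers.
    return [int(bit) for bit in fp.ToBitString()]


def kd_nm_to_pkd(kd_nm):
    """Hypothetical helper: log-transform a Kd given in nM.

    Assumes pKd = -log10(Kd in mol/L); the exact transformation used in
    preprocess.py may differ.
    """
    if kd_nm <= 0:
        raise ValueError("Kd must be positive")
    return -math.log10(kd_nm / 1e9)


if __name__ == "__main__":
    # A weak binder at 10,000 nM maps to a pKd of about 5.
    print(kd_nm_to_pkd(10000.0))
    # Requires RDKit; aspirin is used only as an example molecule.
    bits = smiles_to_morgan_bits("CC(=O)OC1=CC=CC=C1C(=O)O")
    print(len(bits), sum(bits))
```

The log transformation compresses affinities that span several orders of magnitude into a narrow range, which is usually easier for a regression model to fit.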
