samsledje / ConPLex

Adapting protein language models and contrastive learning for highly-accurate drug-target interaction prediction.
http://conplex.csail.mit.edu
MIT License

About the bindingdb training set #31

Closed by zh2417 1 year ago

zh2417 commented 1 year ago

Why did you use only a portion of the BindingDB dataset for training? As far as I know, the dataset is large, with approximately 2 million molecules associated with targets. Looking forward to your reply.

zh2417 commented 1 year ago

Hello, I would like to know how you obtained the labels for the BindingDB dataset. [image] What are the values of threshold and y? Thanks.

samsledje commented 1 year ago

The binary data sets were taken from MolTrans for comparison; please consult their report for more detailed information on data set generation.

The affinity data was taken from the Therapeutics Data Commons DTI-DG benchmark.
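
On the threshold and y question: the thread does not state the actual values, so the following is only a minimal sketch of how a continuous affinity is commonly turned into a binary label. The function name `binarize_affinity`, the cutoff of 100.0, and the Kd values are all hypothetical and are not taken from ConPLex or MolTrans; the MolTrans report remains the source for the real cutoff.

```python
from typing import Iterable, List


def binarize_affinity(kd_values_nm: Iterable[float], threshold_nm: float) -> List[int]:
    """Return 1 (binder) where Kd is strictly below the threshold, else 0."""
    return [1 if kd < threshold_nm else 0 for kd in kd_values_nm]


# Made-up Kd values in nM, for illustration only.
example_kd = [5.0, 250.0, 30.0, 12000.0]

# Hypothetical cutoff; replace it with the value given in the MolTrans report.
labels = binarize_affinity(example_kd, threshold_nm=100.0)
print(labels)  # [1, 0, 1, 0]
```

Here `y` would be the raw affinity and the binary label follows from comparing it against the threshold. A lower Kd means tighter binding, which is why the comparison is "less than".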