Thanks for raising the issue! Yes, overall the positives are harder to predict accurately. We are predicting pIC50, which is on a log scale, so the positive values are spread out while the negative ones are concentrated. Predicting raw IC50 is even harder, since for positives it can be an extremely small number. One thing to try is target-specific training using the molecule property model (a rough sketch below), or switching to a binary classification model.
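A minimal sketch of the target-specific route, following the pattern of the DeepPurpose compound property prediction examples. The names `grm5_smiles` and `grm5_ic50_nM` are placeholders for the GRM5 ligands pulled from BindingDB, and the encoder and hyperparameter choices are only starting points, not recommendations:

```python
# Sketch: train a GRM5-specific pIC50 regressor with the molecule property model.
# Assumptions: grm5_smiles is a list of SMILES strings and grm5_ic50_nM the matching
# measured IC50 values in nM, both extracted from BindingDB for this one target.
import numpy as np
from DeepPurpose import utils
from DeepPurpose import CompoundPred as models

y = 9.0 - np.log10(np.array(grm5_ic50_nM))        # convert IC50 (nM) to pIC50

drug_encoding = 'CNN'
train, val, test = utils.data_process(X_drug=grm5_smiles, y=y,
                                      drug_encoding=drug_encoding,
                                      split_method='random',
                                      frac=[0.7, 0.1, 0.2],
                                      random_seed=1)

config = utils.generate_config(drug_encoding=drug_encoding,
                               cls_hidden_dims=[1024, 1024, 512],
                               train_epoch=50, LR=1e-3, batch_size=128)
model = models.model_initialize(**config)
model.train(train, val, test)
```

For the binary-classification alternative, you could threshold the same pIC50 values into active/inactive labels (the cutoff is up to you) and train a classifier on the same splits.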
Hello,
I have been running DTI virtual screening on a protein target using several compounds with the highest binding affinity in BindingDB (my positive reference dataset) and several compounds with the lowest binding affinity in BindingDB (my negative reference dataset). I ran all of the provided pre-trained BindingDB models to see how their predictions compare to the actual IC50 values for both reference sets, and found that the models predict much more accurately for the negative reference than for the positive reference. For example, on the GRM5 target the cnn_cnn_bindingDB model predicted the IC50 within +/-1 for 75% of the negative reference compounds but for only 10% of the positive reference compounds. I have observed the same pattern with several other protein targets and with the other pre-trained BindingDB models.
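For reference, this is roughly the check I am running (a simplified sketch; `pos_smiles`/`pos_pic50` and `neg_smiles`/`neg_pic50` stand in for my reference sets, `grm5_seq` for the GRM5 amino-acid sequence, and the pretrained-model string mirrors the model name above, so it may need adjusting to the exact identifier DeepPurpose expects):

```python
# Sketch of the comparison described above: fraction of compounds whose predicted
# affinity is within +/-1 log unit of the measured value, for each reference set.
import numpy as np
from DeepPurpose import utils
from DeepPurpose import DTI as models

model = models.model_pretrained(model='CNN_CNN_BindingDB_IC50')  # model name assumed, see note above
grm5_seq = '...'                                                 # GRM5 amino-acid sequence (placeholder)

def frac_within_one(smiles_list, true_pic50):
    """Fraction of compounds predicted within +/-1 of the measured pIC50."""
    data = utils.data_process(X_drug=smiles_list,
                              X_target=[grm5_seq] * len(smiles_list),
                              y=list(true_pic50),
                              drug_encoding='CNN', target_encoding='CNN',
                              split_method='no_split')   # if your version lacks this, split and concatenate
    preds = np.array(model.predict(data))
    return np.mean(np.abs(preds - np.array(true_pic50)) <= 1.0)

print('negative reference:', frac_within_one(neg_smiles, neg_pic50))
print('positive reference:', frac_within_one(pos_smiles, pos_pic50))
```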
Do you have any thoughts on how the way the models were trained might be causing this imbalance? Since my test data came from the same dataset the models were trained on, I expected the predictions to be fairly accurate at both ends of the IC50 range. Do you have any recommendations for changes to the parameters, or to other parts of your code, that might help?
Thank you!