Closed (tachyon3903 closed 3 years ago)
Hi, do you want to make a prediction or train a model? Yes, the affinity score basically measures how well the drug fits (binds) the target. You can use any dataset you want.
I want to make a prediction using a trained model. I simply don't have the affinity score. Is it necessary? Thanks a lot for your time.
FYI: Currently, all the data I have is a bunch of SMILES molecules and a single FASTA.
Yeah, the affinity score is like the label. So to train a new model, you need a label for this task. In your use case, I think you can use a pretrained model. Here is an example: https://github.com/kexinhuang12345/DeepPurpose/blob/master/DEMO/case-study-I-Drug-Repurposing-for-3CLPro.ipynb
Note that if you have some affinity score data, it is recommended to train a new model for your specific target of interest, because the pretraining dataset may not contain a target similar to yours, which leads to poor generalizability.
Sorry to keep bothering you. How do I input my own data in https://github.com/kexinhuang12345/DeepPurpose/blob/master/DEMO/case-study-I-Drug-Repurposing-for-3CLPro.ipynb?
Quite frankly, I can't find a way to input my own data. Once again, thank you a lot for your help.
BTW, I think there is a bug in the "dataset.read_file_repurposing_library" function. When I try to pass in my own SMILES.txt, it raises "UnboundLocalError: local variable 'file' referenced before assignment." Thanks.
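For context, an `UnboundLocalError` on a variable such as `file` is what Python raises when an `open()` call fails inside a `try` block whose `except` branch does not re-raise, so a wrong path is one plausible cause here rather than a parsing bug. The following is a minimal standalone sketch of a reader that fails loudly instead. It is not DeepPurpose's implementation: the name `read_smiles_library` is hypothetical, and the file layout (one whitespace-separated `name SMILES` pair per line) is an assumption that should be checked against the data-loading tutorial.

```python
from pathlib import Path


def read_smiles_library(path):
    """Read a text file with one 'name SMILES' pair per line.

    Hypothetical helper, not part of DeepPurpose. The two-column,
    whitespace-separated layout is an assumption.
    """
    path = Path(path)
    if not path.is_file():
        # Fail with a clear message instead of continuing without a file handle.
        raise FileNotFoundError(f"SMILES file not found: {path}")

    names, smiles = [], []
    with path.open() as handle:
        for line_number, line in enumerate(handle, start=1):
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            if len(fields) != 2:
                raise ValueError(
                    f"line {line_number}: expected 'name SMILES', got {line!r}"
                )
            names.append(fields[0])
            smiles.append(fields[1])
    return smiles, names
```

Calling `read_smiles_library("SMILES.txt")` with a missing file then reports the path problem directly, which is usually easier to act on than an unbound-variable error.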
Hi, yeah, here is a tutorial on the data processing: https://github.com/kexinhuang12345/DeepPurpose/blob/master/DEMO/load_data_tutorial.ipynb
This should show you the input and output format of each function.
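For the situation described earlier in the thread (many SMILES strings and a single FASTA file), the preparation step can be sketched without any library. The helpers `read_single_fasta` and `pair_drugs_with_target` are hypothetical names, and the protein sequence in the example is a made-up placeholder, not a real target. Whether a given DeepPurpose function wants one target string or one target per drug is something to confirm in the tutorial linked above.

```python
def read_single_fasta(text):
    """Return the sequence from FASTA text holding exactly one record."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    headers = [line for line in lines if line.startswith(">")]
    if len(headers) != 1 or not lines[0].startswith(">"):
        raise ValueError("expected exactly one FASTA record")
    # Sequence lines may be wrapped, so join everything after the header.
    return "".join(lines[1:])


def pair_drugs_with_target(smiles_list, target_sequence):
    """Repeat one target sequence so each drug has a matching target entry."""
    return list(smiles_list), [target_sequence] * len(smiles_list)


# Made-up FASTA content, used only to show the shapes involved.
fasta_text = ">example_target\nMKTAYIAK\nQRQISFVK\n"
target = read_single_fasta(fasta_text)
drugs, targets = pair_drugs_with_target(["CCO", "CC(=O)O"], target)
print(len(drugs), len(targets), target)
```

No affinity scores appear anywhere in this step, which matches the point made above: labels are only needed for training, not for prediction with an existing model.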
Thanks a lot for your help! I really appreciate it. The code is working now. Is the binding score (in DTI) a probability? Also, what is the binding score in the https://github.com/kexinhuang12345/DeepPurpose/blob/master/DEMO/case-study-I-Drug-Repurposing-for-3CLPro.ipynb code? Since I do not have enough data to test it myself, I was wondering if you would know. Thanks! Your advice is really helpful.
In the binary case, yes, it is a probability. In the regression case, the score is an actual binding affinity value (not a probability), and its unit depends on the training dataset. In the notebook, it is nM. But again, that result comes from a pretrained model, so generalizability is not guaranteed.
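To make the two kinds of score concrete, here is a small standalone sketch. It assumes a regression output expressed in nM and converts it to the common negative-log scale (for example pKd), using the fact that 1 nM = 1e-9 M. The function names and the 0.5 threshold for the binary case are illustrative assumptions, not DeepPurpose defaults.

```python
import math


def nm_to_p(affinity_nm):
    """Convert an affinity in nM to the negative log10 of its molar value."""
    if affinity_nm <= 0:
        raise ValueError("affinity must be positive")
    # 1 nM = 1e-9 M, so -log10(x * 1e-9) = 9 - log10(x).
    return 9.0 - math.log10(affinity_nm)


def p_to_nm(p_value):
    """Inverse of nm_to_p: recover the affinity in nM."""
    return 10.0 ** (9.0 - p_value)


def is_predicted_binder(probability, threshold=0.5):
    """Turn a binary-model probability into a yes/no call.

    The 0.5 threshold is an arbitrary illustrative default.
    """
    return probability >= threshold
```

On the log scale a larger value means tighter binding, whereas in nM a smaller value does, so it is worth checking which unit a result table uses before ranking candidates.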
Thank you! Without your help, I would not be able to run the code!
No problem! Closing for now
Hello. When I am running the code, I have to re-train the model every time. Do you have a pre-trained model so that I can just load it? Also, what exactly is the virtual screening function? Sorry to bother you again.
Hi, you can reuse any model by saving it once and loading it later. Check out cells 15 and 16 in https://github.com/kexinhuang12345/DeepPurpose/blob/master/Tutorial_1_DTI_Prediction.ipynb. Virtual screening means predicting scores for many drug-target pairs.
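The idea of scoring many drug-target pairs can be sketched as follows. This is not the library's virtual screening function: `toy_score` is a placeholder that stands in for a trained model and returns a meaningless number, `screen_pairs` is a hypothetical helper, and the sketch assumes that drug i is paired with target i and that a lower score means stronger binding, as with Kd or IC50 values in nM.

```python
def toy_score(smiles, target_sequence):
    """Placeholder scorer returning a made-up 'affinity' in nM.

    Stands in for a trained model's prediction; the number is meaningless.
    """
    return float(100 + 10 * len(smiles) + len(target_sequence))


def screen_pairs(smiles_list, target_list, score_fn):
    """Score drug i against target i and rank the pairs.

    Assumes a lower score means stronger binding, as for Kd or IC50 in nM.
    """
    if len(smiles_list) != len(target_list):
        raise ValueError("need one target per drug")
    scored = [
        (smiles, target, score_fn(smiles, target))
        for smiles, target in zip(smiles_list, target_list)
    ]
    # Sort ascending so the strongest predicted binders come first.
    return sorted(scored, key=lambda row: row[2])


# Toy inputs: real SMILES strings, but a made-up target sequence.
ranked = screen_pairs(
    ["CC(=O)OC1=CC=CC=C1C(=O)O", "CCO"],
    ["MKT", "MKT"],
    toy_score,
)
for smiles, target, score in ranked:
    print(smiles, target, score)
```

In practice `score_fn` would wrap a call to the loaded model, so the expensive training step happens once and screening only runs predictions.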
Thanks. Which model do you recommend? I am trying to predict the interaction between the COVID-19 RNA polymerase and candidate COVID-19 drugs (e.g. Remdesivir).
I was trying to run the PPI model. What are the requirements? Thanks a lot.
I am currently creating new SMILES vectors using ChemGAN. According to the DTI tutorial, I need to provide an affinity score. What is that? Can I use my own database instead of KIBA or DAVIS?