Open targetprotein101 opened 1 year ago
Hi, I really want to know how I should prepare all the input files. Following the files in the protacs directory, I can prepare .pdb and .smi files, and I can even prepare .mol2 files. But I want to know where and how I should put my data.
Also, I just ran $ python main.py after the download, but I got this message.
[] $ python main.py
Traceback (most recent call last):
  File "main.py", line 10, in <module>
    from protacloader import PROTACSet, collater
  File "//bin/DeepPROTACs/protacloader.py", line 3, in <module>
    from torch_geometric.data import Batch
ModuleNotFoundError: No module named 'torch_geometric'
Many thanks, in advance!
Hi, you may have forgotten to install PyTorch Geometric. Please see https://pytorch-geometric.readthedocs.io/en/latest/install/installation.html to install it.
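A quick way to confirm the fix before re-running main.py is to check whether the modules can be found in the active environment. This is only a minimal sketch using the standard library; the helper name missing_modules is made up for illustration and is not part of DeepPROTACs.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # torch_geometric is the module named in the traceback; torch is what it builds on.
    missing = missing_modules(["torch", "torch_geometric"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("torch and torch_geometric were found.")
```

If torch_geometric is still reported as missing after installation, the usual cause is that python points at a different environment than the one used for the install.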
Hi, I'd like to know how to prepare the five different types of input files when I'm dealing with a bunch of different PROTACs at the same time. I'm going to test 10 PROTACs and have already prepared the mol2 files, but I don't know how to merge them into one file of each type (5 different inputs in total).
Thanks in advance!
Hi, if you have the .pdb files, please organize your files like the protacs files. Then run prepare_data.ipynb first to get the files into the right form. If you just want to test new data, main.py is not necessary; please use case_study.ipynb.
Don't forget to add the root in the file, otherwise it will use the default data. For example:
ligase_ligand = GraphData("ligase_ligand", root="test_samples")
If you want to customize the processing, please see https://pytorch-geometric.readthedocs.io/en/latest/tutorial/create_dataset.html and the file prepare_data.py.
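To make the role of root more concrete, here is a rough sketch of the directory convention that PyG datasets normally follow: raw files are read from root/raw, processed files are cached in root/processed, and processing is skipped when the expected processed files already exist. It assumes GraphData follows this standard layout, which is not confirmed in this thread; the function names and the file name ligase_ligand.pt are hypothetical.

```python
import tempfile
from pathlib import Path


def dataset_dirs(root):
    """Return the (raw, processed) directories that the PyG convention derives from root."""
    root = Path(root)
    return root / "raw", root / "processed"


def needs_processing(root, processed_names):
    """True when at least one expected processed file is missing under root/processed."""
    _, processed_dir = dataset_dirs(root)
    return not all((processed_dir / name).exists() for name in processed_names)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "test_samples"
        # Hypothetical cache file name, used only for this demonstration.
        expected = ["ligase_ligand.pt"]
        print("fresh root needs processing:", needs_processing(root, expected))

        _, processed_dir = dataset_dirs(root)
        processed_dir.mkdir(parents=True)
        (processed_dir / expected[0]).touch()
        print("after caching needs processing:", needs_processing(root, expected))
```

Under this convention, a root that already holds processed files will be reused as-is, which is why new data should go under a root name that has not been used before.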
Thanks. I've installed PyG. But I'm not familiar with Jupyter, so I have a bunch of questions.
- I've looked into case_study.ipynb, but there are so many empty spaces after ",". I don't know where I should put root="...", and what should "test_samples" be the name of? The name of the PROTAC, or something else? Everything is too vague to me.
- In which directory should I prepare my data, and how can I specify the location of the input data?
I'm thinking of using this program in a large project and I really want to use it. Would you please help me with this?
Thanks!
- Add the root in case_study.ipynb. By default, the value of root is data in PyG, which means it will generate a new dir named data and put the processed files in the data dir. PyG checks the name of the dir and the generated files to decide whether it should re-process the raw data. So you should use a different name for the root, as there is a data dir already.
- You can use the Python debugger pdb, or just debug in VS Code or another IDE, to run case_study.ipynb and add breakpoints in the process function of the GraphData class. You can run it step by step to see how it processes the data.
https://pytorch-geometric.readthedocs.io/en/latest/tutorial/create_dataset.html, this link will help a lot.
I have 10 different PROTACs with different linkers. Do I have to make 10 different input directories in the "protacs" directory?
In which directory should I prepare my data, and how can I specify the location of the input data? I don't understand any of what you said:
You can use the Python Debugger pdb or just Debug on VS code or other IDEs, to run the case_study.ipynb and add breakpoints in the process function of the GraphData class. You can run it step by step to see how it process. https://pytorch-geometric.readthedocs.io/en/latest/tutorial/create_dataset.html, this link will help a lot.
I already have mol2 files for everything, as you described for the web server (except the linker SMILES file). Is there any way I can skip the processing?
How can I run case_study.ipynb?
I already deleted the 'data' directory in your package, as you mentioned in other issues. In this case, I can still use the name 'data' for running, right?
Thanks.
I uploaded the single prediction version just now. You can rename your prepared files to ligase_ligand.mol2, ligase_pocket.mol2, target_ligand.mol2, target_pocket.mol2 and linker.smi. Then put them into a dir like single_test and run single_prediction.py with the dir name for testing (one at a time), such as python single_prediction.py single_test. You do not need to use Jupyter files or change the root values this way.
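For anyone with several PROTACs to test, the renaming step above can be scripted. The following is only a sketch under assumptions: the five target file names and the python single_prediction.py <dir> command come from the reply above, while the helper names build_case_dir and prediction_command and the dictionary-based input are made up for illustration.

```python
import shutil
from pathlib import Path

# The five file names that the single prediction script expects, per the reply above.
REQUIRED_FILES = (
    "ligase_ligand.mol2",
    "ligase_pocket.mol2",
    "target_ligand.mol2",
    "target_pocket.mol2",
    "linker.smi",
)


def build_case_dir(case_dir, sources):
    """Copy already-prepared files into case_dir under the required names.

    sources maps each required file name to the path of your own file.
    """
    missing = [name for name in REQUIRED_FILES if name not in sources]
    if missing:
        raise ValueError("no source given for: " + ", ".join(missing))
    case_dir = Path(case_dir)
    case_dir.mkdir(parents=True, exist_ok=True)
    for name in REQUIRED_FILES:
        shutil.copyfile(sources[name], case_dir / name)
    return case_dir


def prediction_command(case_dir):
    """Build the command line quoted in the thread for one case directory."""
    return ["python", "single_prediction.py", str(case_dir)]
```

With 10 PROTACs, one would call build_case_dir once per PROTAC with a different directory name, then pass each result of prediction_command to subprocess.run(cmd, check=True) from the DeepPROTACs directory, one at a time.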
Thanks.
I already prepared .smi and .pdb files just like you did in /DeepPROTACs/protacs/1_BRD7_VHL. Can't I just use .smi for linkers and PROTACs for a single test?
I am sorry, but I'm afraid you can't use the .smi files for linkers and PROTACs directly. The E3 ligand, warhead and linker must be separated, as our model takes them as independent inputs. Also, I have tried to split the PROTACs with RDKit using the linker and PROTAC SMILES, but sometimes it fails.
No, that's not what I meant. I mean, for the input in the directory DeepPROTACs/protacs/1_BRD7_VHL, you put the sample data as linker_1.smi, protac_1.smi, and two PDB files. Why can't I use the same input types for running?
I'm sorry that I didn't mention the difference. In the directory DeepPROTACs/protacs/1_BRD7_VHL, both PDB files contain the ligands, and we name the chains A, B, C, D in order so that we can split the ligands and the pockets. However, in the single test you do not need to consider the chain names, but you must split the ligands and the pockets manually.
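As a starting point for the manual split, coordinate records in a PDB file can be grouped by chain identifier, which sits in column 22 of each ATOM and HETATM line in the PDB format. This is only an illustrative sketch: split_by_chain is a made-up helper, not the repository's own splitting code, and which of the chains A to D holds a ligand versus a pocket is not stated in this thread.

```python
from collections import defaultdict


def split_by_chain(pdb_text):
    """Group ATOM/HETATM lines of a PDB file by their chain identifier."""
    chains = defaultdict(list)
    for line in pdb_text.splitlines():
        # Column 22 (index 21) holds the chain ID in fixed-width PDB records.
        if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
            chains[line[21]].append(line)
    return dict(chains)
```

Each group could then be written to its own file and converted to .mol2 with the tool of your choice before being renamed for the single test.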