devalab / DeepPocket

Ligand Binding Site detection using Deep Learning
MIT License

How to prepare the inputs for training segmentation model? #12

Closed: stgzr closed this issue 2 years ago

stgzr commented 2 years ago

Since I could not find any code related to this, I would like to know the details of the preprocessing. My guess is that the protein and the binding site are used to build the ground-truth mask. But which files did you use? In the scPDB dataset there are many files, such as protein.mol2, site.mol2, cavity6.mol2, ligand.mol2, etc., and I am getting confused.

RishalAggarwal commented 2 years ago

You can check which files we used in our .types files. I believe the cavity6.mol2 files are what we used for training the segmentation model.
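One way to do that check is to tally which file names a .types file references. The snippet below is only a minimal sketch: it assumes each line is a whitespace-separated mix of numeric labels and file paths, and the helper name `count_referenced_files` and the sample lines are made up for illustration, not taken from the repository.

```python
from collections import Counter
from pathlib import PurePosixPath


def _is_number(token):
    """Return True if the token parses as a float (e.g. a label column)."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def count_referenced_files(types_lines):
    """Count the basenames of file-like tokens found in .types lines.

    Assumption: columns are whitespace-separated, and any non-numeric
    token whose basename contains a dot is a file path.
    """
    counts = Counter()
    for line in types_lines:
        for token in line.split():
            if _is_number(token):
                continue  # skip label columns such as "1" or "0.5"
            name = PurePosixPath(token).name
            if "." in name:
                counts[name] += 1
    return counts


# Hypothetical lines, only to show the shape of the output.
sample_lines = [
    "1 entryA/protein.mol2 entryA/cavity6.mol2",
    "0 entryB/protein.mol2 entryB/cavity6.mol2",
]
for name, n in count_referenced_files(sample_lines).most_common():
    print(name, n)
```

To run it on a real file, pass the open file object instead of `sample_lines`, for example `count_referenced_files(open("some_file.types"))`, where the file name is a placeholder.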

mainguyenanhvu commented 1 year ago

Hello @stgzr, have you re-run the data preparation on custom data? If so, please help me.

I am trying to follow the instructions to prepare data for training a new classifier. I am stuck at the make_types step because I can't find the train.txt and test.txt files.

Moreover, I have 4 questions:

  1. If I want to add several pdb files to the existing scPDB dataset, how can I do that?
  2. Do the data preparation instructions only work for a single pdb file? If so, I need to write a pipeline to wrap them.
  3. How do I prepare the train.txt and test.txt files needed to run make_types.py?
  4. Could you please show me which files/folders from the previous step are needed as input to each step?
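For question 3, a rough sketch of what I have in mind is below. It is not confirmed by the authors: it assumes the dataset is a directory with one sub-folder per entry and that train.txt and test.txt simply list one entry name per line. The format that make_types.py actually expects is not stated in this thread, and the function names `split_entries` and `write_split` are hypothetical.

```python
import random
from pathlib import Path


def split_entries(entry_ids, test_fraction=0.2, seed=0):
    """Shuffle entry names reproducibly and split them into (train, test)."""
    ids = sorted(set(entry_ids))  # sort first so the seed fully fixes the order
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_test = int(round(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]


def write_split(dataset_dir, out_dir, test_fraction=0.2, seed=0):
    """Write train.txt and test.txt with one entry name per line.

    Assumption: every sub-directory of dataset_dir is one entry.
    """
    entries = [p.name for p in Path(dataset_dir).iterdir() if p.is_dir()]
    train, test = split_entries(entries, test_fraction, seed)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "train.txt").write_text("".join(name + "\n" for name in train))
    (out / "test.txt").write_text("".join(name + "\n" for name in test))
    return train, test
```

Calling `write_split("path/to/dataset", "splits")` (both paths are placeholders) would write the two lists into `splits/`. The split here is purely random, so it may differ from whatever split the authors used.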

I have tried this on this pdb.

Thank you very much.

P.S. I have asked this in the issue https://github.com/devalab/DeepPocket/issues/26.