devalab / DeepPocket

Ligand Binding Site detection using Deep Learning
MIT License

Understanding the Train Dataset for Training Part #22

Closed drorhunvural closed 1 year ago

drorhunvural commented 1 year ago

My question is simple, but I believe it will be useful for everyone to understand the paper better.

The following command needs to be run to train the classifier:

python train.py -m model.py --train_types scPDB_train0.types --test_types scPDB_test0.types -i 200000 --train_recmolcache scPDB_new.molcache2 --test_recmolcache scPDB_new.molcache2 -r val0 -o /model_saves/val9 --base_lr 0.001 --solver Adam

Here is an example line from the train and test files:

1 50.69633356250253 -8.818796255105756 9.213237190116068 2bel_4/protein_0.gninatypes 2bel_4/cavity6.mol2

I have two questions.

First, what does the number 1 in the first part represent?

Second, does the last part of each line (2bel_4/cavity6.mol2) need to be in the train and test files? If I delete 2bel_4/cavity6.mol2 from the end of the line, will training still work, or do I need the mol2 files too? Isn't the gninatypes file alone (2bel_4/protein_0.gninatypes) enough?

RishalAggarwal commented 1 year ago

1 represents the class. Yes, I believe the protein_0.gninatypes file should be enough, though I'm not sure whether leaving out the mol2 entry will cause an error (semantic or syntax) in the code.
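To make the column layout discussed above concrete, here is a minimal sketch that splits one line of a .types file into named fields. It is not taken from the DeepPocket code: `TypesEntry` and `parse_types_line` are hypothetical names, and reading the three floats as x, y, z coordinates of the candidate pocket center is an assumption. The trailing mol2 path is treated as optional only for inspection purposes; whether train.py accepts lines without it is not confirmed in this thread.

```python
from typing import NamedTuple, Optional, Tuple


class TypesEntry(NamedTuple):
    """One line of a .types file (hypothetical helper, not part of DeepPocket)."""
    label: int                          # class label (first column)
    center: Tuple[float, float, float]  # three floats, ASSUMED to be x, y, z
    receptor: str                       # path to the .gninatypes file
    ligand: Optional[str]               # trailing .mol2 path, None if absent


def parse_types_line(line: str) -> TypesEntry:
    """Split a whitespace-separated .types line into its fields."""
    fields = line.split()
    if len(fields) not in (5, 6):
        raise ValueError(f"expected 5 or 6 fields, got {len(fields)}")
    label = int(fields[0])
    x, y, z = (float(v) for v in fields[1:4])
    receptor = fields[4]
    ligand = fields[5] if len(fields) == 6 else None
    return TypesEntry(label, (x, y, z), receptor, ligand)


# The example line quoted in the question above.
example = (
    "1 50.69633356250253 -8.818796255105756 9.213237190116068 "
    "2bel_4/protein_0.gninatypes 2bel_4/cavity6.mol2"
)
entry = parse_types_line(example)
print(entry.label, entry.receptor, entry.ligand)
```

Running this on a whole train or test file, one line at a time, is a quick way to check that every line has the expected number of columns before starting a long training run.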

mainguyenanhvu commented 1 year ago

Hello @drorhunvural, have you re-run the data preparation on custom data? If so, please help me.

I am trying to follow the instructions to prepare data for training a new classifier. I am stuck at the make_types step because I can't find the train.txt and test.txt files.

I also have four questions:

  1. If I want to add several PDB files to the existing scPDB dataset, how can I do that?
  2. Do the data preparation instructions only work for a single PDB file? If they do, I will need to write a pipeline to wrap them.
  3. How do I prepare the train.txt and test.txt files needed to run make_types.py?
  4. Could you please show me which files/folders from the previous steps are needed as input to each step?

I tried this on this pdb.

Thank you very much.

P.S. I have also asked this in issue https://github.com/devalab/DeepPocket/issues/26.