devalab / DeepPocket

Ligand Binding Site detection using Deep Learning
MIT License

Transform part in Train #17

Closed drorhunvural closed 1 year ago

drorhunvural commented 1 year ago
```python
for b in range(batch_size):
    center = molgrid.float3(float(centers[b][0]), float(centers[b][1]), float(centers[b][2]))
    # initialise transformer for rotational augmentation
    transformer = molgrid.Transform(center, 0, True)
    #center=transformer.get_quaternion().rotate(center.x,center.y,center.z)
    # apply a random rotation to the input protein
    transformer.forward(batch[b], batch[b])
    # update the input tensor with the b'th datapoint of the batch
    gmaker.forward(center, batch[b].coord_sets[0], input_tensor[b])
```

The above code is from train.py. Can you explain why you use it?

Why do we need to rotate or otherwise transform the coordinate data during training?

I am asking this question because:

- `transformer.get_rotation_center().x` is equal to `centers[b][0]`
- `transformer.get_rotation_center().y` is equal to `centers[b][1]`
- `transformer.get_rotation_center().z` is equal to `centers[b][2]`

What is your main goal in using the transformer here?

RishalAggarwal commented 1 year ago

Transform is used to apply a random rotation to the input protein around the center.
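For intuition, here is a NumPy sketch of the geometry involved (an illustration only, not molgrid's actual implementation): rotating coordinates about a chosen center leaves the center itself fixed, which is why `get_rotation_center()` returns exactly the center you passed in even though the rotation itself is random.

```python
import numpy as np

def random_rotation_matrix(rng):
    # Draw a uniformly random rotation from a random unit quaternion.
    q = rng.normal(size=4)
    q /= np.linalg.norm(q)
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])

rng = np.random.default_rng(0)
center = np.array([10.0, 5.0, -3.0])          # hypothetical pocket center
coords = rng.normal(size=(8, 3)) + center     # hypothetical atom coordinates

R = random_rotation_matrix(rng)
rotated = (coords - center) @ R.T + center    # rotate about the center

# The center itself is a fixed point of the rotation...
assert np.allclose((center - center) @ R.T + center, center)
# ...and every atom keeps its distance to the center.
assert np.allclose(np.linalg.norm(coords - center, axis=1),
                   np.linalg.norm(rotated - center, axis=1))
```

So the rotation center never changes; only the orientation of the atoms around it does.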

drorhunvural commented 1 year ago

Does this provide more information to the model to improve accuracy? If so, how?

I hope these questions help your paper get cited more, because your answers will help people understand it better. Thank you.

RishalAggarwal commented 1 year ago

It's basically for data augmentation, to make the model more robust to random rotations about any axis passing through the center.
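To see why this works as augmentation, here is a hypothetical NumPy sketch (not the project's code): each epoch the same protein is presented in a different orientation, but its internal geometry is unchanged, so the model is pushed to learn features that do not depend on one fixed orientation.

```python
import numpy as np

def random_rotation(rng):
    # QR decomposition of a random matrix gives a random orthogonal matrix;
    # fix the signs so it is a proper rotation (det = +1).
    M = rng.normal(size=(3, 3))
    Q, R = np.linalg.qr(M)
    Q = Q * np.sign(np.diag(R))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1
    return Q

rng = np.random.default_rng(1)
coords = rng.normal(size=(6, 3))   # hypothetical atoms, centered at the origin

# Two "epochs" see the same structure in different random orientations.
epoch1 = coords @ random_rotation(rng).T
epoch2 = coords @ random_rotation(rng).T

# Different orientations each epoch...
assert not np.allclose(epoch1, epoch2)
# ...but identical internal geometry (all pairwise distances preserved).
d = lambda X: np.linalg.norm(X[:, None] - X[None, :], axis=-1)
assert np.allclose(d(epoch1), d(epoch2))
```

In effect, one training structure becomes many orientations of the same pocket, so the model cannot key on the arbitrary frame in which the PDB file happens to be deposited.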

drorhunvural commented 1 year ago

Great answer

mainguyenanhvu commented 1 year ago

Hello @drorhunvural, have you re-run data preparation on custom data? If so, please help me.

I am trying to follow the instructions to prepare data for training a new classifier. I am stuck at the make_types step because I can't find the train.txt and test.txt files.

Moreover, I have 4 questions:

  1. If I want to add several pdb files to the available scPDB dataset, how can I do that?
  2. The data-preparation instructions only cover a single pdb file, don't they? If so, I will need to write a pipeline to wrap them.
  3. How do I prepare the train.txt and test.txt files needed to run make_types.py?
  4. Could you please show which files/folders from each step are needed as input to the next?

I tried it on this pdb.

Thank you very much.

P.S.: I have also asked this in issue https://github.com/devalab/DeepPocket/issues/26.