gnina / libmolgrid

Comprehensive library for fast, GPU accelerated molecular gridding for deep learning workflows
https://gnina.github.io/libmolgrid/
Apache License 2.0

process froze when calling next_batch #79

Open yuanqidu opened 2 years ago

yuanqidu commented 2 years ago

When I call next_batch, the process freezes indefinitely.

dkoes commented 2 years ago

Can you provide example code and data? Do the tests pass?

yuanqidu commented 2 years ago

Thanks for your quick response. I just solved this problem by manually installing libmolgrid.

However, I have another question. The provided dataset has many files ending in gninatypes. How can I get sdf.gz files from the gninatypes files?

dkoes commented 2 years ago

You can't. The gninatypes files contain only the bare minimum needed for efficient training (x, y, z coordinates and atom type). You can convert them to xyz files:
https://github.com/gnina/scripts/blob/master/types2xyz.py
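For readers who want to see what such a conversion involves, here is a minimal sketch. It assumes each record in a gninatypes file is three 32-bit floats (x, y, z) followed by a 32-bit integer type index, in native byte order; check that layout against the linked script before relying on it. The names `read_gninatypes`, `to_xyz`, and the `type_names` table are hypothetical and only for illustration. The real mapping from type index to element depends on the atom typer used to write the file.

```python
import os
import struct
import tempfile

# Assumed record layout: x, y, z as float32, then an int32 atom type index.
RECORD = struct.Struct("fffi")

def read_gninatypes(path):
    """Return a list of (x, y, z, type_index) tuples from a binary file."""
    with open(path, "rb") as fh:
        data = fh.read()
    # iter_unpack raises struct.error if the size is not a multiple of 16 bytes.
    return list(RECORD.iter_unpack(data))

def to_xyz(atoms, type_names):
    """Format atoms as XYZ text: atom count, comment line, then one atom per line."""
    lines = [str(len(atoms)), "converted from gninatypes"]
    for x, y, z, t in atoms:
        lines.append(f"{type_names[t]} {x:.4f} {y:.4f} {z:.4f}")
    return "\n".join(lines) + "\n"

# Hypothetical type table for this demo only, not the real gnina typing.
type_names = {0: "C", 1: "N"}

# Write a two-atom demo file, then read it back and convert it.
fd, demo_path = tempfile.mkstemp(suffix=".gninatypes")
with os.fdopen(fd, "wb") as fh:
    fh.write(RECORD.pack(1.0, 2.5, -0.5, 0))
    fh.write(RECORD.pack(0.0, 0.0, 1.25, 1))

atoms = read_gninatypes(demo_path)
xyz_text = to_xyz(atoms, type_names)
os.remove(demo_path)
```

Bond and charge information is not stored, which is why the original sdf cannot be recovered.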

yuanqidu commented 2 years ago

Thanks for your help!

I have a further question: what is the struct object required by libmolgrid? Does this package support a pocket discovery step for protein-ligand binding?

dkoes commented 2 years ago

molgrid creates atomic density grids from molecules. That's it. You could use it as part of a neural network classifier for pocket identification, but that is up to you to develop.
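To make "atomic density grid" concrete, here is a toy sketch in plain Python. It is not libmolgrid's implementation: the Gaussian kernel, the default radius, and the function name `density_grid` are assumptions chosen for illustration, and the real library runs on the GPU and uses its own density function and atom typing.

```python
import math

def density_grid(atoms, center, dimension=4.0, resolution=1.0,
                 num_channels=1, radius=1.5):
    """Sum a Gaussian density per atom onto a cubic grid.

    atoms: iterable of (x, y, z, channel); channel is an atom type index.
    Returns a nested list indexed as grid[channel][i][j][k].
    """
    n = int(round(dimension / resolution)) + 1      # grid points per axis
    origin = [c - dimension / 2.0 for c in center]  # corner of the cube
    grid = [[[[0.0] * n for _ in range(n)] for _ in range(n)]
            for _ in range(num_channels)]
    for x, y, z, channel in atoms:
        for i in range(n):
            gx = origin[0] + i * resolution
            for j in range(n):
                gy = origin[1] + j * resolution
                for k in range(n):
                    gz = origin[2] + k * resolution
                    d2 = (gx - x) ** 2 + (gy - y) ** 2 + (gz - z) ** 2
                    # Assumed kernel: plain Gaussian falling off with distance.
                    grid[channel][i][j][k] += math.exp(-2.0 * d2 / radius ** 2)
    return grid

# One atom placed at the grid center, in channel 0.
grid = density_grid([(0.0, 0.0, 0.0, 0)], center=(0.0, 0.0, 0.0))
```

Each channel corresponds to one atom type, so the result can be fed to a 3D convolutional network in the same way as a multi-channel image.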

yuanqidu commented 2 years ago

I see. Thanks again! How are we supposed to prepare the structs as specified by the example?

Also, may I ask whether the crossdock dataset identifies pockets, or does it just provide the full protein and ligand?

dkoes commented 2 years ago

If relative paths are provided in the training file (fname), then data_root is prepended to each file path. The training file can refer to regular molecular data files (e.g. pdb, sdf, mol2, xyz). Each line is a training example consisting of labels (the first columns) followed by the file names of the molecular data.
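The line layout described above can be sketched in a few lines of plain Python. This is only an illustration of the format, not how libmolgrid parses it: the helper name `parse_example_line`, the rule that leading numeric tokens are labels, and the example file names are all assumptions.

```python
import os

def parse_example_line(line, data_root=""):
    """Split one training-file line into (labels, file paths).

    Assumption: leading tokens that parse as numbers are labels, and
    everything from the first non-numeric token onward is a file name.
    """
    tokens = line.split()
    labels, files = [], []
    for i, tok in enumerate(tokens):
        try:
            labels.append(float(tok))
        except ValueError:
            files = tokens[i:]
            break
    # Prepend data_root to relative paths only; absolute paths are kept as-is.
    if data_root:
        files = [f if os.path.isabs(f) else os.path.join(data_root, f)
                 for f in files]
    return labels, files

# Hypothetical line: a binary label, an affinity value, a receptor, a ligand.
labels, files = parse_example_line("1 6.05 rec.pdb lig.sdf", data_root="data")
```

A line with a receptor and a ligand therefore needs no separate pocket annotation, since the ligand file listed on the same line marks the binding site.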

When given a receptor and ligand, the ligand defines the binding pocket.