gnina / models

Trained caffe models

Questions on gninatypes-format #14

Closed juliabuhmann closed 2 years ago

juliabuhmann commented 2 years ago

I understand that gninatypes files are binary files containing atom coordinates and atom types. (A short section on the gninatypes format in a README would be super helpful; right now, as far as I know, this information is hidden in a closed GitHub issue :).)

  1. Is there some functionality to convert the gninatypes format back into PDB? If I understand correctly, connectivity and amino acid information are lost in this format, but it would still be nice to be able to visualize the content of gninatypes files, for instance in PyMOL.

  2. The context of my question is that I am interested in knowing whether the protein PDB files and the gninatypes files in the CrossDocked2020 dataset are aligned, e.g. whether 2bvo_A_rec.pdb and 2bvo_A_rec_0.gninatypes have the same atom coordinates.

Even if the answer to 2. is yes, I would still be interested in 1., as I would like to explore the content of gninatypes files. I loaded them with molgrid.ExampleProvider() and got coordinates and atom type indices, but I did not know which atom type index maps to which atom type.

Thanks a lot for the clarification!

dkoes commented 2 years ago

You can look up the name of a type by indexing into the result of the get_type_names member function of the typer class. It isn't possible to regenerate a PDB file, since too much information is lost (e.g. atom types, residues), but it is straightforward to convert to xyz: https://github.com/gnina/scripts/blob/master/types2xyz.py
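
For readers who want to inspect a file without the linked script, here is a minimal sketch of the conversion idea. It is not the code from types2xyz.py. It assumes each record in a gninatypes file is 16 bytes: three little-endian float32 coordinates followed by one int32 type index. The function names `read_gninatypes` and `to_xyz` are made up for this illustration, and the `type_names` list is something you would have to supply yourself, for example from the typer's type names.

```python
import struct

# Assumed record layout (not confirmed in this thread):
# x, y, z as float32, then the atom type index as int32, little-endian.
RECORD = struct.Struct("<fffi")


def read_gninatypes(path):
    """Return a list of (x, y, z, type_index) tuples read from a binary file."""
    with open(path, "rb") as fh:
        data = fh.read()
    if len(data) % RECORD.size != 0:
        raise ValueError("file size is not a multiple of the assumed record size")
    return [RECORD.unpack_from(data, offset)
            for offset in range(0, len(data), RECORD.size)]


def to_xyz(atoms, type_names=None, comment=""):
    """Format atoms as XYZ text: atom count, comment line, one atom per line.

    If type_names is given, the type index is used to look up a label;
    otherwise a placeholder label such as "T3" is written.
    """
    lines = [str(len(atoms)), comment]
    for x, y, z, type_index in atoms:
        label = type_names[type_index] if type_names else f"T{type_index}"
        lines.append(f"{label} {x:.3f} {y:.3f} {z:.3f}")
    return "\n".join(lines) + "\n"
```

Note that a molecular viewer will generally expect element symbols in the first column of an XYZ file, so the placeholder labels here are only useful for inspecting the data; mapping type names to elements is left out of this sketch.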

juliabuhmann commented 2 years ago

Thanks a lot for adding this new script! This is super helpful.

Just for completeness: I converted 2bvo_A_rec_0.gninatypes to 2bvo_A_rec_0.xyz, compared it to 2bvo_A_rec.pdb in PyMOL, and the atoms do align.
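
The same alignment check could also be done numerically rather than by eye. The following is only a sketch under stated assumptions: the helper names `parse_pdb_coords` and `max_nearest_distance` are hypothetical, the PDB text is assumed to follow the standard fixed-column layout, and the gninatypes coordinates are assumed to be already available as a list of (x, y, z) tuples. Because the two files need not contain the same number of atoms (hydrogens, for instance, may be handled differently), it measures how far each query atom is from its nearest reference atom instead of comparing atoms one by one.

```python
import math


def parse_pdb_coords(pdb_text):
    """Extract (x, y, z) from ATOM/HETATM records using fixed PDB columns."""
    coords = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            # Columns 31-38, 39-46 and 47-54 (1-based) hold x, y and z.
            coords.append((float(line[30:38]),
                           float(line[38:46]),
                           float(line[46:54])))
    return coords


def max_nearest_distance(query, reference):
    """Largest distance from any query atom to its closest reference atom.

    Brute force, O(len(query) * len(reference)); fine for a quick check.
    """
    return max(min(math.dist(q, r) for r in reference) for q in query)
```

A result close to zero (within the rounding of the PDB's three decimal places) when the gninatypes coordinates are used as `query` and the PDB coordinates as `reference` would suggest the two files share a coordinate frame.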