t7morgen / misato-dataset

GNU Lesser General Public License v2.1
172 stars 17 forks

Correct h5 to PDB residue index error #9

Closed pujaltes closed 9 months ago

pujaltes commented 9 months ago

Fixes #8.

As mentioned there, h5_to_pdb.py incorrectly splits some GLN and ASN residues when converting to PDB format. The script tries to account for the O-N atom pair inside these residues by ignoring indices 12 and 9 in GLN and ASN, respectively, but it misses the fact that the O-N pair can also sit at another position within the amino acid. From atoms_name_map_for_pdb.pickle we can see that this also occurs at indices 14 (GLN) and 11 (ASN).
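To illustrate the idea, here is a minimal sketch of the boundary check with the extra indices included. It is not the actual code in h5_to_pdb.py: the names INTRA_RESIDUE_O_N_INDICES and starts_new_residue are hypothetical, and the sketch assumes that a residue boundary is detected by an O atom followed by an N atom.

```python
# Hypothetical lookup (not taken from the repository): for each residue name,
# the atom indices at which an O followed by an N belongs to the same residue
# (side-chain amide) rather than marking the start of the next residue.
INTRA_RESIDUE_O_N_INDICES = {
    "GLN": {12, 14},  # 12 was already handled, 14 is the newly covered case
    "ASN": {9, 11},   # 9 was already handled, 11 is the newly covered case
}


def starts_new_residue(residue_name, residue_atom_index, atom_type, next_atom_type):
    """Return True if the O -> N transition at this atom marks a residue boundary.

    Assumption: a boundary is signalled by an O atom directly followed by an
    N atom, except at the known intra-residue positions listed above.
    """
    if atom_type != "O" or next_atom_type != "N":
        return False
    known_positions = INTRA_RESIDUE_O_N_INDICES.get(residue_name, set())
    return residue_atom_index not in known_positions
```

With this check, an O-N pair at index 14 of a GLN or index 11 of an ASN no longer starts a new residue, while an O-N pair in any other residue still does. Keeping the exceptions in one lookup table also makes it easier to add further cases if more turn up in atoms_name_map_for_pdb.pickle.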

This is a simple fix, but I think it would be a good idea to look into cleaning up and refactoring the format conversion code to make it easier to understand.