Closed pujaltes closed 9 months ago
I have now added https://github.com/t7morgen/misato-dataset/blob/master/src/data/processing/h5_to_traj.py, which should generally be more robust because it preserves the AMBER topology format (with all the atom names, TERs, etc.).
The `h5_to_pdb.py` script incorrectly splits some `GLN` and `ASN` residues when converting to PDB format. See the example when converting `4KNB.pdb`.

In the script, the residue index is increased (a new residue has begun) when there is an O-N pair in the atom sequence. However, as pointed out here, `GLN` and `ASN` contain an O-N pair within the amino acid. While the script accounts for this by ignoring indices 12 and 9 in `GLN` and `ASN` respectively, it misses the fact that the O-N pair can be in another location within the amino acid. From `atoms_name_map_for_pdb.pickle` we can see that this can also occur at indices 14 (`GLN`) and 11 (`ASN`).
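
To make the failure mode concrete, here is a minimal sketch of the O-N splitting heuristic described above. It is not the actual code from `h5_to_pdb.py`: the function name, the arguments, and the toy atom sequences are hypothetical, and it assumes the indices are 0-based positions of the side-chain N atom within the residue.

```python
# Hypothetical reconstruction of the heuristic, for illustration only.
# Positions (0-based, within a residue) where an O-N pair is internal to the
# residue and must NOT start a new residue.
OLD_INTERNAL_ON = {"GLN": {12}, "ASN": {9}}
FIXED_INTERNAL_ON = {"GLN": {12, 14}, "ASN": {9, 11}}


def assign_residue_numbers(elements, residue_names, internal_on_positions):
    """Number residues by starting a new one at each O-N pair.

    elements: element symbol of each atom, in file order.
    residue_names: residue name of each atom, same length as elements.
    internal_on_positions: residue name -> positions where an O-N pair
        belongs to the side chain and is ignored.
    """
    numbers = []
    residue_number = 1
    position = 0  # position of the current atom within its residue
    for i, element in enumerate(elements):
        is_on_pair = i > 0 and elements[i - 1] == "O" and element == "N"
        ignored = position in internal_on_positions.get(residue_names[i], set())
        if is_on_pair and not ignored:
            residue_number += 1
            position = 0
        numbers.append(residue_number)
        position += 1
    return numbers


# Toy input (assumed atom ordering): an ASN with three extra hydrogens on the
# backbone N, which shifts the side-chain N from position 9 to 11, then a GLY.
asn_shifted = ["N", "H", "H", "H", "C", "H", "C", "H", "H", "C",
               "O", "N", "H", "H", "C", "O"]
gly = ["N", "H", "C", "H", "H", "C", "O"]
elements = asn_shifted + gly
names = ["ASN"] * len(asn_shifted) + ["GLY"] * len(gly)

old = assign_residue_numbers(elements, names, OLD_INTERNAL_ON)
fixed = assign_residue_numbers(elements, names, FIXED_INTERNAL_ON)
print(max(old), max(fixed))
```

With only positions 9 and 12 ignored, the shifted `ASN` in this toy input is split in two, so three residues are counted instead of two. Adding positions 11 and 14 keeps it whole.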