MDAnalysis / mdanalysis

MDAnalysis is a Python library to analyze molecular dynamics simulations.
https://mdanalysis.org
Other
1.32k stars 652 forks source link

We need keep original residue number from gromacs tpr file #4224

Open liuyujie714 opened 1 year ago

liuyujie714 commented 1 year ago

Question

When read topology information from gromacs tpr file, the discontinuous residue number is unsupported. such as, test.zip

gmx editconf -f test.tpr -o test.gro

Will get original discontinuous residue number in output gro(pdb, etc.):

   99PHE    HD2 1562   5.034   4.829   4.515  0.5966 -1.1182  0.7116
   99PHE      C 1563   5.218   5.011   4.689  0.1965  0.2139  0.1210
   99PHE    OC1 1564   5.311   4.970   4.616 -0.1532  0.0375 -0.2293
   99PHE    OC2 1565   5.159   5.120   4.673  0.1165  0.1239 -0.2090
  101PRO      N 1566   4.929   5.217   4.547  0.3156 -0.3804  0.4314
  101PRO     H1 1567   4.969   5.160   4.620 -0.1633 -0.5250  0.5838
  101PRO     H2 1568   4.983   5.211   4.461 -0.6419 -0.1703 -0.2059
  101PRO     CD 1569   4.790   5.179   4.517  0.4198 -0.0348 -0.53185318

99, 101 is corrected.

However, if I use mda to read top and save gro/pdb, the residue number is always continuous from atoms.resids.

In most situation, we need original residue number do somethings, so I recommend modify atoms.resids behavior for tpr or add new keywords resnrs to represent original number.

Related code

https://github.com/MDAnalysis/mdanalysis/blob/2bf55dd714292bbf7c12023a28153a6443a52ec6/package/MDAnalysis/topology/tpr/utils.py#L796

data.unpack_int() is need if fver >= 63, otherwise use i+1 for i in range(nres), like as tpxio.cpp in gromacs source: https://github.com/gromacs/gromacs/blob/0c5644d839bb80ac38f778e5e9e258e662a83642/src/gromacs/fileio/tpxio.cpp#L2359C1-L2372C6 ri[j].nr represents original residue number, used in output structure by gmx editconf, etc. tools

lilyminium commented 1 year ago

Thanks @liuyujie714 for opening this issue and for giving such detailed information on a fix! It sounds like you already have a very good idea of what to do to fix it. If you have time to open a PR, we would very much welcome help on this -- otherwise, it's definitely on our to-do list!

I recommend modify atoms.resids behavior for tpr or add new keywords resnrs to represent original number

Personally I think the former option is better, since this is more in the spirit of resids.