rdkit / rdkit

The official sources for the RDKit library
BSD 3-Clause "New" or "Revised" License

PDB output only writes a TER for the last chain #3090

Open adalke opened 4 years ago

adalke commented 4 years ago

MolToPDBBlock supports "flavor & 32 : Write TER record".

With RDKit 2020.03.1 (Python), this TER record is only written for the last chain, when I think it should be written after every chain.

>>> import rdkit; print(rdkit.__version__)
2020.03.1
>>> from rdkit import Chem
>>> print(Chem.MolToPDBBlock(Chem.MolFromSequence("AC.GP"), flavor=2|8|32))
ATOM      1  N   ALA A   1       0.000   0.000   0.000  1.00  0.00           N
ATOM      2  CA  ALA A   1       0.000   0.000   0.000  1.00  0.00           C
ATOM      3  C   ALA A   1       0.000   0.000   0.000  1.00  0.00           C
ATOM      4  O   ALA A   1       0.000   0.000   0.000  1.00  0.00           O
ATOM      5  CB  ALA A   1       0.000   0.000   0.000  1.00  0.00           C
ATOM      6  N   CYS A   2       0.000   0.000   0.000  1.00  0.00           N
ATOM      7  CA  CYS A   2       0.000   0.000   0.000  1.00  0.00           C
ATOM      8  C   CYS A   2       0.000   0.000   0.000  1.00  0.00           C
ATOM      9  O   CYS A   2       0.000   0.000   0.000  1.00  0.00           O
ATOM     10  CB  CYS A   2       0.000   0.000   0.000  1.00  0.00           C
ATOM     11  SG  CYS A   2       0.000   0.000   0.000  1.00  0.00           S
ATOM     12  OXT CYS A   2       0.000   0.000   0.000  1.00  0.00           O
ATOM     13  N   GLY B   1       0.000   0.000   0.000  1.00  0.00           N
ATOM     14  CA  GLY B   1       0.000   0.000   0.000  1.00  0.00           C
ATOM     15  C   GLY B   1       0.000   0.000   0.000  1.00  0.00           C
ATOM     16  O   GLY B   1       0.000   0.000   0.000  1.00  0.00           O
ATOM     17  N   PRO B   2       0.000   0.000   0.000  1.00  0.00           N
ATOM     18  CA  PRO B   2       0.000   0.000   0.000  1.00  0.00           C
ATOM     19  C   PRO B   2       0.000   0.000   0.000  1.00  0.00           C
ATOM     20  O   PRO B   2       0.000   0.000   0.000  1.00  0.00           O
ATOM     21  CB  PRO B   2       0.000   0.000   0.000  1.00  0.00           C
ATOM     22  CG  PRO B   2       0.000   0.000   0.000  1.00  0.00           C
ATOM     23  CD  PRO B   2       0.000   0.000   0.000  1.00  0.00           C
ATOM     24  OXT PRO B   2       0.000   0.000   0.000  1.00  0.00           O
TER      25      PRO B   2
END

I expected there to be a TER record for chain A between these two lines:

ATOM     12  OXT CYS A   2       0.000   0.000   0.000  1.00  0.00           O
ATOM     13  N   GLY B   1       0.000   0.000   0.000  1.00  0.00           N

It's difficult to point to the PDB format specification as definitive because much of PDB parsing is ad hoc/informal. Still, as I remember it from 20+ years ago, the current RDKit behavior is not what most people will expect.
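
As a stopgap, the missing records could be added by post-processing the text block. This is a minimal sketch, not RDKit's own behavior or API: `add_ter_between_chains` and `_ter_line` are hypothetical helper names, it assumes fixed-column ATOM/HETATM lines with the chain ID in column 22, and it leaves the TER serial field blank instead of renumbering the atoms that follow. Whether every downstream parser accepts a blank serial is also an assumption.

```python
def _ter_line(atom_line: str) -> str:
    """Build a TER line from the last ATOM/HETATM line of a chain.

    Hypothetical helper. The serial field (columns 7-11) is left blank
    on purpose: this sketch does not renumber the following atoms.
    """
    res_name = atom_line[17:20]   # columns 18-20
    chain_id = atom_line[21:22]   # column 22
    res_seq = atom_line[22:26]    # columns 23-26
    return "TER" + " " * 14 + res_name + " " + chain_id + res_seq


def add_ter_between_chains(pdb_block: str) -> str:
    """Insert a TER line wherever the chain ID changes between
    consecutive ATOM/HETATM records. Existing TER lines are kept."""
    out = []
    prev = None  # last ATOM/HETATM line seen since the most recent TER
    for line in pdb_block.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            if prev is not None and prev[21:22] != line[21:22]:
                out.append(_ter_line(prev))
            prev = line
        elif record == "TER":
            prev = None  # chain already terminated, nothing to add
        out.append(line)
    return "\n".join(out) + "\n"
```

Passing the `MolToPDBBlock` output above through `add_ter_between_chains` should put a TER line for chain A between atoms 12 and 13 and leave the existing TER for chain B alone. The example output contains no CONECT records, so nothing else refers to the atom serials.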

github-actions[bot] commented 1 week ago

This issue was marked as stale because it has been open for 90 days with no activity.