Open rvhonorato opened 1 year ago
Note: if we repeat the first line to simulate having two atoms before the first TER, the TER is not removed.
The same does not happen with the HETATM entry.
Thanks for the report @rvhonorato, we'll have a look.
This is probably an edge case, since the test PDB is not realistic and it works for "real" structures, but it could be indicative of some underlying issue.
Let me know if there's any way I can help.
I had a look at the format specification and it seems to hint that TER statements do not apply after HETATM, only at the terminus of a (linked) chain. Checking a couple of random PDBs does reinforce that:
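One way to run this kind of check on a local file is sketched below. It is only an illustration: the function name `records_before_ter` is hypothetical and not part of pdb-tools, and it assumes the record name sits in the first six columns of each line.

```python
from collections import Counter


def records_before_ter(lines):
    """Count which record type immediately precedes each TER line.

    Hypothetical helper for illustration; not part of pdb-tools.
    """
    counts = Counter()
    previous = None
    for line in lines:
        record = line[:6].strip()  # record name lives in columns 1-6
        if record == "TER" and previous is not None:
            counts[previous] += 1
        if record:  # ignore blank lines when tracking the previous record
            previous = record
    return counts
```

Calling it on `open("some_entry.pdb")` (file name assumed) returns a `Counter` keyed by record name, so a `TER` that directly follows a `HETATM` would show up as a `"HETATM"` key.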
It's indeed not very clear, looking at https://www.wwpdb.org/documentation/file-format-content/format33/sect9.html#TER
Every chain of ATOM/HETATM records presented on SEQRES records is terminated with a TER record.
and https://www.cgl.ucsf.edu/chimera/docs/UsersGuide/tutorials/pdbintro.html
indicates the end of a chain of residues. For example, a hemoglobin molecule consists of four subunit chains that are not connected. TER indicates the end of a chain and prevents the display of a connection to the next chain.
And deeper into the SEQRES record: https://www.wwpdb.org/documentation/file-format-content/format33/sect3.html#SEQRES
SEQRES records contain a listing of the consecutive chemical components covalently linked in a linear fashion to form a polymer. The chemical components included in this listing may be standard or modified amino acid and nucleic acid residues. It may also include other residues that are linked to the standard backbone in the polymer. Chemical components or groups covalently linked to side-chains (in peptides) or sugars and/or bases (in nucleic acid polymers) will not be listed here.
So that seems to imply to me that there is some relation between TER and SEQRES. Since the PDBs might not have this SEQRES to pull the limits from, it's probably OK to follow the convention of always having a TER between chains of ATOM records and, additionally, a TER at chain breaks (non-continuous numbering in ATOM) using the strict options, which I think already exists, right?
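The convention described above could be sketched roughly as follows. This is an illustration under stated assumptions, not how pdb_tidy is implemented: `insert_ter` and its `strict` argument are hypothetical names, fixed-column ATOM records are assumed (chain ID in column 22, residue number in columns 23-26), and a bare `TER` line is emitted rather than a fully populated record.

```python
def insert_ter(lines, strict=False):
    """Return PDB lines with TER placed after each run of ATOM records.

    Hypothetical sketch, not the pdb_tidy implementation.
    A TER is added when the chain ID changes or when the run of ATOM
    records ends. With strict=True, a TER is also added where the
    residue number jumps by more than one inside a chain.
    """
    out = []
    prev_chain = None
    prev_resid = None
    in_atoms = False  # True while the previous kept line was an ATOM record
    for line in lines:
        record = line[:6].strip()
        if record == "TER":
            continue  # drop existing TER records and re-derive them
        if record == "ATOM":
            chain = line[21]
            resid = int(line[22:26])
            if in_atoms:
                new_chain = chain != prev_chain
                gap = strict and not new_chain and resid - prev_resid > 1
                if new_chain or gap:
                    out.append("TER")
            prev_chain, prev_resid, in_atoms = chain, resid, True
        else:
            if in_atoms:
                out.append("TER")  # close the ATOM run before any other record
            in_atoms = False
        out.append(line)
    if in_atoms:
        out.append("TER")
    return out
```

Note that this simple version also closes a chain when a HETATM record interrupts a run of ATOM records, so modified residues stored as HETATM inside a polymer would need extra handling.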
Yes - better too few than too many TER statements.
Adding a TER statement at any chain break (even within a chain) is a dangerous thing, since it implies there is a real end of the chain there, meaning some software will interpret it as a charged terminus.
and additionally a TER between chain breaks (non-continuous numbering in ATOM) using the strict options, which I think already exists, right?
Software that interprets the PDB format should cross-relate the TER records and the SEQRES to decide if it's the true break or not, but it's unlikely that this behaviour covers PDBs obtained from non-experimental methods; in that case (older) tools might indeed just assume it's the OXT.
+1 for fewer TER records for the sake of compatibility, but the bug above is still relevant.
Any news on this? Is it still relevant or implemented already?
Describe the bug
pdb_tidy removes the TER record between chains and removes the last ENDMDL in a multi-model PDB.
To Reproduce
test.pdb
pdb_tidy test.pdb > tidy.pdb
Expected behavior
The TER records between the chains should be kept, and the last ENDMDL should be kept.
Desktop (please complete the following information):