samirelanduk / atomium

Python macromolecular parsing (with .pdb/.cif/.mmtf parsing and production)
https://atomium.bio
MIT License

PDB format output with numbers as chain ID #43

Open fwaibl opened 1 year ago

fwaibl commented 1 year ago

Hi. I am using atomium to extract molecules from mmCIF files and write them into PDB format. Generally, this works really well, but I encountered an issue with structures where the chain ID is a number instead of a letter.

Expected behaviour

The chain ID should not be written as part of the residue number, but only in the column reserved for the chain ID.
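
For reference, the PDB format lays out ATOM/HETATM records in fixed columns: the chain identifier occupies the single column 22, and the residue sequence number occupies columns 23-26. The sketch below reads those fields by column slicing. The helper name `split_atom_record` and the one-letter chain "A" are illustrative assumptions, not part of atomium.

```python
def split_atom_record(line):
    """Slice the identifying fields of an ATOM/HETATM record by fixed column."""
    return {
        "record": line[0:6].strip(),       # columns 1-6
        "serial": line[6:11].strip(),      # columns 7-11
        "atom_name": line[12:16].strip(),  # columns 13-16
        "res_name": line[17:20].strip(),   # columns 18-20
        "chain_id": line[21:22],           # column 22 (exactly one character)
        "res_seq": line[22:26].strip(),    # columns 23-26
        "x": line[30:38].strip(),          # columns 31-38
    }


# A well-formed record for residue 308, built field by field so the widths
# are explicit. The chain letter "A" is a hypothetical stand-in.
example = (
    f"{'HETATM':<6}{20582:>5} {' NB ':4}{' ':1}{'KC1':>3} {'A':1}{308:>4}{' ':1}{'':3}"
    f"{208.930:8.3f}{314.544:8.3f}{325.109:8.3f}{1.00:6.2f}{90.18:6.2f}"
    f"{'':10}{'N':>2}{'':2}"
)

fields = split_atom_record(example)
print(fields["chain_id"], fields["res_seq"], fields["x"])  # A 308 208.930
```

Because column 22 holds only one character, a two-character chain ID such as "13" cannot be stored there unchanged; any PDB writer has to truncate, remap, or reject it.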

Actual behaviour

When the chain ID is a number, it is written into the PDB string twice (once as the chain ID and once as part of the residue number). The resulting lines are too wide for the PDB specification and are misparsed by many other programs.

Example code to reproduce

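import atomium

# Fetch the entry and select the ligand whose ID starts with chain "13"
cif = atomium.fetch("6L4T")
lig = [l for l in cif.model.ligands() if l.id == "13.308"][0]
# Serialise only that ligand to PDB-format text
print(atomium.pdb.structure_to_pdb_string(lig))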

Output (truncated):

HETATM20582  NB  KC1 1313308     208.930 314.544 325.109  1.00 90.18           N  
HETATM20583  ND  KC1 1313308     205.979 312.067 326.352  1.00 90.18           N  
HETATM20584  C1A KC1 1313308     208.131 312.489 328.676  1.00 90.18           C  
HETATM20585  C1B KC1 1313308     209.880 315.122 325.835  1.00 90.18           C  
HETATM20586  C1C KC1 1313308     206.761 314.055 322.987  1.00 90.18           C  
HETATM20587  C1D KC1 1313308     204.767 311.511 325.824  1.00 90.18           C  

Note that the chain ID ("13") is written twice.
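
One way to see why other programs misread these lines: PDB coordinates are fixed-width 8.3 fields in columns 31-38, 39-46 and 47-54, so their decimal points should sit in columns 35, 43 and 51. In the output above, seven characters occupy the space meant for five (chain ID plus residue number), which pushes every later field two columns to the right. A minimal check under that assumption follows; `coords_aligned` is a hypothetical helper, not an atomium function.

```python
def coords_aligned(line):
    """Return True if the x, y, z decimal points sit in columns 35, 43 and 51.

    PDB coordinates are 8.3 fixed-width fields starting at column 31, so a
    record whose earlier fields overflow will fail this check.
    """
    return len(line) >= 54 and all(line[i] == "." for i in (34, 42, 50))


# First record from the output above (columns after z omitted).
reported = "HETATM20582  NB  KC1 1313308     208.930 314.544 325.109"

print(coords_aligned(reported))  # False: every field after column 21 is shifted
print(reported[21:28])           # 1313308 (seven characters where five belong)
```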

Python Version/Operating System

I am using atomium 1.0.11 (from conda-forge) with Python 3.10 on Linux.

Thanks in advance for your support, and thanks for publishing atomium as open-source :-)

samirelanduk commented 1 year ago

Thanks for flagging this - atomium 2.0.0 is nearing completion, so I will fix this issue for that release (likely next month) if it isn't already fixed there. I've overhauled the way saving is done generally.

fwaibl commented 1 year ago

OK, thanks for the info. I'm looking forward to the new version. Until then, I can work around it.
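
The thread does not say which workaround was used. One possible stopgap, sketched here purely as an illustration, is to post-process the generated text and rebuild columns 22-26 by hand. It assumes that the ligand ID "13.308" means chain 13, residue 308, that columns 1-21 of each record are intact, and that no insertion code is present. `repair_record` and the replacement chain letter are hypothetical choices, not atomium API.

```python
def repair_record(line, res_num, new_chain="A"):
    """Rebuild columns 22-26 of an over-wide ATOM/HETATM record.

    Assumes columns 1-21 are intact, the record has no insertion code, and
    the overflowing chain/residue text is one unbroken run of characters
    starting at column 22. Only apply this to records known to be broken.
    """
    if len(new_chain) != 1:
        raise ValueError("a PDB chain ID occupies exactly one column")
    if not -999 <= res_num <= 9999:
        raise ValueError("residue number does not fit in columns 23-26")
    head, rest = line[:21], line[21:]
    # Drop the overflowing run; what remains starts at the insertion-code column.
    overflow = rest.split(" ", 1)[0]
    tail = rest[len(overflow):]
    return f"{head}{new_chain}{res_num:>4}{tail}"


# First record from the output in the report.
broken = "HETATM20582  NB  KC1 1313308     208.930 314.544 325.109  1.00 90.18           N  "
fixed = repair_record(broken, res_num=308, new_chain="A")
print(fixed)
```

Replacing "13" with a single letter discards the original chain name and can collide with an existing chain that already uses that letter, so the mapping needs to be chosen per structure.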