samirelanduk / atomium

Python macromolecular parsing (with .pdb/.cif/.mmtf parsing and production)
https://atomium.bio
MIT License

HETATM records change to ATOM when saving model #30

Closed (hippolytej closed 3 years ago)

hippolytej commented 3 years ago

Hey, first of all thanks for this great tool.

I found the following issue with this use case: I'd like to extract a single model from a multi-model structure and save it as a PDB file.

Expected behaviour

The output file keeps the distinction between HETATM and ATOM records:

HETATM    1  C   FVA A   1      -3.595   0.079   3.555  1.00  0.00           C  
HETATM    2  N   FVA A   1      -2.330  -0.205   1.496  1.00  0.00           N  
HETATM    3  O   FVA A   1      -3.911  -1.055   3.906  1.00  0.00           O  
HETATM    4  CA  FVA A   1      -3.501   0.435   2.070  1.00  0.00           C  
HETATM    5  CB  FVA A   1      -4.752   0.042   1.281  1.00  0.00           C  
HETATM    6  CG1 FVA A   1      -5.974   0.826   1.764  1.00  0.00           C  
HETATM    7  CG2 FVA A   1      -4.535   0.232  -0.221  1.00  0.00           C  
HETATM    8  O1  FVA A   1      -1.445   1.720   0.692  1.00  0.00           O  
HETATM    9  CN  FVA A   1      -1.409   0.501   0.859  1.00  0.00           C  
ATOM     10  N   GLY A   2      -3.315   1.072   4.387  1.00  0.00           N  
ATOM     11  CA  GLY A   2      -3.364   0.879   5.826  1.00  0.00           C  
ATOM     12  C   GLY A   2      -2.503   1.917   6.549  1.00  0.00           C  
ATOM     13  O   GLY A   2      -3.009   2.947   6.992  1.00  0.00           O  
ATOM     14  N   ALA A   3      -1.218   1.610   6.645  1.00  0.00           N  
ATOM     15  CA  ALA A   3      -0.282   2.504   7.305  1.00  0.00           C  
ATOM     16  C   ALA A   3       1.118   2.286   6.729  1.00  0.00           C  
ATOM     17  O   ALA A   3       1.488   1.161   6.395  1.00  0.00           O  
ATOM     18  CB  ALA A   3      -0.333   2.271   8.817  1.00  0.00           C  

Actual behaviour

All atom coordinates are written as ATOM records, including those of the FVA residue that were HETATM in the original:

ATOM      1  C   FVA A   1      -3.595   0.079   3.555  1.00   0.0           C  
ATOM      2  N   FVA A   1      -2.330  -0.205   1.496  1.00   0.0           N  
ATOM      3  O   FVA A   1      -3.911  -1.055   3.906  1.00   0.0           O  
ATOM      4  CA  FVA A   1      -3.501   0.435   2.070  1.00   0.0           C  
ATOM      5  CB  FVA A   1      -4.752   0.042   1.281  1.00   0.0           C  
ATOM      6  CG1 FVA A   1      -5.974   0.826   1.764  1.00   0.0           C  
ATOM      7  CG2 FVA A   1      -4.535   0.232  -0.221  1.00   0.0           C  
ATOM      8  O1  FVA A   1      -1.445   1.720   0.692  1.00   0.0           O  
ATOM      9  CN  FVA A   1      -1.409   0.501   0.859  1.00   0.0           C  
ATOM     10  N   GLY A   2      -3.315   1.072   4.387  1.00   0.0           N  
ATOM     11  CA  GLY A   2      -3.364   0.879   5.826  1.00   0.0           C  
ATOM     12  C   GLY A   2      -2.503   1.917   6.549  1.00   0.0           C  
ATOM     13  O   GLY A   2      -3.009   2.947   6.992  1.00   0.0           O  
ATOM     14  N   ALA A   3      -1.218   1.610   6.645  1.00   0.0           N  
ATOM     15  CA  ALA A   3      -0.282   2.504   7.305  1.00   0.0           C  
ATOM     16  C   ALA A   3       1.118   2.286   6.729  1.00   0.0           C  
ATOM     17  O   ALA A   3       1.488   1.161   6.395  1.00   0.0           O  
ATOM     18  CB  ALA A   3      -0.333   2.271   8.817  1.00   0.0           C  

Example code to reproduce

import atomium

# Fetch the multi-model structure 1GRM
structure = atomium.fetch("1GRM")
# Take the first model and save it on its own as a PDB file
mod = structure.models[0]
mod.save("1GRM_model1.pdb")

Then compare with https://files.rcsb.org/view/1GRM.pdb
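To make the comparison less manual, here is a minimal sketch that tallies ATOM and HETATM records per residue name and reports which residue names lost their HETATM records. It uses only the standard library and reads the fixed columns of the PDB format (record name in columns 1-6, residue name in columns 18-20). The helper names `record_types_by_residue` and `changed_residues` are made up for illustration and are not part of atomium.

```python
from collections import Counter


def record_types_by_residue(pdb_text):
    """Count coordinate records per (residue name, record type) in PDB text."""
    counts = Counter()
    for line in pdb_text.splitlines():
        record = line[:6].strip()          # columns 1-6: record name
        if record in ("ATOM", "HETATM"):
            res_name = line[17:20].strip()  # columns 18-20: residue name
            counts[(res_name, record)] += 1
    return counts


def changed_residues(original_text, saved_text):
    """Residue names with HETATM records in the original but none in the saved text."""
    before = record_types_by_residue(original_text)
    after = record_types_by_residue(saved_text)
    het_before = {name for name, rec in before if rec == "HETATM"}
    het_after = {name for name, rec in after if rec == "HETATM"}
    return sorted(het_before - het_after)
```

Assuming the original file has been downloaded locally as `1GRM.pdb` (a hypothetical filename), the check could be run as `changed_residues(open("1GRM.pdb").read(), open("1GRM_model1.pdb").read())`. The tally covers every model in the original file, which is enough for a residue-name comparison like this one.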

Python Version/Operating System

Python 3.6, atomium 1.0.7

samirelanduk commented 3 years ago

Thanks for flagging this - should be a relatively straightforward fix.