rcsb / mmtf-python

The python implementation of the MMTF API, decoder and encoder.
http://mmtf.rcsb.org/
Apache License 2.0

Missing bonds #62

Closed dkoes closed 1 year ago

dkoes commented 1 year ago

In PDB 6C94 there are bonds to the FE in the HEM residue from other residues (e.g. V16 and C448). There are CONECT records for these in the PDB file, but they are not present in the MMTF bond_atom_list. Are they stored somewhere else in the MMTF file? If not, what are the rules for omitting bonds?

pwrose commented 1 year ago

Bonds to metals are included if they are part of a chemical component. Other bonds to metals, e.g., from interacting residues, are not included. These bonds are omitted because they are specified inconsistently, and sometimes incorrectly, throughout the archive.
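
As an illustration of that distinction: in the MMTF data model, bonds inside a chemical component are stored per group type with group-local atom indices, while bonds between residues are stored in the top-level bond_atom_list as a flat list of global atom index pairs. The sketch below merges both into global index pairs. It assumes plain Python containers shaped like the decoder's group_list, group_type_list and bond_atom_list; the function name global_bonds is hypothetical and not part of mmtf-python.

```python
def global_bonds(group_list, group_type_list, bond_atom_list):
    """Return every bond as a (atom_i, atom_j) pair of global atom indices."""
    bonds = []
    offset = 0  # global index of the first atom of the current group
    for type_index in group_type_list:
        group = group_list[type_index]
        local = group["bondAtomList"]  # flat pairs, indices local to the group
        for k in range(0, len(local), 2):
            bonds.append((offset + local[k], offset + local[k + 1]))
        offset += len(group["atomNameList"])
    # Inter-group bonds are already expressed in global atom indices.
    for k in range(0, len(bond_atom_list), 2):
        bonds.append((bond_atom_list[k], bond_atom_list[k + 1]))
    return bonds
```

A metal coordination bond from a neighbouring residue would have to appear in the second loop; when it is absent from bond_atom_list, it is not stored anywhere else in the structure.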

It's best to calculate these metal interactions based on distance criteria and handle them in a similar way to hydrogen bonds.
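
A minimal sketch of such a distance-based search is shown below. The metal and donor element sets, the 2.8 Å cutoff, and the function name metal_contacts are all assumptions chosen for illustration; none of them come from MMTF or from this thread, and appropriate values depend on the metal and the ligand atom.

```python
import math

# Hypothetical defaults for illustration only.
METALS = {"FE", "ZN", "MG", "CA", "MN", "CU", "NI", "CO"}
DONORS = {"N", "O", "S"}


def metal_contacts(elements, coords, groups=None, cutoff=2.8):
    """Return (metal_index, donor_index, distance) for close metal-donor pairs.

    elements: element symbol per atom; coords: (x, y, z) per atom in Angstrom.
    groups: optional group (residue) index per atom; when given, pairs inside
    the same group are skipped, since those bonds come from the component.
    """
    contacts = []
    for i, elem in enumerate(elements):
        if elem.upper() not in METALS:
            continue
        for j, other in enumerate(elements):
            if other.upper() not in DONORS:
                continue
            if groups is not None and groups[i] == groups[j]:
                continue
            distance = math.dist(coords[i], coords[j])
            if distance <= cutoff:
                contacts.append((i, j, distance))
    return contacts
```

The double loop is quadratic in the number of atoms, which is fine for a handful of metals; for whole-archive processing a spatial index would be the usual replacement.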

Here is a description from the MMTF paper:

"Bonds and bond orders for both standard and non-standard residues, e.g., ligands, are included from the Chemical Component Dictionary [17] and additional covalent bonds (struct_conn category in the PDBx/mmCIF files), such as disulfide bonds or covalent bonds between ligands and polymers are also included in MMTF. Metal coordination and hydrogen bond information is not included in MMTF, since there are no generally agreed upon standards how to define them. Fig 2 describes the creation of an MMTF file from a PDBx/mmCIF archive file."

dkoes commented 1 year ago

Thank you for the explanation. Where is the code that does the conversion from mmCIF to MMTF? I couldn't find it, and it would be good to see exactly what conditions trigger the bond removal.

pwrose commented 1 year ago

BioJava is used to convert the mmCIF files to MMTF files.

The BondMaker class creates the bonds. It uses the struct_conn.conn_type_id field to select the bond types that are consistently and correctly represented in the mmCIF files. This excludes hydrogen bonds (hydrog) and metal center bonds (metalc). (see https://github.com/biojava/biojava/blob/457b6d833618b0bb69362276d852a34587e7b61f/biojava-structure/src/main/java/org/biojava/nbio/structure/io/BondMaker.java#L55-L70)
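
The selection described above can be sketched in Python as an allowlist filter over struct_conn rows. This is not a port of BondMaker: the KEPT_TYPES set is an assumption based on the paper's mention of disulfide and covalent bonds, the function name select_bonds is hypothetical, and the example rows are made up. The linked BondMaker lines hold the actual set of accepted types.

```python
# Assumed allowlist for illustration; check BondMaker for the real one.
KEPT_TYPES = {"covale", "disulf"}


def select_bonds(struct_conn_rows, kept_types=KEPT_TYPES):
    """Keep only struct_conn rows whose conn_type_id is in the allowlist."""
    return [
        row for row in struct_conn_rows
        if row["conn_type_id"].lower() in kept_types
    ]


# Made-up rows using mmCIF struct_conn item names.
rows = [
    {"conn_type_id": "disulf", "ptnr1_label_comp_id": "CYS", "ptnr2_label_comp_id": "CYS"},
    {"conn_type_id": "metalc", "ptnr1_label_comp_id": "CYS", "ptnr2_label_comp_id": "HEM"},
    {"conn_type_id": "hydrog", "ptnr1_label_comp_id": "DG", "ptnr2_label_comp_id": "DC"},
]
kept = select_bonds(rows)
```

With these rows only the disulf entry survives; the metalc row, which is the kind of entry that would describe a CYS to HEM iron contact, is dropped before any bond is created.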