hildebrandtlab / BiochemicalAlgorithms.jl

The Biochemical Algorithms Library in Julia
MIT License

Alternative Conformation #25

Open jeleclaire opened 1 year ago

jeleclaire commented 1 year ago

PDB and PubChem files often contain alternative locations for atom coordinates (or entire alternative conformations). We need mechanisms to switch between these alternative conformations, e.g., to generate an alternative atoms data frame. This is of particular interest for the multiple variants present in the Fragment database.

jeleclaire commented 1 year ago

https://github.com/hildebrandtlab/BiochemicalAlgorithms.jl/blob/a6eeffc5c5ca02da54c80a2099a5db5b73c04d81/src/fileformats/pubchem_json.jl#L527-L544

PubChem files can consist of several compounds, which in turn can comprise several alternative conformations. Alternative conformations are stored in the coordinates section of the file. Currently, our implementation stores the different conformations of a molecule by writing all atom coordinates into the data frame; conformations are distinguished by setting the field _frameid of the atom tuple to the index of the conformation.
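To make the storage scheme above concrete, here is a minimal sketch in Python (not the library's Julia code). The row layout and the names `frame_id` and `atoms_by_frame` are assumptions made for illustration; only the idea that every conformation's coordinates share one flat table and are told apart by a frame index comes from the description above.

```python
from collections import defaultdict

# Hypothetical flat atom table: one row per atom per conformation.
# Field names and coordinates are illustrative, not the library's schema.
atoms = [
    {"number": 1, "element": "C", "frame_id": 1, "r": (0.0, 0.0, 0.0)},
    {"number": 2, "element": "O", "frame_id": 1, "r": (1.2, 0.0, 0.0)},
    {"number": 1, "element": "C", "frame_id": 2, "r": (0.1, 0.0, 0.0)},
    {"number": 2, "element": "O", "frame_id": 2, "r": (1.1, 0.3, 0.0)},
]


def atoms_by_frame(rows):
    """Group atom rows by their frame_id, preserving row order."""
    frames = defaultdict(list)
    for row in rows:
        frames[row["frame_id"]].append(row)
    return dict(frames)


frames = atoms_by_frame(atoms)
second_conformation = frames[2]  # all atoms of the second conformation
```

Switching conformations then amounts to picking a different key from `frames`.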

However, in PDB files the _frameid is set to the model number: https://github.com/hildebrandtlab/BiochemicalAlgorithms.jl/blob/a6eeffc5c5ca02da54c80a2099a5db5b73c04d81/src/fileformats/PDB.jl#L81

The model number (or ID) is only available if all atoms of a structure have an alternative location, according to the PDB web site:

"In some cases selected residues or parts of residues may have alternate locations as determined by the experiment. Each alternate location of a particular atom is differentiated with a unique Alt ID. For example, the residue number Ser 9 in Chain D in PDB entry 1trz has two atoms, each with alternate IDs A and B. When all the atoms of a structure have multiple locations, they are presented as multiple models and assigned unique Model IDs, often seen in NMR structures (e.g., PDB ID 2kpq)."
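The difference between Alt IDs and Model IDs shows up directly in the fixed-column PDB records: the alternate location indicator sits in column 17 of each ATOM/HETATM record, while the model number comes from a preceding MODEL record. The following Python sketch reads both. It is an illustration and not the library's parser: the helper names are hypothetical, the coordinates are made up, and treating files without MODEL records as model 1 is an assumption.

```python
def atom_line(serial, name, alt_loc, res_name, chain, res_seq, x, y, z):
    """Format a minimal fixed-column ATOM record (made-up test data)."""
    return (f"ATOM  {serial:5d} {name:<4}{alt_loc:1}{res_name:>3} {chain}"
            f"{res_seq:4d}    {x:8.3f}{y:8.3f}{z:8.3f}")


def model_line(number):
    """Format a MODEL record; the model serial occupies columns 11-14."""
    return f"MODEL     {number:4d}"


def parse_records(lines):
    """Collect atom name, alt ID, model number, and coordinates per atom."""
    model = 1  # assumption: a file without MODEL records counts as model 1
    atoms = []
    for line in lines:
        record = line[0:6].strip()
        if record == "MODEL":
            model = int(line[10:14])
        elif record in ("ATOM", "HETATM"):
            atoms.append({
                "name": line[12:16].strip(),
                "alt_loc": line[16].strip() or None,  # column 17
                "model": model,
                "r": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
            })
    return atoms


def select_alt_loc(atoms, alt_id):
    """Keep atoms without an Alt ID plus those matching the chosen one."""
    return [a for a in atoms if a["alt_loc"] in (None, alt_id)]


lines = [
    model_line(2),
    atom_line(1, " CA ", " ", "SER", "D", 9, 1.0, 2.0, 3.0),
    atom_line(2, " OG ", "A", "SER", "D", 9, 1.5, 2.5, 3.5),
    atom_line(3, " OG ", "B", "SER", "D", 9, 1.7, 2.2, 3.9),
]
records = parse_records(lines)
variant_b = select_alt_loc(records, "B")
```

Here the model number applies to every atom that follows the MODEL record, whereas the Alt ID varies per atom, which is why a single index cannot carry both pieces of information.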

What exactly should _frameid describe? https://github.com/hildebrandtlab/BiochemicalAlgorithms.jl/blob/a6eeffc5c5ca02da54c80a2099a5db5b73c04d81/src/core/atom.jl#L3-L15

Should it describe the different frames of an MD simulation? In that case, I would suggest adding a field _conformationid to the tuple Atom:
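One way to picture this suggestion, sketched in Python for illustration rather than in the library's Julia: the field list, the defaults, and the `select` helper below are hypothetical and do not reproduce the actual Atom definition. The sketch only shows how a separate conformation index would let trajectory frames and alternative conformations vary independently.

```python
from typing import NamedTuple, Tuple


class Atom(NamedTuple):
    """Hypothetical atom record; not the library's actual Atom tuple."""
    number: int
    element: str
    r: Tuple[float, float, float]
    frame_id: int = 1          # assumed meaning: time step of a trajectory
    conformation_id: int = 1   # assumed meaning: alternative conformation


def select(atoms, frame_id=1, conformation_id=1):
    """Return the atoms belonging to one frame and one conformation."""
    return [
        a for a in atoms
        if a.frame_id == frame_id and a.conformation_id == conformation_id
    ]


atoms = [
    Atom(1, "C", (0.0, 0.0, 0.0)),
    Atom(1, "C", (0.2, 0.0, 0.0), conformation_id=2),
    Atom(1, "C", (0.0, 0.1, 0.0), frame_id=2),
]

alternative = select(atoms, frame_id=1, conformation_id=2)
```

With two independent indices, a PDB model number or an MD time step could map to `frame_id`, while PubChem conformers or PDB Alt IDs could map to `conformation_id`.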