NMRLipids / NMRlipidsIVPEandPG

NMRlipids IV project, PE and PG lipids
GNU General Public License v2.0

Molecule names in the databank #17

ohsOllila closed this issue 3 years ago

ohsOllila commented 4 years ago

To speed up the analysis of the trajectories in the NMRlipids IVb project, we will build a beta version of the NMRlipids databank containing the trajectories related to this project. For this, and for the future, we need to decide how the system composition of each simulation will be stored in the databank. The beta version of the databank builder code is here and the databank itself is here.

The current plan is to have a list of molecule abbreviations, each of which uniquely identifies a molecule, such as POPC. By default, we assume that these names are used in the simulation; if this is not the case, the contributor has to change the names when adding simulations. Currently we have the following molecules, and the list will grow as we add more data.

@POPC=POPC
@POPG=POPG
@POPS=POPS
@POPE=POPE
@POT=POT
@SOD=SOD
@CLA=CLA
@CAL=CAL
@TIP3=TIP3
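To illustrate how such a list could be used, here is a minimal Python sketch that reads `@KEY=NAME` entries into a dictionary and lets a contributor override the defaults. The function name `parse_molecule_names` and the override values `NA` and `SOL` are hypothetical and chosen only for illustration; this is not the databank builder's actual code.

```python
import re

# Default entries as listed above: databank abbreviation -> name used in the simulation.
DEFAULT_ENTRIES = (
    "@POPC=POPC @POPG=POPG @POPS=POPS @POPE=POPE "
    "@POT=POT @SOD=SOD @CLA=CLA @CAL=CAL @TIP3=TIP3"
)


def parse_molecule_names(text):
    """Parse whitespace-separated '@KEY=NAME' entries into a dict (hypothetical helper)."""
    names = {}
    for key, value in re.findall(r"@(\w+)=(\w+)", text):
        names[key] = value
    return names


defaults = parse_molecule_names(DEFAULT_ENTRIES)

# Assumed example: a simulation that names sodium 'NA' and water 'SOL'.
# Only the entries that differ from the defaults need to be given.
overrides = parse_molecule_names("@SOD=NA @TIP3=SOL")

# Later entries win, so the contributor's names replace the defaults.
molecule_names = {**defaults, **overrides}
```

With this layout the keys stay fixed across the databank, and only the values change from one simulation to another.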

During the online discussion, it was pointed out that different parts of the same molecule may sometimes have different "residue" names. This is the case at least in the current Amber convention. Because we aim to use these names only to count the molecules in each system while generating the dictionary, it is sufficient to name only one of these residues here. For example, for POPC it would be enough to name only the headgroup. To handle such cases during the analysis, we will add a third column to the mapping file, giving the atom-specific residue names.
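The counting idea and the proposed third column can be sketched as follows. This is only an illustration under stated assumptions: the residue names `PA`, `PC` and `OL` stand in for a POPC split into several residues, the atom names in the mapping lines are made up, and the functions `count_molecules` and `parse_mapping` are hypothetical rather than part of the databank code.

```python
def count_molecules(residues, residue_name):
    """Count molecules by counting residues with the given name.

    residues: iterable of (residue_id, residue_name) pairs, one per residue.
    Naming a single residue per molecule (e.g. the headgroup) is enough here.
    """
    return sum(1 for _, name in residues if name == residue_name)


def parse_mapping(text):
    """Parse mapping lines: generic atom name, simulation atom name, optional residue name."""
    mapping = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or incomplete lines
        generic, sim_atom = fields[0], fields[1]
        # The third column is optional; None means the molecule has one residue name.
        resname = fields[2] if len(fields) > 2 else None
        mapping[generic] = (sim_atom, resname)
    return mapping


# Assumed system: two lipids, each split into three residues.
residues = [(1, "PA"), (2, "PC"), (3, "OL"), (4, "PA"), (5, "PC"), (6, "OL")]
n_lipids = count_molecules(residues, "PC")  # count via the headgroup residue only

# Made-up mapping lines with the proposed third column.
MAPPING_TEXT = """\
M_G3N6_M N31 PC
M_G1C3_M C12 PA
M_G2C3_M C12 OL
"""
mapping = parse_mapping(MAPPING_TEXT)
```

Counting only the headgroup residue gives one count per lipid, while the third column lets the analysis select atoms that share a name but belong to different residues.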

Any comments or opinions about this are welcome.

ohsOllila commented 3 years ago

The databank has now been moved to its own repository (https://github.com/NMRLipids/Databank), and this issue has been solved approximately as described above. Therefore, I will close this issue.