NMRLipids / NMRlipidsIVPEandPG

NMRlipids IV project, PE and PG lipids
GNU General Public License v2.0

Isomers of PG lipids #38

Closed ohsOllila closed 3 years ago

ohsOllila commented 3 years ago

PG from Avanti is racemic. We should check which isomers are present in the simulations.

@patrickfuchs mentioned in the meeting that he may have code or a function available for this check. Combined with the databank, that would help the analysis.

pbuslaev commented 3 years ago

I once uploaded a simple Tcl script to test this. I can rewrite it in Python if needed.

patrickfuchs commented 3 years ago

Hi @pbuslaev, in your Tcl script you use the VMD function measure imprp. Do you have it in Python? On my side, I thought about it. In buildH, in the function where we reconstruct the two H on a CH2 carbon from the C(i-1)-C(i)-C(i+1) trace, we can easily calculate this angle (https://github.com/patrickfuchs/buildH/blob/362babf9fcfa76f405e520695e929d15de899d84/buildh/hydrogens.py#L48) without too much work. Apart from that, it'll be trivial to tell which H is pro-R and which is pro-S in the near future (this is something we discussed a while ago).

pbuslaev commented 3 years ago

Hi @patrickfuchs, no, I currently don't have it in Python. But that should be easy to implement. I am currently modifying the script for dihedral angle calculation to make it work with MDAnalysis, and I can add an option for impropers there. However, if the isomers need to be checked asap, the Tcl script is already there and could be used.

patrickfuchs commented 3 years ago

Just to make it clear, when I said "without too much work", I meant it's just adding one line of code to the function above to get the angle. Thus, if you want, I can send you this function which you can incorporate into your script that goes over a trajectory. Let me know.

pbuslaev commented 3 years ago

Sure, that would be great if you send me the function.

pbuslaev commented 3 years ago

@ohsOllila Could you please share the list of POPG trajectories that need to be analyzed?

ohsOllila commented 3 years ago

Technically I do not have such a list because all these trajectories are already indexed in the new databank format. I have added a template Python script that goes through the trajectories in the databank that contain POPG: https://github.com/NMRLipids/NMRlipidsIVPEandPG/blob/master/scripts/calcISOMERS.py

This script goes through all the data in the databank. If there is POPG in the simulation, it downloads the trajectory, makes the lipids whole, looks up the mapping file, etc. Then you are ready to call your analysis function or write the analysis within the code (this is where the comment on lines 129-140 is located). This is how analysis from the NMRlipids databank is supposed to work. Let me know if you have any questions. If you have your function, I can also try to put it inside this code.

pbuslaev commented 3 years ago

Hi @ohsOllila,

If I understand it correctly, to test the enantiomers of POPG, it is enough to calculate the M_G3C4_M-M_G3C5_M-M_G3C5O1_M-M_G3C6_M improper for the 1' glycerol moiety. The L- and D-enantiomers should result in two different peaks.

Here is the script that iterates over all trajectories with POPG and calculates this improper. I tested it for one trajectory and there is only one peak observed. Unfortunately I can't run the whole analysis on my machine currently. So if you find the script useful, please use it.
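The idea behind the improper check can be illustrated with a minimal, self-contained sketch. This is not the script referred to above; the function name `dihedral_deg` and the coordinates are hypothetical and chosen only for illustration. It computes the signed dihedral angle over four positions and shows that the mirror image of the same geometry gives the opposite sign, which is why two enantiomers would show up as two separate peaks.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four points, in the given order."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Project b0 and b2 onto the plane perpendicular to the central vector b1.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def mirror_z(p):
    """Reflect a point through the xy-plane (turns a structure into its mirror image)."""
    return (p[0], p[1], -p[2])


# Hypothetical coordinates standing in for the four atoms of the improper.
coords = [(1.0, 0.0, 0.5), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 1.0, 1.0)]

angle = dihedral_deg(*coords)
mirrored_angle = dihedral_deg(*(mirror_z(p) for p in coords))
print(angle, mirrored_angle)
```

In an actual analysis the four positions would come from the atoms named in the mapping file for each POPG molecule; which sign corresponds to which enantiomer depends on the atom order and has to be checked against a structure of known configuration.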

patrickfuchs commented 3 years ago

Hi @ohsOllila, @pbuslaev, I think I have a function that is able to infer the R/S configuration of an asymmetric carbon. It took more time because it was slightly more difficult than expected. In fact, I don't use the classical improper definition that we use for force fields. Instead I use an angle (that I call theta) defined from the vector bisecting the angle formed by three atoms: the asymmetric carbon and its two heaviest substituents. Let me explain using a figure (I generated this from a CHARMM trajectory; this snapshot corresponds to the 1st POPG from the tpr file of the system, and the asymmetric carbon is called C12):

[Figure: infer_config_RS_POPG]

The numbers in green indicate the 4 substituents in order of descending mass. Three vectors have to be determined:

In the end, I calculate the angle theta between vec_bissect and the vector from the central carbon to the 3rd heaviest atom. If theta is positive, the carbon is R; if it is negative, the carbon is S.

Do you agree with my reasoning? The Python function takes as arguments the central carbon and the 4 substituents in order of descending mass, as numpy arrays. The atom order is important.
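For readers who want something concrete, here is a minimal sketch of a related but different technique: the sign of the triple product (signed volume) of the substituent positions. It is not the theta-based function described above. The name `rs_label` and the coordinates are hypothetical, and it assumes that the ordering of the four substituents (here by descending mass, as in this comment) matches the Cahn-Ingold-Prelog priorities, which is not guaranteed in general.

```python
def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rs_label(s1, s2, s3, s4):
    """Return 'R' or 'S' for four substituent positions given in order of
    descending priority (s1 highest, s4 lowest).

    Uses the sign of the triple product of the vectors from s4 to s1, s2, s3.
    Looking from the side opposite to s4, a clockwise s1 -> s2 -> s3 sequence
    (R) gives a negative value and a counterclockwise sequence (S) a positive one.
    """
    volume = _dot(_sub(s1, s4), _cross(_sub(s2, s4), _sub(s3, s4)))
    if volume == 0.0:
        raise ValueError("substituents are coplanar; configuration undefined")
    return "R" if volume < 0.0 else "S"


# Hypothetical tetrahedral geometry: lowest priority at -z, the other three
# arranged clockwise when viewed from +z.
s1 = (1.0, 0.0, 1.0 / 3.0)
s2 = (-0.5, -0.866, 1.0 / 3.0)
s3 = (-0.5, 0.866, 1.0 / 3.0)
s4 = (0.0, 0.0, -1.0)

label = rs_label(s1, s2, s3, s4)
swapped_label = rs_label(s2, s1, s3, s4)  # exchanging two substituents inverts the result
print(label, swapped_label)
```

Because only the sign is used, this check does not need the central carbon position, but like the theta approach it depends entirely on passing the substituents in the right order.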

@pbuslaev since you already implemented something, do you still need my function? It'll be useful anyway for buildH to tell which H is pro-R or pro-S.

Last question, do we really need to make this check over whole trajectories? In principle, the absolute config should not change during an all-atom trajectory. And for a united-atom traj, there should be an improper preventing the switch.

pbuslaev commented 3 years ago

Hi @patrickfuchs, I agree that the method you use for defining R- and S- conformation works. But I also think that calculation of impropers works as well. I also agree that we don't need to analyze the whole trajectories.

As for buildH, I agree that it might be useful to know in which conformation H is. If needed, I can probably work on that later as well.

patrickfuchs commented 3 years ago

> Hi @patrickfuchs, I agree that the method you use for defining R- and S- conformation works. But I also think that calculation of impropers works as well. I also agree that we don't need to analyze the whole trajectories.

All right. I agree that your version, which uses impropers, also works.

> As for buildH, I agree that it might be useful to know in which conformation H is. If needed, I can probably work on that later as well.

OK, thanks. I'll let you know if we need help.

ohsOllila commented 3 years ago

Thanks @pbuslaev and @patrickfuchs,

I agree with you. I think that for NMRlipids IVb the dihedral calculation is sufficient, but how we will handle the isomers in the databank is yet to be discussed. The approach by @patrickfuchs may be useful there.

I slightly modified the code by @pbuslaev to read only the gro file, added flexibility for residue names, and ran the analysis over the PG structures in the databank. They all have the R conformation, which is the biologically abundant one (I would appreciate it if someone would double check this from the data). The updated code is here https://github.com/NMRLipids/NMRlipidsIVPEandPG/blob/master/scripts/calcISOMERS.py and the dihedral results are in the files NMRlipidsIVPEandPG/Data/dihedral/////isomer_POPG_M_G3C4_M_M_G3C5_M_M_G3C5O1_M_M_G3C6_M.dat.

I have now updated the explicit molecule names in the method section of the manuscript, and added a paragraph to the SI: "The PG headgroup is biologically abundant R enantiomer in all simulations, while our 13C NMR experiments has a racemic mixture. Nevertheless, previous 2H NMR experiments comparing results between different enantiomers concluded that the structural differences between these are minor. 3"

I think that we are done with this issue for NMRlipids IVb, but we have to come back to this in discussions about the databank.

patrickfuchs commented 3 years ago

> I think that we are done with this issue for NMRlipids IVb, but we have to come back to this in discussions about the databank.

@ohsOllila, is there already an open issue on the databank project to start the discussion, or maybe somewhere on the blog? We started to think about it in the buildH project (https://github.com/patrickfuchs/buildH/issues/63), and realized that implementing a pro-R / pro-S assignment in a consistent way would probably require changing the def files. Also, we thought that it would be nice to have consistent .def files between the NMRlipids projects and buildH.

ohsOllila commented 3 years ago

We do not have a separate repository for the databank project yet. I have opened an issue in this repository about having enantiomeric information in the databank: https://github.com/NMRLipids/NMRlipidsIVPEandPG/issues/41.

The idea is that we would not use the def files anymore in the databank. All the information should be read from the mapping files. We can continue the discussion in the new issue, but I think that we also need an online meeting about the databank soon.

patrickfuchs commented 3 years ago

Thanks @ohsOllila, I'll continue the discussion there.

ohsOllila commented 3 years ago

I think that this issue is solved for NMRlipids IV. For the databank, I have opened an issue in the databank repository: https://github.com/NMRLipids/Databank/issues/1. Therefore, I am closing this issue.