NMRLipids / NMRlipidsIVPEandPG

NMRlipids IV project, PE and PG lipids
GNU General Public License v2.0
0 stars 7 forks source link

Analysing the structural differences of headgroup between different lipids #9

Closed ohsOllila closed 3 years ago

ohsOllila commented 4 years ago

As discussed in the manuscript, CHARMM36 reproduces the essential differences in order parameters between PC, PE, PG and PS lipids. Therefore, it is reasonable to analyze the structural differences between headgroups from CHARMM36 simulations.

We already have a visualization of snapshots, but at least dihedral distributions are needed.

Also, the analysis of P-N angle with respect to membrane normal could be useful.

Also, more ideas on useful and illustrative quantities to compare between headgroups are welcomed.

pbuslaev commented 4 years ago

I have the code to calculate the distributions and I will do that soon. I will also check again the snapshots for head group conformations. I also can add the possibility to calculate P-N vector angle to the code. Will comment about all that soon

pbuslaev commented 4 years ago

Dear Samuli, I have checked the order for the structural figures and also recalculated the distributions for the head groups. It appeared that my script calculated the dihedrals incorrectly. For now I used the tcl script to calculate dihedrals and the results look reasonable. I will try to find the bug in the script and upload the new version soon. Please find the figures attached. confs cc1 co1

pbuslaev commented 4 years ago

As for P-N vector angle code - @jmelcr has commented on the related issue and I will try to use his code to plot this figures as well.

markussmiettinen commented 4 years ago

Hi Pavel!

It appeared that my script calculated the dihedrals incorrectly. For now I used the tcl script to calculate dihedrals and the results look reasonable. I will try to find the bug in the script and upload the new version soon.

Did the bug affect the PS-paper dihedrals too?

pbuslaev commented 4 years ago

Hi Markus!

No! I used the script that I used for PS paper to test the new data. And it appeared that new code has a bug. Thus I have recalculated the distributions for PE,PG and PC using the same code I used for PS paper.

The new code is more general and faster, thus I want to use it in the future. But for sure I have to make it work right.

ohsOllila commented 4 years ago

Thanks to @pbuslaev for the figure. I have now added it to the manuscript, but we still need to write the discussion. Could you also list here the trajectories (DOI links to Zenodo) used to make this figure?

pbuslaev commented 4 years ago

Hi Samuli!

As I mentioned in the email:

  1. I used my trajectory for POPC lipids https://zenodo.org/record/3474863#.Xh2646axVHc

  2. POPE-POPG - links from the PEPG manuscript for CHARMM36 trajectory POPE https://zenodo.org/record/2641987#.Xh2936axVHc POPG https://zenodo.org/record/1011096#.Xh2-AqaxVHc

  3. POPS - link from POPS manuscript for CHARMM36 with potassium ions POPS https://zenodo.org/record/1129415#.Xh29gaaxVHc


The code for the dihedral calculation is now working. I need to clean it up an then I'll upload it.


I have also used the code to get the distributions of P-N angles. Please find the normalized angle distributions attached pn1

.

ohsOllila commented 4 years ago

I finally had time to continue working on this. I have now slightly update the figures and text in the manusript based on contributions in this issue. I have here few questions to @pbuslaev:

1) Do you have the P-N angle vector calculation code within the same code as dihedral analysis? Could you share the code? I do not mind if it is not cleaned. We are also in process to build the new databank structure, see: https://github.com/NMRLipids/NMRlipidsVIpolarizableFFs/issues/3. I would like to start to think how to incorporate it there.

2) Why is the integral over normalized P-N angle distributions of different lipids not the same in the figure?

ohsOllila commented 4 years ago

While the structural differences between PS and other lipids are evident in the overlayed snapshot figures contributed here by @pbuslaev, the differences between PC, PE and PG lipids observed in dihedral distributions are not very well visible in overlayed snapshots. I am not sure if these are possible to make visible in such plots, or should we consider other kind of illustration of structural differences. I would be happy to have something more visually appealing in addition to the dihedral distributions.

pbuslaev commented 4 years ago

Dear Samuli,

  1. The code for P-N angle calculation is discussed here #11. I have slightly changed it in order to get the P-N angle trajectory. If you want this extended version of P-N angle calculation, I can share it with you.

  2. There are reasons for that.

First, for some reason I plotted only range from 0 to 90 degrees, while angles could be in range from 0 to 180. I have now changed that.

Second reason I should probably explain it in more details. P-N angle is an angle in 3D space. Thus for the fixed value of P-N angle (theta, colatitude) we have multiple possibilities for longitude (phi). And we can plot two distributions: a. p(theta). In this case the integrals should be equal, but the origins for small angle will have much less points than origins with large angle. The distributions for these case are plotted here

pn

b. p(theta,phi). In this case the integrals of p(theta,phi)*sin(theta) should be equal. The distributions are plotted here

pn1

  1. Regarding the structural representation, I can add the populations of every conformation to the plot. Do you think that might be helpful?
ohsOllila commented 4 years ago

Thanks @pbuslaev

  1. Regarding the code, we are currently working on this with @akiirik in relation to the databank, so I would be very happy if you could share the code.

  2. Thanks, this explains it. I think that I was mainly confused because the x-axis limits. I think that we should probably use the angle normalized plots. The angle averaging issue is very important (and sometimes confusing), but often ignored, when interpreting such distributions.

  3. Populations might help. If you can make a sample figure fast it would be nice. On the other hand, I think that we should analyze more dihedrals, and also try to think other ways to visualize the results.

pbuslaev commented 3 years ago

Dear Samuli,

I am very sorry for late response. Finally managed to reach the data and work on it.

  1. As for PN vector calculation, I used a slightly modified version of the code discussed in #11. https://github.com/NMRLipids/MATCH/blob/master/scripts/calcOrderParameters.py

I changed the output to collect PN angle for every frame instead of calculating averages.

My code is attached. calcOrderParameters.py.zip

  1. Here is a sample figure hg_popg1_prob

To calculate the dihedral distributions I used this script (probably slightly updated version) https://github.com/NMRLipids/NMRlipidsIVPEandPG/tree/master/scripts/show_lipid_structures

It allows calculation of all possible dihedrals. They are passed as a parameter file.

I also tried to plot dihedrals with densities, but that did not help in my opinion.

test_density

I will play with plotting again. I will share any ideas, that I have.

ohsOllila commented 3 years ago

Thanks Pavel.

We already figured out the P-N vector angle calculation and we have now a code that goes through the databank and calculates P-N vector angles for all simulations: https://github.com/NMRLipids/NMRlipidsIVPEandPG/blob/master/scripts/calcPNvectors.ipynb I have now closed the related issue https://github.com/NMRLipids/NMRlipidsIVPEandPG/issues/11.

We have also a code which goes through the databank and calculates the dihedral angles: https://github.com/NMRLipids/NMRlipidsIVPEandPG/blob/master/scripts/calcDIHEDRALS.py However, this has memory issues due to the properties of mdtraj. Maybe you would have a solution for this?

Anyway, I circumvented the memory issue for this manuscript using the gmx angle and I have calculated a lot of dihedral angles now. I think now that we will focus on the first six heavy atom dihedrals starting from the headgroup, where we see the most differences between different lipids, see figure 2 in the current manuscript. I think that the differences in O_alpha-C_alpha dihedral are difficult to visualize because it is rather flexible. Nevertheless, I think that the figure where you show the percentages could work for the C_alpha-C_beta dihedral.

Could you make a new figure where you have only the bottom line (no glycerol backbone part) of this figure including the percentages, and also changing the PG O_beta carbon to C_gamma carbon : https://raw.githubusercontent.com/NMRLipids/NMRlipidsIVPEandPG/master/Figs/PCPEPGPSstructures.png ?

Because there is no difference in the glycerol backbone region between lipids, I think that we do not need make those figures.

pbuslaev commented 3 years ago

Hi @ohsOllila!

Great that you have scripts and sorry that I did not help with that.

I guess one way to overcome memory issue, would be to keep the distribution in memory instead of dihedrals. Now, you are calculating the dihedral for the whole trajectory:

dihRESULT = []
for residue in traj.topology.residues:
    atom1ind=traj.topology.select("name == " + atom1 + " and resid == " + str(residue.index))
    atom2ind=traj.topology.select("name == " + atom2 + " and resid == " + str(residue.index))
    atom3ind=traj.topology.select("name == " + atom3 + " and resid == " + str(residue.index))
    atom4ind=traj.topology.select("name == " + atom4 + " and resid == " + str(residue.index))
    if(len(atom1ind) > 0):
        index[residue.index].append(atom1ind[0])
        index[residue.index].append(atom2ind[0])
        index[residue.index].append(atom3ind[0])
        index[residue.index].append(atom4ind[0])
    dihRESULT.append(mdtraj.compute_dihedrals(traj,[index[residue.index]]))

Instead you can calculate dihedral for only one residue and immediately recalculate it into distribution:

dihHIST = np.zeros(360)
for residue in traj.topology.residues:
    atom1ind=traj.topology.select("name == " + atom1 + " and resid == " + str(residue.index))
    atom2ind=traj.topology.select("name == " + atom2 + " and resid == " + str(residue.index))
    atom3ind=traj.topology.select("name == " + atom3 + " and resid == " + str(residue.index))
    atom4ind=traj.topology.select("name == " + atom4 + " and resid == " + str(residue.index))
    if(len(atom1ind) > 0):
        index[residue.index].append(atom1ind[0])
        index[residue.index].append(atom2ind[0])
        index[residue.index].append(atom3ind[0])
        index[residue.index].append(atom4ind[0])
    dihResid = mdtraj.compute_dihedrals(traj,[index[residue.index]]))
    dihResid = [make_positive_angles(x) for x in dihResid ]
    dihHIST += np.histogram(dihResid,np.arange(0,360,1))[0]

Hope that helps.

I will work on the figures and share them later this week.

pbuslaev commented 3 years ago

Hi @ohsOllila

Here is the plot

confs

Note, that for PS the probabilities sum up to 99.9%. That is due to the rounding error.

hsantila commented 3 years ago

At least in my case, the memory issue manifested when uploading the trajectory. I circumvented this by removing all the waters before calculating the dihedrals. Another option is to load the trajectory in chunks (https://mdtraj.org/1.9.4/api/generated/mdtraj.iterload.html#mdtraj.iterload) but this will make the analysis slower. I'd consider switching to MDAnalysis since I've never encountered memory issues there.

ohsOllila commented 3 years ago

MDAnalysis is indeed better in this respect, but it does not have a readily available function that would calculate dihedrals.

I tried to divide trajectory in chunks, but it became very slow indeed, but this could have been because of my programming skills.

Removing waters may be problematic for the solution which would generally work for the databank with simulations from different programs.

For NMRlipids IVb, it seems now that we do not need dihedrals from other than simulations run with Gromacs, but in the future, the most convenient solution may be to write a dihedral calculation function for MDAnalysis.

pbuslaev commented 3 years ago

There is a class for computing dihedrals in MDAnalysis: class MDAnalysis.analysis.dihedrals.Dihedral. I used it for my needs and I think it could be used for membranes as well.

ohsOllila commented 3 years ago

Hi @ohsOllila

Here is the plot

confs

Note, that for PS the probabilities sum up to 99.9%. That is due to the rounding error.

Thanks @pbuslaev. I have now included this into the figure 2 in the new version of the manuscript. However, a more narrow figure might work better in this layout. Could you make a version where POPG and POPS figures would on the second row, or send a version where these could be moved? I would like to have C) reaching from top to bottom and all the rest in the left column in current figure 2 in the manuscript.

pbuslaev commented 3 years ago

Hi @ohsOllila, I have modified calcDIHEDRALS.py script. Now it uses MDAnalysis instead of mdtraj. Here it is.

While testing, I also noted, that there is a typo in POPC GROMOS mapping file. I guess that M_G3P2O1_M and M_G3P2O2_M should have O instead of 0.

pbuslaev commented 3 years ago

Hi, by the way, the script calcDIHEDRALS.py that I uploaded, will help to overcome the Gromacs issue that was discussed during the NMRlipids meeting of 26.02.2021

markussmiettinen commented 3 years ago

Hi Pavel. Are you referring to the Gromacs issue with regards to calculating the density profiles and from them the form factors?

On 26. Feb 2021, at 13:51 , pbuslaev <notifications@github.com mailto:notifications@github.com> wrote:

Hi, by the way, the script calcDIHEDRALS.py that I uploaded, will help to overcome the Gromacs issue that was discussed during the NMRlipids meeting of 26.02.2021

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/NMRLipids/NMRlipidsIVPEandPG/issues/9#issuecomment-786629140, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABLU6CX3TPXCKHWQQLGNHGLTA6KNZANCNFSM4JH62VYQ.

pbuslaev commented 3 years ago

Hi @markussmiettinen . I had a feeling that there is a need to get rid of Gromacs dependency for all the analysis. Currently, as far as I understand, dihedrals are calculated with gmx angle. Thus, we are limited to the trajectories compatible with gromacs. I share a calcDIHEDRALS.py that works with MDAnalysis. So that one would solve the issue with Gromacs. That has nothing to do with the densities. But I can talk to the Gromacs core developers soon, so if you share the details of the problems with gmx density, I can raise this issue during our discussion. That probably will lead to the solution sooner.

markussmiettinen commented 3 years ago

Hi @pbuslaev! Thanks for the clarification! Concerning the problems with gmx density, @hsantila's bugreport is here and her earlier question on the mailing list here. Thanks a lot in advance for the help!

ohsOllila commented 3 years ago

The open discussions in this issue are continuing in the Databank repo (https://github.com/NMRLipids/Databank) in issues https://github.com/NMRLipids/Databank/issues/4 and https://github.com/NMRLipids/Databank/issues/2

Therefore, I close this issue now.