michellab / Sire

Sire Molecular Simulations Framework
http://siremol.org
GNU General Public License v3.0
95 stars 26 forks source link

Protein Unfolding in SOMD #353

Closed fjclark closed 2 years ago

fjclark commented 2 years ago

Hello,

I've been attempting to use SOMD for an ABFE calculation of MIF180 binding to macrophage inhibitory factor (MIF). MIF is a homotrimer composed of three chains, and MIF180 binds close to the interface of two of these. However, one chain of MIF consistently unfolds early on during 1 ns runs with SOMD at lambda = 0, but the same system stays folded when using pmemd.cuda (using the BSS.protocol.production defaults).

image

I have uploaded the files required to reproduce this issue (and the output) here https://github.com/fjclark/somd_unfolding_issue. The ligand is named 3TX in the input files. The calculation was run by calling:

somd-freenrg -C ../../input/sim.cfg -l 0.000 -p CUDA

From output/single_cycle.

I'm using ff14sb for MIF (with custom parameters for the neutral terminal prolines) and GAFF for MIF180. The system has been equilibrated in the NPT ensemble using pmemd.cuda, again using BioSimSpace defaults.

This image shows the input system (left) and the system from the first frame of the trajectory (right). image

The unfolding of the chain (see bottom right of images) seems to already be visible in the first frame of the trajectory.

@lohedges and @JenkeScheen I'd be very grateful for any advice.

Thanks very much, Finlay

lohedges commented 2 years ago

Hi Finlay,

Just to check, this is exhibiting unfolding within a single SOMD cycle? We have recently found and fixed an issue that did lead to protein unfolding with SOMD, but this was for multi-cycle simulations. (Commit is here.) The problem was that a random number seed was being used for debugging SOMD trajectories, but this led to the same seed being used for the thermostat and barostat for each SOMD cycle which led to unfolding with the BioSimSpace defaults at the time, i.e. many short cycles.

Could you confirm that you are using the latest development version of Sire?

When I get a moment I'll run your inputs locally to see if it also unfolds for me.

Cheers.

fjclark commented 2 years ago

Hi Lester,

Apologies, I forgot to specify: this is within a single cycle, and I am using the latest (2021.1.0) version of Sire.

Thanks.

lohedges commented 2 years ago

No problem, this was obvious once I looked at your input files. Just running in the background now and will let you know when it's finished.

fjclark commented 2 years ago

Brilliant, thank you.

lohedges commented 2 years ago

Just finished running and I see the same thing here. I'll try re-running with vanilla SOMD to see if the issue is isolated to the FEP implementation. One thing to note is that you are using HMR, which I am unfamiliar with in SOMD. (It is off by default with BioSimSpace.) Perhaps @jmichel80 could comment, but I'm wondering if this could lead to any stability issues. (He might be able to eyeball the SOMD config file for any potential issues.) Are you using the same approach with PMEMD?

lohedges commented 2 years ago

Ah, one thing I've noted is that the ligand specified in MORHP.discharge.pert is the third molecule in SYSTEM.top. By default, this should be the first entry in the top file. (At least, this is what I was told when writing the SOMD code in BioSimSpace.) There is a configuration option that can change this:

perturbed residue number = 1
The residue number of the molecule to morph.
lohedges commented 2 years ago

I'm trying again with a residue number of 343.

fjclark commented 2 years ago

I didn't use HMR with PMEMD and I did wonder if this might be causing issues. However, I repeated a calculation without HMR (and with a 2 fs timestep) and saw the same behaviour (although I used several cycles).

Ah ok, thanks for pointing that out.

jmichel80 commented 2 years ago

hi @fjclark it would be helpful if you can update your repo with the code used to generate the input files SYSTEM.top and SYSTEM.crd in case the issue is caused by the way the system components have been assembled.

jmichel80 commented 2 years ago

Another thing to check - I am surprised there are no counter-ions, is the net-charge of the system +0 ? PMEMD must have applied a background neutralising charge, but SOMD's reaction field treatment wouldn't.

fjclark commented 2 years ago

Hi @jmichel80 yes, the net charge of the system is +0. I will update my repo.

lohedges commented 2 years ago

I see no unfolding when I set perturbed residue number = 343 in sim.cfg. I think you were incorrectly applying the perturbation to the first MIF chain. Feel free to test at your end and let me know how you get on.

import BioSimSpace as BSS

s = BSS.IO.readMolecules(["SYSTEM.crd", "SYSTEM.top"])

print(s.search("resnum 343")[0])
<BioSimSpace.Residue: name='3TX', molecule=4, index=0, nAtoms=34>
lohedges commented 2 years ago

Although, if this is the case, it would have been nice if SOMD errored when the name of the perturbed residue number doesn't match that specified in the MORPH.pert file.

jmichel80 commented 2 years ago

Yes that's odd, I think it should have done that. See

https://github.com/michellab/Sire/blob/devel/corelib/src/libs/SireIO/perturbationslibrary.cpp#L1167-L1171

Still hoping to find time to actually run the code.

lohedges commented 2 years ago

I'm not sure, but it's definitely wrong. If I set the residue number to 343 then the OpenMMD.py file and PerturbattionsLibrary code report the name as 3TX, as expected. If I use the orginal sim.cfg, then I get PRT in both cases. PRT is indeed the first residue in the system, but MORPH.discharge.pert that is specified in sim.cfg uses 3TX as the molecule. For reference, the MORPH.prt.pert file does use PRT as the molecule name.

lohedges commented 2 years ago

Ah, the MORPH.discharge.pert contains two molecules, and I was only looking at the first. The ligand, 3TX, is at the top, followed by PRT below. This explains the PRT match, even though it (I assume) isn't the intended molecule. Maybe I'm missing how ABFE works with SOMD, but it looks like something has got messed up in the setup stage.

fjclark commented 2 years ago

OK thanks - I have tested at my end and MIF stays folded. This issue was my misunderstanding of how SOMD treats pert files:

SOMD did produce an error when I tried to run initially:

"There is no perturbations template for the molecule with name PRT available templates are [ LIG ]."

But I misunderstood how the pert files were applied - I assumed SOMD assumed that molecules with unrecognised names were ligands to be perturbed, and searched for them in the pert file. Because the first residue is non-standard (neutral terminal proline), I assumed that the error was because SOMD didn't recognise the name, so I added entries for PRT to the end of the pert file (where parameters were kept constant).

Thanks very much for your help and apologies for wasting your time with this.

lohedges commented 2 years ago

No problem. A lot of this stuff isn't fully documented and is completely hidden from the user if you use other tools to set up SOMD simulations, e.g. BioSimSpace. (Of course that doesn't do ABFE at present, though.)

jmichel80 commented 2 years ago

Not at all @fjclark , the software is flaky as many piece of academic code and it's very instructive to see what you did to go past the error message. These are valuable lessons to harden SOMD for ABFE calculations with a BioSimSpace interface. Thanks @lohedges for spotting the problem !