Simulation explodes during minimization when using Sage (openff-2.0.0.offxml) at intermediate lambda values for ethanol as solvent

pablo-osmo commented 1 year ago

Description I am trying to use openFF forcefields (Sage) to parameterize a molecular dynamics calculation of free energy of solvation of hydrofluorocarbons in ethanol through Gromacs. For di-,tri-,and mono-fluoromethane solvated in ethanol, my simulations are not able to minimize at lambda points with coulomb set to 0 and LJ set to 0.3-1.0. The simulation ends in a "Segmentation fault (core dumped)", and the last step in the log shows an infinite pressure. The simulations work with lower values of LJ and also when coulomb (and full LJ) is present.

Has Sage been parameterized to work with alchemical free energy calculations in non-water solvents? Does this seem to be an issue with my simulation parameters or the forcefield itself? I have attached all the files required to run the minimization as well as the .mdr file which contains all my simulation parameters.

To reproduce this minimization error: $GMX/gmx grompp -f min_6.mdp -c solvated.gro -p solvated_top.top -o min6.tpr

$GMX/gmx mdrun -deffnm min6

Please let me know if any additional information is needed to assist. Thanks!

EM.zip

mattwthompson commented 1 year ago

At first glance I'm not quite sure which direction to take in diagnosing this. On one hand, if the simulations run with without scaling parameters, Interchange is probably writing accurate parameters here. But if there are crashes, it could be either due to a parameter being written incorrectly, the force field not handling it well, the system not being equilibrated, too long of a timestep, etc. etc. Could you share the scripts you used to generate these files, or describe more about your workflow that produces these?

Has Sage been parameterized to work with alchemical free energy calculations in non-water solvents?

I don't think anybody's run these calculations, but Sage was trained in large part to physical property data (density and heat of vaporization, I think) of bulk fluids like alcohols. I'd speculate it should be able to handle these systems, but that's just a guess. There are some known issues with a few rarer chemistries that were fixed in version 2.1 - I assume you're using 2.0 here.

Have you run these sorts of calculations with other force fields (GAFF/OPLS/etc.) to better or more stable results?

pablo-osmo commented 1 year ago

Thank you for the prompt response. I was able to use my same pipeline to calculate the free energy of solvation of methane in water successfully (following http://www.mdtutorials.com/gmx/free_energy/index.html, but with parameters from interchange's Sage 2.0), so that is why I was wondering if the switch to methanol as the solvent was the cause of error. That's why I asked about if Sage 2.0 should be able to work with intermediate lambda values for non-water solvents. It is a good idea to check this same set of simulations with GAFF/OPLS or other ff available through interchange first. I will do this and let you know the results. I will also look into if heat of vaporization was calculated for non-water with alchemical free energy integration to see if Sage should work with intermediate lambda values.

To explain the workflow to generate the files above: I use interchange to parameterize methanol and chf3 as single molecules from openff.toolkit.Molecule.from_smiles(), generate a conformer, then use interchange.to_gro() and interchange.to_top(). I then use packmol to create a large box of only methanol, minimize and equilibrate through 500 ps of npt using the C-rescale barostat. (simulation parameters attached below in GROMACS format). Then, I use gmx solvate to place a single chf3 inside the methanol box, and then minimize to hopefully do 500 ps of npt. This is where I am seeing the infinite pressures with reduced LJ on Sage 2.0. Note, all the simulations I ran with Sage 2.0 and full LJ with 0-1.0 times the coulomb interactions did not explode, so the pipeline seems to work generally, including when LJ and Coulomb are at their "full" values (so a normal simulation without any alchemical reduction of the forcefield). All timesteps were 2 fs with covalent bonds containing hydrogen constrained.

Since the simulation was failing at minimization once I placed the chf3 inside the "equilibrated" methanol box, there is no timestep when it explodes, and the pure methanol box seems pretty well equilibrated (figure attached, noting the experimental density of methanol is 792 kg/m³). I have about 750 methanol molecules and 1 chf3 molecule.

manual_npt_equil_mdp.txt ac41ece0-266c-4720-adbc-164356da7dfe

mattwthompson commented 1 year ago

Thanks for all of this detail - it's encouraging that your pipeline works for methane-in-water and for most of the lambda values in your halocarbon-in-ethanol case.

I'm still not sure where the issue is, so I've reached out to our force field scientists who may have more insights on either the expected behavior or if there's a different way to set up this free energy calculation

pablo-osmo commented 1 year ago

**I realized I needed to do: I found that I needed to do conda install -c conda-forge -c omnia foyer. and then

forcefield = FoyerForceField('/home/pablo/miniconda3/lib/python3.10/site-packages/foyer/forcefields/xml/oplsaa.xml') interchange = Interchange.from_foyer(topology=Topology.from_molecules([mol]), force_field=forcefield, box=cubic_box, sysname=system_name) interchange["vdW"].mixing_rule = 'lorentz-berthelot' interchange.positions = mol.conformers[0]

Okay, I am trying to use interchange to get the OPLS parameters for my system, and am getting that the oplsaa.offxml does not exist (full error below). I am using openff-interchange 0.3.6 :

I am trying to run:

mol = Molecule.from_smiles('C(F)(F)F', allow_undefined_stereo=True) mol.generate_conformers() mol.name = res_name forcefield = ForceField('oplsaa') print(forcefield) cubic_box = unit.Quantity(boxsize * np.eye(3), unit.angstrom)

if 'openff' in forcefield: interchange = Interchange.from_smirnoff(topology=[mol], force_field=forcefield, box=cubic_box, sysname=system_name) elif 'opls' in forcefield: interchange = Interchange.from_foyer(topology=[mol], force_field=forcefield, box=cubic_box, sysname=system_name)

My smirnoff99frosst, amber_ff_porst, and openforcefields/offxml do not have any opls files inside of them even though I am copying: https://github.com/openforcefield/openff-interchange/blob/0b9046564cf06299af55783864395e73ff0b5e80/docs/using/migrating.md?plain=1#L91) . How would I be able to get the oplsaa.offxml?

Traceback (most recent call last): File "/home/pablo/proj/hfc_opls/make_boxes.py", line 12, in gro_utils.get_openff_gro_files(cur_mol['smiles'], cur_mol['resname'], cur_mol['molname'], forcefield='oplsaa') File "/home/pablo/repos/utils/gro_utils.py", line 64, in get_openff_gro_files forcefield = ForceField(forcefield) File "/home/pablo/miniconda3/lib/python3.10/site-packages/openff/toolkit/typing/engines/smirnoff/forcefield.py", line 334, in init self.parse_sources(sources, allow_cosmetic_attributes=allow_cosmetic_attributes) File "/home/pablo/miniconda3/lib/python3.10/site-packages/openff/toolkit/typing/engines/smirnoff/forcefield.py", line 796, in parse_sources smirnoff_data = self.parse_smirnoff_from_source(source) File "/home/pablo/miniconda3/lib/python3.10/site-packages/openff/toolkit/typing/engines/smirnoff/forcefield.py", line 1055, in parse_smirnoff_from_source raise IOError(msg) OSError: Source 'oplsaa' could not be read. If this is a file, ensure that the path is correct. Looked in the following paths and found no files named 'oplsaa': /home/pablo/proj/hfc_opls /home/pablo/miniconda3/lib/python3.10/site-packages/smirnoff99frosst/offxml /home/pablo/miniconda3/lib/python3.10/site-packages/openff/amber_ff_ports/offxml /home/pablo/miniconda3/lib/python3.10/site-packages/openforcefields/offxml If 'oplsaa' is present as a file, ensure it is in a known SMIRNOFF encoding. Valid formats are: ['XML'] Parsing failed while trying to parse source as a file with the following exception and message: <class 'openff.toolkit.utils.exceptions.SMIRNOFFParseError'> syntax error: line 1, column 0

mattwthompson commented 1 year ago

There is no SMIRNOFF port of OPLS-AA, which uses typing logic that is not compatible with SMIRNOFF. There are probably other ways to access it, the only supported route in Interchange is via Foyer and Interchange.from_foyer. Note the difference between .offxml (SMIRNOFF format) and .xml (any XML file, commonly-used by OpenMM).

davidlmobley commented 1 year ago

I think I found your problem, so for the TL;DR version skip to the bottom. For the narrative, read the full thing.

As a note, what you're doing sounds rather similar to solvation free energy calculations we've done with OpenFF in OpenMM in parameterization of Sage, without problem (specifically, solvation free energies for chloro/florocarbons in various mixtures including ethanol). Additionally, my group routinely uses OpenFF in free energy calculations in GROMACS including for fluorinated/halogenated compounds, so I don't see what the problem would be/why there would be a problem.

I find it very concerning that you can't even energy minimize; in past experience, I can get almost every system I ever set up to energy minimize in GROMACS, even if something is very wrong. So I'd very much suspect you have something wrong with your system preparation or coordinates, rather than the force field. How are you generating starting coordinates? Is there any chance you could be jumbling up your coordinate list so that when you go to simulate, teh atoms which are supposed to be bonded to one another are in fact very distant from one another?

If there were a force field problem, I would expect you would be able to minimize and then you'd get a crash when you started to run dynamics. But still, these paths systems are more or less well-studied for us and if there were problems with our force fields for these in general we'd already know about it, I expect. And... I also can't even figure out how to answer the question "If I wanted to break the forcefield in a way to CAUSE these kinds of crashes, what would I do?" If you have Coulomb interactions already turned off, there should be no strong charge-charge interactions that could cause electrostatics-induced collapse. You'll have LJ, obviously, but as long as you're using appropriate soft core potentials there should be no chance of this causing collapse either.

TL/DR version:

I had a quick look at your mdp file and found this in the free energy section:

couple-lambda0 = vdw-q ; only van der Waals interactions couple-lambda1 = none ; turn off everything, in this case only vdW\

My reading of this is that your starting state has full vdw and charge interactions, as described by "vdw-q" and your final state has no interactions, as defined by "none", so you're doing a charge AND vdW transformation, despite the fact that your comments say "only van der Waals interactions" and "turn of everything, in this case only vdW". So what you're seeing is likely conventional electrostatics-induced collapse where two charge-bearing atoms end up sitting on top of one another and causing an infinity, resulting in a crash and core dump.

Marking this closed, but reopen if you think there's actually a force field or interchange issue here.

pablo-osmo commented 1 year ago

Thank you for the clear response and clarification! I was able to make the simulations work fully with this help. Turns out it is NOT a Sage problem or interchange problem.

I found out the error was that while my actual (from my coul_lambdas and vdw_lambdas) STATE0 contained 0 LJ and 0 coulomb, and my STATE1 contained full LJ and coulomb, my couple-lambda0/1 were switched (I had STATE0 with full vdw-q and STATE1 with none) as you noted above! I guess that somehow allowed coulomb to be present when there were no LJ, even though my coul_lambdas and vdw_lambdas were all correct.

Interestingly, I copied these scripts from Professor Lemkul's GROMACS tutorials, where I saw this same swap of couple-lambda0/1 http://www.mdtutorials.com/gmx/free_energy/index.html. Since the tutorial only performs the LJ portion of the free energy integration, I guess this flip of couple-lambda0 and couple-lambda1 doesn't have any issues with charge present without full LJ since there is no charge! Once I expanded the mdp file to include coulomb lambda points as well, the error emerged.

davidlmobley commented 1 year ago

You may want to raise an issue on the Lemkul tutorials issue tracker if something needs clarifying/fixing there, then.

openforcefield / openff-interchange

Simulation explodes during minimization when using Sage (openff-2.0.0.offxml) at intermediate lambda values for ethanol as solvent #761

TL/DR version: