selimsami / qforce

Apache License 2.0
57 stars 13 forks source link

Terminal hydrogens replaced with oxygen in GROMACS MD runs #70

Open mattiafelice-palermo opened 5 months ago

mattiafelice-palermo commented 5 months ago

Hello Selim,

I hope you had nice new year holidays!

I'm writing to report a possible bug, which I suspect it's preventing me from completing a parametrization for an agarose dimer. I have generated the PES for all the required dihedral (16 terms) using the fragment approach. The PESs I have generated should be very accurate due to a particular workflow I have developed built on a combination of CREST + TorsionDrive (more on that later).

Unfortunately, the fitting procedure returns very poor results for several of the dihedrals. To troubleshoot this, I've given a look at the GROMACS minimization folders generated by Q-Force, and noticed something strange. All of the structures of the fragments have been modified and now feature a variable number of Oxygens connected to a Carbon with single bond in place of the Hydrogens present in the original structure. Here is an example from one of the fragments:

image

I suspect this leads to incorrect atom types and therefore the failure of the various dihedral parametrization effort. I wanted to ask for your support in troubleshooting this issue. Maybe I'm doing something wrong in the process and I'm not aware of it. I'm attaching a Minimum Reproducible Example (MRE.zip) for further investigation.

If the data (PESs, structures, MD folders, etc) for all the dihedrals are needed for troubleshooting, I can of course share it.

Also, I would like to share with you the results of a workflow I have devised that combines CREST and TorsionDrive to perform accurate dihedral PES even in challenging cases like the agarose molecule. Maybe it could be an interesting feature to add to Q-Force, although I should mention that the approach is quite computationally demanding, and would require adding several dependencies. Nevertheless, if you're interested I can share some results by email from my experimentation.

Thank you as always!

Mattia

mattiafelice-palermo commented 5 months ago

And update that I think it supports my hypothesis on the ongoing issue. In the following document you can find the PES profiles, the fits and a depiction of the geometries found in the folders used to run GROMACS minimization by Q-Force. In the depictions I've highlighted in blue the atoms that are part of the dihedral being scanned, and drawn an orange area around the unexpected Oxygen/Carbon atoms. It can be seen that if those "weird" atoms are close to the dihedral being scanned, then the fit is extremely noisy. On the other hand, dihedral rotations that are not influenced by those atoms (e.g. bridge between the two heterocycles, or hydroxyls far away from the unexpected atoms) return a nearly perfect fit. I hope this helps!

Diagaroses dihedral fit.pdf

selimsami commented 5 months ago

Hi Mattia,

Happy new year and thanks for the detailed report.

If you look closely at the GROMACS .itp file, you will see that while indeed the capping atoms still have the original atom names (which is what VMD uses to visualize), they do have updated atom types, for example if you look at those oxygens, you will see that they all have atom type "opls140" which is an aliphatic hydrogen.

The issue you're having is something we and other users have reported in the past though. It's not a bug, but it's a deficiency/difficulty in relaxed scans. Think of it like this - we're using all the fancy algorithms like torsiondrive (and now you use CREST) for the QM scans - but what about the MM scans? they're still basic relaxed scans - so 1) the comparison becomes unfair; 2) MM scans will still suffer the negative effects of having a single relaxed scan.

We are actually very close now to a big update where; 1) you will be able use torsiondrive for MM scans as well and 2) you will be able to combine torsiondrive-xTB with single point QM calculations afterwards to obtain good energies and fragment atomic charges. I except this to be ready in the next few weeks and there's a good chance that it might help with your issue.

There's still the additional problem with molecules like yours the very strong intra-molecular hydrogen bonding. One part of the problem is the jumps these bonds introduces in relaxed scans - which should be fixed by the above method, but the other issue is the inaccuracy of these FFs to describe intra-molecular h-bonds, which then creeps into the dihedral fitting - this is beyond the scope of qforce though, at least now. But eventually it might make sense to re-parameterize also the intramolecular non-bonded LJ parameters. Which could be even more possible with all the different configurations we compute and can use to fit with addition of torsiondrive and possibly CREST. But this would be a big project!

Regarding using CREST with qforce, I think this could be an interesting addition. I would be very interested to see how your QM dihedral scans have changed using torsiondrive with and without CREST for agarose. So please do email me those if you get the chance and we can discuss further there.

Cheers, Selim

mattiafelice-palermo commented 5 months ago

Hello Selim,

thank you for the quick and thorough response! I apologize I didn't catch the previous issue #35 before posting, I read the closed issue a while ago and for some reason my mind didn't register it. Also, thank you for the updates on the new functionalities: the addition of the new features sounds great, I'm very curious to test them out!

Regarding the difficulty in obtaining the PES from MD, I understand your points. Nevertheless I didn't expect this to lead to errors of tens of kcal/mols as in this profile for the fragment CO_H12C6O5_15e9469f81604f8a2e4a5949218fc658~1:

In regards to obtaining the correct MM PES, I was wondering if you have experimented with the GROMACS COLVARS module? The PMF approach is straightforward and typically gives extremely good (albeit relatively expensive) PES for MM. For example, I have done a few rounds of fitting on the same fragments from above (with OPLS parameters and the studied dihedral turned off in the first scan) where I probed its PES in a box with Argon for improved thermalization and PES exploration:

As you can see, there are definitely no spikes or discontinuities. The only limit here would be the effectiveness of the iterative fitting process (which seems to be great in Q-Force) and my ability to do good fits, which can certainly be compensated by an automated procedure ;) I might be wrong, but I have the feeling that with this approach we could get very accurate potentials without spikes and discontinuities.

If you think this could be a good approach, I would be interested in discussing it further and help with the implementation. Maybe this could help in figuring out if we are looking just at the poor performance of the current relaxed scans or, as you mentioned, also at other factors such as the poor modeling of intramolecular hydrogen bonding and other inaccuracies of the force field.

Regarding the CREST+TorsionDrive effectiveness vs TorsionDrive, I will send you the material at your Berkeley email address :) I have performed the "vanilla" TorsionDrive only on one dihedral and the results were quite definitive, but as you mentioned I will also do it on the other remaining 15 to have the complete picture.

Mattia

selimsami commented 5 months ago

Your results look interesting, thanks for that. I am bit puzzled why the fits are so bad though.

I have a feeling that the MM torsiondrive will already get rid of these spikes that you have. So perhaps it would be beneficial to wait a few weeks till the update is ready, see if this improves your situation, then we can talk about whether additional steps are needed. Otherwise we might be we working on two different solutions to the same problem. And torsiondrive definitely seems like the computationally efficient option.

Addition of CREST to QM (or MM) scans sounds interesting and is worth testing regardless. So let's indeed talk about that further.

mattiafelice-palermo commented 5 months ago

Sounds like a wise plan! We can take advantage of manually executed ABF runs as a method to check whether the MM torsiondrive has indeed converged to the lowest energy PES.

For reference, I have completed the fit for the dihedral above, and it is indeed possible to achieve a good fit with more iterations, considering the other parameters are not tailored to agarose:

If you think this agarose molecule could be a good test bench for the new MM Torsiondrive, I'll be happy to beta test or share with you the data.

I'll reach out when the full set of Torsiondrive scans are completed. Thanks for the feedback :)

Mattia

selimsami commented 5 months ago

That does look a lot better!

Just curious, there's still some kinks in the QM scan between -60 to -10 degree range. Is that an artifact or is there a reason for that based on your molecule's symmetry?

Agarose does sound like a good stress test for the new MM torsiondrive. Would be good to test it there. Will update you when it's ready.

mattiafelice-palermo commented 5 months ago

By inspecting the molecule, it seems that upon the rotation of the hydroxyl, there is a rapid succession of different "low energy" conformations of the galactose unit in that dihedral window. In particular, going from left to right, it goes chair_1 --> boat --> chair_2 --> chair_1. I'm attaching the geometries corresponding to the scan, aligned to the four central atoms of heterocycle in order to facilitate the visualization of the chair/boat conformations.

I think that, due to the asymmetric geometry of the molecule, depending on the rotation angle of the scanned hydroxyl, the hydrogen network may tip the balance towards the stability of one or the other of these conformations. I imagine the various "quirks" in that area as a superposition of different potential wells corresponding to different correlated dihedrals, with very similar minima - just a hunch!

It's indeed quite an interesting molecule! scan_geometries.zip

selimsami commented 5 months ago

Thanks for the explanation. Everything looks extremely coupled there! I wonder if there exists a dihedral potential that will give you a reasonable PES or whether you'd have to start considering dihedral coupling terms, 2-dimensional CMAP term, or reparametrized LJ parameters. I guess it depends also on what are your accuracy requirements.

mattiafelice-palermo commented 5 months ago

Yes, I was thinking the same. At least a quick 2D TorsionDrive screening with xtb to check if there's any correlation between the different dihedral pairs could give me an idea of whether a more accurate scan is required :)

Talk to you as soon as the TorsionDrive scans are ready!