Closed j-wags closed 3 years ago
I like the way you're thinking here. I have no other insights at this point though.
Linking https://github.com/openforcefield/openff-forcefields/issues/34 -- Basically, rounding errors are being noticed in production workflows. I suggest we prioritize resolving this.
following the issue, and looking forward to a solution! I am getting some non-integer charges with am1bcc, for instance, 0.003.
@jchodera @mattwthompson @davidlmobley @SimonBoothroyd Any objection to implementing something like this option:
kwarg to create_openmm_system: do_not_round_charges=[molecules] or custom_charge_rounding={molecule:precision}
The one downside I see here is that it would put some aspect of FF application outside of the force field (instead, the use of non-default rounding precisions would only be stored in the parameterization code).
Would this raise a blocking objection for any of you?
ParmEd implemented the approach I suggested (in other words, I'm partly responsible for what ParmEd does) but it was suggested with only small molecules in mind.
Still, it's clear to me that we want systems with integer charges for essentially all applications, so we need something at least SIMILAR to what ParmEd does. Specifically, if we do NOT have integer charges, the consequences for free energy calculations are potentially dire -- e.g. one perturbs a system with a zero formal charge (but "real" formal charge of +0.003) to one that also has zero charge (but real charge of -0.002) and picks up a very significant free energy change from the associated artifact.
Big picture: We really, really want systems which are neutral for good performance and ought to do this well by default (by distributing charge a bit ourselves) and if users need a rope to hang themsleves with we can provide one but only with ample warnings.
No blocking objection from me! My main question is if we'd want this to be on or off by default. I have no strong feeling on the trade-offs.
I agree that the <Electrostatics>
tag is not the place for this, but on the basis that it is purely an implementation detail borne by the behavior of external libraries, not something that a scientist would use to design a force field.
I can't see any way to avoid implementing this in create_openmm_system
in order to resolve this issue this issue in a timely basis. That doesn't prevent something like fix_charges
from happening downstream, and it may be useful in some cases to allow it to happen later instead of or addition to inside of ElectrostaticsHandler
. I had some scattered thoughts about which function it should live inside of - create_openmm_system
or each handler?) but I'll leave that to the implementation.
As a half-measure, prior to biopolymer support, this implementation could set an upper limit on how much residual charge can be chucked onto an arbitrary atom, i.e. error out if it's trying to modify the first atom's charge by > 0.05. This would allow the 0.003 case to be rounded away but should safeguard against most invalid physics.
Overcharging one atom on a biopolymer sounds like a good "yeah, let's make sure we support this case" for designing the biopolymer infrastructure. Are there other similar cases to consider?
Picky: I would prefer something like round_charges= ...
over custom_charge_rounding= ...
since (correct me if I'm wrong) there is not a default per-molecule rounding that is applied if the argument is not specified.
The one downside I see here is that it would put some aspect of FF application outside of the force field (instead, the use of non-default rounding precisions would only be stored in the parameterization code).
I think that train has already left the station - charge_from_molecules
and partial_bond_orders_from_molecules
exist and are currently in conflict with this idea. I do think we should try as much as possible to keep the details in the force field where possible, but I don't think that's a good solution for this problem. Would the design here would have charge rounding off by default?
Specifically, if we do NOT have integer charges, the consequences for free energy calculations are potentially dire
I hadn't thought about that -- Great point.
My main question is if we'd want this to be on or off by default
Given Mobley's answer above, I think the default should be "on".
I can't see any way to avoid implementing this in create_openmm_system
I'm not sure I'd like it in create_openmm_system
, because then the logic couldn't discriminate between charge sources. I kinda think that a more natural place for the implementation would seem to be in either ToolkitAM1BCCHandler
or ElectrostaticsHandler
. Both of those will have access to the context for where the numbers are coming from. I'd approve a PR with it in either of those places.
Generally, I don't like the "put all the leftover charge on the first atom" scheme. I'd prefer to spread it out over all atoms at some high precision. That way, people don't hit the rounded-charge-accumulation problem if they go straight to OpenMM, and if folks export to another format, the exporting library can do its own rounding then.
I do think we should try as much as possible to keep the details in the force field where possible, but I don't think that's a good solution for this problem.
One other variable is that we can change the meaning of ToolkitAM1BCC
in the SMIRNOFF spec to record the rounding method. We'll probably need to add some new fields to the tag to specify how we handle ELF settings anyway for that series of new features (eg #793). So we could add that on to the changes for a 0.4 ToolkitAM1BCC
tag.
That way, people don't hit the rounded-charge-accumulation problem if they go straight to OpenMM, and if folks export to another format, the exporting library can do its own rounding then.
Perfect - this sounds like the best way to keep the handlers close to ground truth while maintaining compatibility with engines where necessary.
This was noted by a user again today. This should be prioritized for the next release.
Is your feature request related to a problem? Please describe.
Both of these issues can lead to rounding errors that become significant in larger molecules.
Describe the solution you'd like ParmEd implemented this approach, which rounds existing charges to the desired precision, and then adds the sum of the rounding errors to the first particle to achieve the correct net charge on each residue. This may be tricky for OFFTK, since it has no concept of "Residues", thus a large biopolymer could require a substantial correction, which should not be dumped on a first atom (for example, if 1000 residues each contributed 0.001e of rounding error, then the first atom might end up getting +1 added to its charge).
It seems likely that we could apply the same solution, but just with a slightly more complex algorithm to spread out the final charge correction over several atoms. It's not clear to me that this is a question with an "automagic" solution -- pathological cases could lead us to introducing errors with this functionality (for example, accumulating a large error given a biopolymer, or adding a dipole to a symmetric system).
We'd want to figure out how this would interface with our current functionality. Basically, we should allow
Further, we should figure out where this would live in our current API. Some possibilities include
do_not_round_charges=[molecules]
orcustom_charge_rounding={molecule:precision}
ElectrostaticsHandler
(but this prevents us from handling different molecules individually). So, for example,<Electrostatics version="0.3" method="PME" scale12="0.0" scale13="0.0" scale14="0.833333" cutoff="9.0 * angstrom" rounding_precision="4"/>
System
class, which would know which simulation format it is exporting to, and thus which precision to use.I'm most in favor of the first and third options, since the second would require a change to the SMIRNOFF spec and disallow treating molecules differently. The first option is the only one that is immediately available.
Additional context In the LibraryCharges PR that was already merged (#433), we added a check that the sum of a molecule's partial charges must be within 0.01 of the molecule's formal charge. This can be overridden by a kwarg (
allow_nonintegral_charges=True
).