Closed xiki-tempula closed 1 year ago
This is a result of two things. Firstly the triclinic lattice reduction used to ensure that the tricliic box vectors satisfy the constraints of the engine, as specified here. Secondly, because resulting cell matrix is written to file in fixed precision, when read back in, you don't perfectly recover the original lattice vectors. For example, your box information is translated into the following within the GROMACS gro87 file:
4.00000 4.00000 2.82843 0.00000 0.00000 0.00000 0.00000 2.00000 2.00000
To fix this I guess we will apply an appropriate rounding on read.
I'm not sure whether to solve this in BioSimSpace, or add enforce this with the PeriodicBox
and TriclinicBox
classes within Sire. @chryswoods: Do you have any preference? I'll transfer this over to Sire as necessary.
I am a little nervous of being "clever" and fixing this in PeriodicBox
and TriclinicBox
directly. This rounding is only an issue because the file has been read from a fixed-format file. I would be worried applying rounding to a values read from a binary-format file which were exactly as they had been written by the program. This would trigger similar errors if file we wrote with rounded dimensions were mixed with files written without rounding.
I think you are right that this would only be the code in TriclinicBox
that calculated the lattice vectors/angles based on input, and we could round those calculated values. This is likely to be safe if we detect "standard" boxes, e.g. 120 degrees etc.
Thinking as I am writing - yes, transfer it to sire and I think the fix is rounding when lattice vectors/angles are calculated within TriclinicBox
. There won't be a need to do anything in PeriodicBox
, as this only has box dimensions and no angles.
Thanks, I agree with what you say. I'd just like to understand/debug the issue posted here a little further before we fix anything in the TriclinicBox
class since, given the same input, you should get the same output once the lattice reduction has been applied. I am seeing this locally, i.e. the values are different to those in the input file (as I would expect) but they are always the same on repeated reads and writes. This doesn't seem to be the case for the replica setup, so I'd like to understand why.
Okay, I understand the problem in the other issue thread. I'll transfer to Sire and am happy to work with you to fix the issue. My main concern is making sure that whatever fix we apply still generates a valid box for the other engines that we support, i.e. we don't just arbitrarily enforce certain box angles if this would break the internal box constraints used by, e.g. GROMACS.
Thanks @lohedges and @chryswoods, it would be great if we could also have some temporary quick workaround in the current version of Sire or BSS as well (by "current" I mean 2023.1.x), as this bug means we can't realiably run dodecahedral REMD in our current release and somehow that didn't use to be an issue before (not sure why).
Yes, it's strange that the issue hadn't occurred before. I'm not sure how you are setting things up, so perhaps this isn't possible if the replicas are initialised from different Python processes but you could do something like the following to set the box angles to the same value for all replicas:
# Here replicas is a list of the systems, one for each replica.
# Get the box dimensions and angles for the first replica.
box0, angles0 = replicas[0].getBox()
# Apply the same angles to each other replica, keeping the box dimensions.
for replica in replicas[1:]
replica.setBox(replica.getBox()[0], angles0)
For example, using the input that @xiki-tempula posted in this thread:
import BioSimSpace as BSS
mol_0 = BSS.IO.readMolecules(['lambda_0.0000/amber.prm7', 'lambda_0.0000/amber.crd'])
mol_1 = BSS.IO.readMolecules(['lambda_1.0000/amber.prm7', 'lambda_1.0000/amber.crd'])
# Initial boxes.
mol_0.getBox()
([81.5793 A, 81.5793 A, 81.5793 A],
[60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
mol_1.getBox()
([81.4219 A, 81.4219 A, 81.4218 A],
[120.0000 degrees, 120.0000 degrees, 90.0000 degrees])
# Update the angles for the second molecule (or first if you want it that way around)
mol_1.setBox(mol_1.getBox()[0], mol_0.getBox()[1])
# Updated box.
mol_1.getBox()
([81.4219 A, 81.4219 A, 81.4218 A],
[60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
# Save.
BSS.IO.saveMolecules("test", mol_1, "rst7")
Now:
tail -n 1 test.rst7
81.4219024 81.4219024 81.4218459 59.9999771 59.9999771 90.0000000
Note that with the above approach it's still creating a new TriclinicBox
space when updating the angles, so the same lattice reduction is being performed. In this case it does look like the angles are preserved, but it might not be true in general.
So I have two amber RST7 file. amber_0.rst7: 120.0000229 120.0000229 90.0000000 amber_1.rst7: 120.0000229 120.0000229 90.0000000
So my understanding is that if you want to make sure that these two boxes will have the same angle when you write it out, you do:
mol_1 = BSS.IO.readMolecules(['amber.prm7', 'amber_0.rst7']) mol_2 = BSS.IO.readMolecules(['amber.prm7', 'amber_1.rst7']) mol_1.setBox(mol_1.getBox()[0], mol_2.getBox()[1]) mol_1.getBox() BSS.IO.saveMolecules('mol_new', mol_1, 'RST7') Archive 2.zip
Gives 59.9999771 59.9999771 90.0000000
Yes, in this case the angle is changed, but when saving both molecules you get the same result in the output file:
In [1]: import BioSimSpace as BSS
In [2] mol_1 = BSS.IO.readMolecules(['amber.prm7', 'amber_0.rst7'])
In [3]: mol_2 = BSS.IO.readMolecules(['amber.prm7', 'amber_1.rst7'])
In [4]: mol_1.getBox()
Out[4]:
([81.4219 A, 81.4219 A, 81.4218 A],
[120.0000 degrees, 120.0000 degrees, 90.0000 degrees])
In [5]: mol_2.getBox()
Out[5]:
([81.5793 A, 81.5793 A, 81.5793 A],
[60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
In [6]: mol_1.setBox(mol_1.getBox()[0], mol_2.getBox()[1])
mol_1.getBox()
Out[7]:
([81.4219 A, 81.4219 A, 81.4218 A],
[60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
In [8]: BSS.IO.saveMolecules("test1", mol_1, "rst7")
Out[8]: ['/home/lester/Downloads/Archive.2/test1.rst7']
In [9]: BSS.IO.saveMolecules("test2", mol_2, "rst7")
Out[9]: ['/home/lester/Downloads/Archive.2/test2.rst7']
Check the output:
tail -n1 test?.rst7
==> test1.rst7 <==
81.4219024 81.4219024 81.4218459 59.9999771 59.9999771 90.0000000
==> test2.rst7 <==
81.5793091 81.5793091 81.5792526 59.9999771 59.9999771 90.0000000
If you update the angles the other way round, i.e. apply mol_1
's angles to mol_2
, then you'd get:
==> test111.rst7 <==
81.4219024 81.4219024 81.4218459 120.0000229 120.0000229 90.0000000
==> test222.rst7 <==
81.5793091 81.5793091 81.5792526 120.0000229 120.0000229 90.0000000
Instead, if you just read then write the original molecules without editing the angles, then the output is different:
import BioSimpace as BSS
mol_1 = BSS.IO.readMolecules(['amber.prm7', 'amber_0.rst7'])
mol_2 = BSS.IO.readMolecules(['amber.prm7', 'amber_1.rst7'])
BSS.IO.saveMolecules("test11", mol_1, "rst7")
BSS.IO.saveMolecules("test22", mol_2, "rst7")
Again, check the output:
tail -n1 test??.rst7
==> test11.rst7 <==
81.4219024 81.4219024 81.4218459 120.0000229 120.0000229 90.0000000
==> test22.rst7 <==
81.5793091 81.5793091 81.5792526 59.9999771 59.9999771 90.0000000
So, this time the box angles are different.
All I'm saying is that I wouldn't want to guarantee that the above would be true for all input, since it is still creating a new TriclinicBox
space on editing the angle, so there could still be some transformation based on numerical precision. It might be possible to to tests the above with a wide range of input to see how robust it is, e.g. a bunch of boxes with the same angles, but different dimensions, i.e. due to NPT.
Just a note on this:
...by "current" I mean 2023.1.x
Our new release approach backports fixes to the previous release only, i.e. 2023.2 at present. The current version of Sire in main
is 2023.2.2, and for BioSimSpace it is 2023.2.1. It seems like this approach won't fully work for you, no? We don't really have the bandwidth to backport fixes to multiple versions, but might be able to apply specific fixes for limited platforms.
Having looked at things a little more, I think the angle update is preserved since everything is in full double precision. The issue only every occurs on read, due to the fixed-precision nature of the AMBER rst7 file.
Posting here for record. This is a numerical imprecision error caused by us not storing the lattice angles in the TriclinicBox
class. Instead, we store the lattice vectors. Conversion from lattice angles to lattice vectors causes a loss of precision. Then back-conversion from lattice vectors to lattice angles causes more loss of precision. Repeated loading and saving of TriclinicBox
using lattice angles causes instability, e.g. using the lattice angles of the example above;
In [1]: import sire as sr
In [2]: sr.use_new_api()
In [3]: box = sr.vol.TriclinicBox(1, 1, 1, 120.0000229*sr.units.degrees, 120.0000229*sr.units.degrees, 90.0000000*sr.units.degrees)
In [4]: for i in range(0, 10):
..: print(box.alpha(), box.beta(), box.gamma())
..: box = sr.vol.TriclinicBox(1, 1, 1,
..: box.alpha() * sr.units.degrees,
..: box.beta() * sr.units.degrees,
..: box.gamma() * sr.units.degrees)
..:
59.99999999999207 59.99999999999207 90.0
120.00000000000001 120.00000000000001 90.0
59.99999999999999 59.99999999999999 90.0
120.00000000000001 120.00000000000001 90.0
59.99999999999999 59.99999999999999 90.0
120.00000000000001 120.00000000000001 90.0
59.99999999999999 59.99999999999999 90.0
120.00000000000001 120.00000000000001 90.0
59.99999999999999 59.99999999999999 90.0
120.00000000000001 120.00000000000001 90.0
This is a bug, as while the two different sets of lattice angles are equivalent, it is definitely surprising that they are changing. We should find a way to store the original input lattice angles (in degrees) and make sure that we return these from the .alpha()
, .beta()
and .gamma()
functions. We could possibly also look to detect special case angles, e.g. multiples of 30 degrees.
While I see what you're saying, I don't think it's quite as simple as this. There are two ways to construct a TrinclicBox
. One uses the box dimensions and angles, so the angles are known as input. The other uses the box vectors, so the angles must be computed internally. It is the second approach that will be the common pathway since the initial box will likely be generated by GROMACS, so will be written as a cell matrix. If you use the above code with known integer angles, then the result is (unsurprisingly) stable:
import sire as sr
sr.use_new_api()
box = sr.vol.TriclinicBox(
1, 1, 1, 120 * sr.units.degrees, 120 * sr.units.degrees, 90 * sr.units.degrees
)
for i in range(0, 10):
print(box.alpha(), box.beta(), box.gamma())
box = sr.vol.TriclinicBox(
1,
1,
1,
box.alpha() * sr.units.degrees,
box.beta() * sr.units.degrees,
box.gamma() * sr.units.degrees,
)
Gives:
119.99999999999999 119.99999999999999 90.0
119.99999999999999 119.99999999999999 90.0
119.99999999999999 119.99999999999999 90.0
119.99999999999999 119.99999999999999 90.0
119.99999999999999 119.99999999999999 90.0
119.99999999999999 119.99999999999999 90.0
119.99999999999999 119.99999999999999 90.0
119.99999999999999 119.99999999999999 90.0
119.99999999999999 119.99999999999999 90.0
119.99999999999999 119.99999999999999 90.0
Which would be written as 120.0000000 120.0000000 90.0000000 in AMBER format, which is what we want. Interestingly, if you use the box vectors instead, then it is stable, but has still been rotated:
import sire as sr
sr.use_new_api()
box = sr.vol.TriclinicBox(
1,
1,
1,
120.0000229 * sr.units.degrees,
120.0000229 * sr.units.degrees,
90.0000000 * sr.units.degrees,
)
for i in range(0, 10):
print(box.alpha(), box.beta(), box.gamma())
box = sr.vol.TriclinicBox(box.vector0(), box.vector1(), box.vector2())
Gives:
59.99999999999207 59.99999999999207 90.0
59.99999999999207 59.99999999999207 90.0
59.99999999999207 59.99999999999207 90.0
59.99999999999207 59.99999999999207 90.0
59.99999999999207 59.99999999999207 90.0
59.99999999999207 59.99999999999207 90.0
59.99999999999207 59.99999999999207 90.0
59.99999999999207 59.99999999999207 90.0
59.99999999999207 59.99999999999207 90.0
59.99999999999207 59.99999999999207 90.0
To perform a solvation for a similar box we could do:
import BioSimSpace as BSS
# Load a test system.
system = BSS.IO.readMolecules(BSS.IO.expand(BSS.tutorialUrl(), ["ala.crd", "ala.top"]))
# Generate the box dimensions and angles.
box, angles = BSS.Box.rhombicDodecahedronSquare(3*BSS.Units.Length.nanometer)
# Solvate the first molecule in this box.
solvated = BSS.Solvent.tip3p(system[0], box=box, angles=angles, work_dir="sol")
# Check the box.
solvated.getBox()
([30.0000 A, 30.0000 A, 30.0000 A],
[120.0000 degrees, 120.0000 degrees, 90.0000 degrees])
Looking at the sol/solvated.gro
file, you can see the (fixed-precision) triclinic cell matrix info written by gmx editconf
is this:
3.00000 3.00000 2.12132 0.00000 0.00000 0.00000 0.00000 1.50000 1.50000
This is where we begin to lose precision. Writing the solvated system to AMBER rst7 format gives:
BSS.IO.saveMolecules("test", solvated, ["rst7", "prm7"])
tail -n1 test.rst7
30.0000000 30.0000000 29.9999976 120.0000027 120.0000027 90.0000000
Reading back in and re-writing:
new_system = BSS.IO.readMolecules("test.*7")
new_system.getBox()
([30.0000 A, 30.0000 A, 30.0000 A],
[60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
Now the angles have flipped. In this case the TriclinicBox
was generated using the angles, but the ones that are returned by the .alpha()
, .beta()
, and .gamma()
methods of the space give the angles after the (potential) rotation and lattice reduction are performed.
We do need to do the lattice reduction to get things to work consistently across engines and we also need to be careful about rounding since it can cause blow-ups in simulation. (This has occured previously in GPU simulations with OpenMM.)
I'll have a think about detecting standard box shapes and preventing the rotation if it only happens due to numerical precision. I'll also take another look at the lattice reduction code since it might be possible to make some simple modifications to make it more stable, i.e. not rotating if within some tolerance.
Oh, and as you say, it seems like there are two sources of numerical imprecision here (other than the fixed-format files), i.e. the conversion from box vectors to angles and the lattice reduction itself. And I definitely agree that it is undesirable that the box rotates as a result of this, and is something that we should look to fix.
You are right - it is so subtle! There's a lot of hidden complexity in this class, especially working across engines (which I'd forgotten to consider). It would help if the lattice angles that were put into the constructor were returned unaltered when the class is queried. Maybe there could be some logic that compares the input lattice angles to the calculated angles and checks if they have drifted too far away? It could be a sanity check to test for rotation or rounding errors? I know this is difficult as it is angles - I remember my pain when there was so much subtlety in the dihedral angle code...
You can get it to be stable and give the values that we want by using FE_TOWARDZERO
and rounding using std::nearbyint
instead of std::round
, i.e:
diff --git a/corelib/src/libs/SireVol/triclinicbox.cpp b/corelib/src/libs/SireVol/triclinicbox.cpp
index 8632d7d7..a603148e 100644
--- a/corelib/src/libs/SireVol/triclinicbox.cpp
+++ b/corelib/src/libs/SireVol/triclinicbox.cpp
@@ -29,6 +29,7 @@
#include <QList>
#include <QPair>
+#include <cfenv>
#include <cmath>
#include <limits>
@@ -121,6 +122,7 @@ TriclinicBox::TriclinicBox(double a, double b, double c, const SireUnits::Dimens
{
// Adapted from:
// https://github.com/rosswhitfield/ase/blob/master/ase/lattice/triclinic.py
+ std::fesetround(FE_TOWARDZERO);
auto cosa = cos(alpha.value());
auto cosb = cos(beta.value());
@@ -227,9 +229,9 @@ void TriclinicBox::construct(const Vector &v0, const Vector &v1, const Vector &v
*/
// Perform the reduction.
- auto new_v1 = this->v1 - std::round(this->v1.x() / this->v0.x()) * this->v0;
- auto tmp = this->v2 - std::round(this->v2.y() / this->v1.y()) * this->v1;
- auto new_v2 = tmp - std::round(tmp.x() / this->v0.x()) * this->v0;
+ auto new_v1 = this->v1 - std::nearbyint(this->v1.x() / this->v0.x()) * this->v0;
+ auto tmp = this->v2 - std::nearbyint(this->v2.y() / this->v1.y()) * this->v1;
+ auto new_v2 = tmp - std::nearbyint(tmp.x() / this->v0.x()) * this->v0;
Now I get:
import sire as sr
sr.use_new_api()
box = sr.vol.TriclinicBox(
1,
1,
1,
120.0000229 * sr.units.degrees,
120.0000229 * sr.units.degrees,
90.0000000 * sr.units.degrees,
)
for i in range(0, 20):
print(box.alpha(), box.beta(), box.gamma())
box = sr.vol.TriclinicBox(
1,
1,
1,
box.alpha() * sr.units.degrees,
box.beta() * sr.units.degrees,
box.gamma() * sr.units.degrees,
)
120.00002289999998 120.0000229 89.99999999999999
120.00002289999992 120.00002289999998 89.99999999999997
120.00002289999988 120.00002289999995 89.99999999999994
120.00002289999982 120.00002289999992 89.99999999999991
120.00002289999978 120.0000228999999 89.99999999999989
120.00002289999975 120.0000228999999 89.99999999999987
120.00002289999969 120.0000228999999 89.99999999999986
120.00002289999965 120.0000228999999 89.99999999999983
120.00002289999962 120.0000228999999 89.9999999999998
120.00002289999956 120.0000228999999 89.99999999999977
120.00002289999952 120.0000228999999 89.99999999999974
120.00002289999946 120.0000228999999 89.99999999999973
120.00002289999942 120.0000228999999 89.9999999999997
120.0000228999994 120.0000228999999 89.99999999999967
120.00002289999934 120.0000228999999 89.99999999999964
120.0000228999993 120.0000228999999 89.99999999999963
120.00002289999927 120.0000228999999 89.99999999999962
120.00002289999921 120.0000228999999 89.99999999999959
120.00002289999917 120.0000228999999 89.99999999999956
120.00002289999914 120.0000228999999 89.99999999999953
And:
import sire as sr
sr.use_new_api()
box = sr.vol.TriclinicBox(
1,
1,
1,
120.0000229 * sr.units.degrees,
120.0000229 * sr.units.degrees,
90.0000000 * sr.units.degrees,
)
for i in range(0, 20):
print(box.alpha(), box.beta(), box.gamma())
box = sr.vol.TriclinicBox(box.vector0(), box.vector1(), box.vector2())
120.00002289999998 120.0000229 89.99999999999999
120.00002289999998 120.0000229 89.99999999999999
120.00002289999998 120.0000229 89.99999999999999
120.00002289999998 120.0000229 89.99999999999999
120.00002289999995 120.0000229 89.99999999999999
120.00002289999995 120.00002289999998 89.99999999999999
120.00002289999995 120.00002289999998 89.99999999999999
120.00002289999995 120.00002289999998 89.99999999999999
120.00002289999995 120.00002289999998 89.99999999999999
120.00002289999995 120.00002289999998 89.99999999999999
120.00002289999995 120.00002289999998 89.99999999999999
120.00002289999992 120.00002289999998 89.99999999999999
120.00002289999992 120.00002289999998 89.99999999999999
120.00002289999992 120.00002289999995 89.99999999999999
120.00002289999992 120.00002289999995 89.99999999999999
120.00002289999992 120.00002289999995 89.99999999999999
120.00002289999992 120.00002289999995 89.99999999999999
120.00002289999992 120.00002289999995 89.99999999999999
120.00002289999992 120.00002289999992 89.99999999999999
120.00002289999992 120.00002289999992 89.99999999999999
I'm not yet sure if there are unintended consequences of doing this. I'll run some tests tomorrow and report back.
It is also completely stable on repeated read-write to both AMBER and GROMACS format:
In [1]: import BioSimSpace as BSS
INFO:numexpr.utils:Note: NumExpr detected 20 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
INFO:numexpr.utils:NumExpr defaulting to 8 threads.
WARNING:root:Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.
In [2]: system = BSS.IO.readMolecules(BSS.IO.expand(BSS.tutorialUrl(), ["ala.crd", "ala.top"]))
Downloading from 'https://biosimspace.openbiosim.org/m/ala.crd'...
Unzipping '/tmp/tmp0ydau7r5/ala.crd.bz2'...
Downloading from 'https://biosimspace.openbiosim.org/m/ala.top'...
Unzipping '/tmp/tmp0ydau7r5/ala.top.bz2'...
In [3]: box, angles = BSS.Box.rhombicDodecahedronSquare(3*BSS.Units.Length.nanometer)
In [4]: solvated = BSS.Solvent.tip3p(system[0], box=box, angles=angles)
In [5]: solvated.getBox()
Out[5]:
([30.0000 A, 30.0000 A, 30.0000 A],
[60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
In [6]: s = solvated
In [7]: for x in range(0, 20):
...: BSS.IO.saveMolecules("test", s, ["prm7", "rst7"])
...: s = BSS.IO.readMolecules("test.*7")
...: print(s.getBox())
...:
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
In [8]: for x in range(0, 20):
...: BSS.IO.saveMolecules("test", s, ["grotop", "gro87"])
...: s = BSS.IO.readMolecules(["test.top", "test.gro"])
...: print(s.getBox())
...:
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
([30.0000 A, 30.0000 A, 30.0000 A], [60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
Note here that the initial box angles are still rotated when reading the box that was created by gmx editconf
, so it still would be desirable to prevent this, i.e. using a particular representation of "standard" boxes as at default.
Here's a previous Sire issue thread discussing problems caused by triclinic lattice representations and reductions with OpenMM and SOMD. I'll go through some of the test input there and make sure that I still preserve the correct reduction with the rounding edits.
Thanks - wow, that was a monster thread. And caused by a 1 part in 10^8 rounding error! This is such subtle code. We should have realised when the initial report included the phrase that it "shall be quite easy to fix" ;-)
I've tested the prepareFEP
input from that thread and it still runs with OpenMM with the rounding change. However, I no longer get the same reduction as OpenMM. (It actually preserves the original angles.) OpenMM is still happy with the box vectors that are generated, though, i.e. they satisfy the internal constraints of the engine. (Referring to this comment in particular.)
I'll see if I can load a range of inputs and get them to run with each engine that we support.
Yes, the phrase "should be easy to fix" is always the kiss of death!
The instability is only coming from equations 3.16, 3.17, and 3.18 in the thesis, i.e. where rounding occurs. Choosing a consistent direction for the rounding here fixes things, i.e.:
diff --git a/corelib/src/libs/SireVol/triclinicbox.cpp b/corelib/src/libs/SireVol/triclinicbox.cpp
index 8632d7d7..1eb2feb2 100644
--- a/corelib/src/libs/SireVol/triclinicbox.cpp
+++ b/corelib/src/libs/SireVol/triclinicbox.cpp
@@ -29,6 +29,7 @@
#include <QList>
#include <QPair>
+#include <cfenv>
#include <cmath>
#include <limits>
@@ -226,10 +227,16 @@ void TriclinicBox::construct(const Vector &v0, const Vector &v1, const Vector &v
These constraints can be solved using simultaneous equations.
*/
+ auto mode = std::fegetround();
+ std::fesetround(FE_TOWARDZERO);
+
// Perform the reduction.
- auto new_v1 = this->v1 - std::round(this->v1.x() / this->v0.x()) * this->v0;
- auto tmp = this->v2 - std::round(this->v2.y() / this->v1.y()) * this->v1;
- auto new_v2 = tmp - std::round(tmp.x() / this->v0.x()) * this->v0;
+ auto new_v1 = this->v1 - std::nearbyint(this->v1.x() / this->v0.x()) * this->v0;
+ auto tmp = this->v2 - std::nearbyint(this->v2.y() / this->v1.y()) * this->v1;
+ auto new_v2 = tmp - std::nearbyint(tmp.x() / this->v0.x()) * this->v0;
+
+ // Reset the rounding mode.
+ std::fesetround(mode);
Gives:
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
(The same would be generated casting to an int
, rather than using std::nearbyint
.)
That's excellent. I didn't know about std::nearbyint
either, so have learned something new :-)
Neither did I :-) It's nice that it respects the current rounding mode, which std::round
doesn't. You can also get stable behaviour by biasing std::round
, e.g:
Adding a 1e-7 offset still flips the angles:
// Perform the reduction.
auto new_v1 = this->v1 - std::round(1e-7 + this->v1.x() / this->v0.x()) * this->v0;
auto tmp = this->v2 - std::round(1e-7 + this->v2.y() / this->v1.y()) * this->v1;
auto new_v2 = tmp - std::round(1e-7 + tmp.x() / this->v0.x()) * this->v0;
Gives:
59.99999999999207 59.99999999999207 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
120.00000000000001 120.00000000000001 90.0
Whereas:
// Perform the reduction.
auto new_v1 = this->v1 - std::round(1e-6 + this->v1.x() / this->v0.x()) * this->v0;
auto tmp = this->v2 - std::round(1e-6 + this->v2.y() / this->v1.y()) * this->v1;
auto new_v2 = tmp - std::round(1e-6 + tmp.x() / this->v0.x()) * this->v0;
Gives:
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
120.0000229 120.0000229 90.0
It seems that the simplest solution is to add a small bias to the rounding as above, e.g.:
this->v2 = this->v2 - this->v1*std::round(1e-6 + this->v2.y() / this->v1.y());
this->v2 = this->v2 - this->v0*std::round(1e-6 + this->v2.x() / this->v0.x());
this->v1 = this->v1 - this->v0*std::round(1e-6 + this->v1.x() / this->v0.x());
For the triclinic box types that we support, i.e. those generated by BioSimSpace.Box
, I've checked that the angles are preserved on solvation and on repeated write/read to AMBER and GROMACS format. All engines are happy with the input, too.
The angles might flip if the box is non-standard, i.e. some random triclinic cell, but it will converge to a value, rather than oscillating. In practice, I doubt that we will be dealing with many (if any) systems of this type, so we should be okay. That said, I might look up some other common triclinic boxes that we don't currently support. (By support, I mean have a way of generating the box dimensions and angles internally, obviously we can read/write and solvate with any old values, as long as they aren't nonsense.)
Note that this approach effectively favours the box that is tilted to the right, if you started with a box with angles (60, 60, 90), then they would be converted to (120, 120, 90), but there would be no oscillation (assuming the values are close enough to 60 to start with).
I think this is ok, as long as we document it in the API.
Thanks, I'll document it in the BioSimSpace.Box
docs here, and will add some info to the TriclinicBox
constructor. I'll have a think to see if there's a more elegant solution, but I feel like anything is going to have some limitation.
Many thanks for getting to the bottom of this @chryswoods and @lohedges. In which sire
versions is this fix going to be available?
I'm currently in the process of backporting to 2023.2.3. I'm just ironing out some issues with the BioSImSpace tests, which had some assumptions based on the fact that the box angles could flip on reduction, which is no longer the case. I'll then do a single backport to the specific Sire version and Python variant that you need. I am correct in thinking that you'd like a 2023.1.4 for Python 3.8 on Linux?
Sounds great, thanks @lohedges, yes a 2023.1.4
that fixes this would be much appreciated!
This is now backported to 2023.1.4
for Linux Python 3.8 here. If you need more Python variants for Linux, just let me know. If you need other platforms, e.g. macOS, then we'd need to wait for @chryswoods to build them on his return.
Cheers.
That's amazing, thanks a lot @lohedges, I'll give it a try!
Sadly re-opening this, since the fix doesn't appear to work with OpenMM
for rhombic dodecahedron boxes, although I did test all box types previously :man_shrugging: For example:
In [1]: import BioSimSpace as BSS
In [2]: mol = BSS.Parameters.openff_unconstrained_2_0_0("C").getMolecule()
In [3]: box, angles = BSS.Box.rhombicDodecahedronHexagon(3*BSS.Units.Length.nanometer)
In [4]: solvated = BSS.Solvent.tip3p(mol, box=box, angles=angles)
In [5]: p = BSS.Protocol.Minimisation(steps=1000)
In [6]: process = BSS.Process.OpenMM(solvated, p)
In [7]: process.start()
Out[7]: BioSimSpace.Process.OpenMM(<BioSimSpace.System: nMolecules=622>, BioSimSpace.Protocol.Minimisation(steps=1000, restraint=None, force_constant=10.0 kcal_per_mol/angstrom**2), exe='/home/lester/.conda/envs/openbiosim/bin/sire_python', name='openmm', platform='CPU', work_dir='/tmp/tmpijqr1ske', seed=None)
In [8]: process.wait()
In [9]: process.isError()
Out[9]: True
In [10]: process.stderr()
/home/lester/.conda/envs/openbiosim/lib/python3.10/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
return self._float_to_str(self.smallest_subnormal)
Traceback (most recent call last):
File "/tmp/tmpijqr1ske/openmm_script.py", line 34, in <module>
simulation = Simulation(prmtop.topology,
File "/home/lester/.conda/envs/openbiosim/lib/python3.10/site-packages/openmm/app/simulation.py", line 103, in __init__
self.context = mm.Context(self.system, self.integrator, platform, platformProperties)
File "/home/lester/.conda/envs/openbiosim/lib/python3.10/site-packages/openmm/openmm.py", line 10069, in __init__
_openmm.Context_swiginit(self, _openmm.new_Context(*args))
openmm.OpenMMException: NonbondedForce: The cutoff distance cannot be greater than half the periodic box size. For more information, see https://github.com/openmm/openmm/wiki/Frequently-Asked-Questions#boxsize
In [11]: box, angles = BSS.Box.rhombicDodecahedronSquare(3*BSS.Units.Length.nanometer)
In [12]: solvated = BSS.Solvent.tip3p(mol, box=box, angles=angles)
In [13]: process = BSS.Process.OpenMM(solvated, p)
In [14]: process.start()
Out[14]: BioSimSpace.Process.OpenMM(<BioSimSpace.System: nMolecules=632>, BioSimSpace.Protocol.Minimisation(steps=1000, restraint=None, force_constant=10.0 kcal_per_mol/angstrom**2), exe='/home/lester/.conda/envs/openbiosim/bin/sire_python', name='openmm', platform='CPU', work_dir='/tmp/tmpmd9xk4t6', seed=None)
In [15]: process.wait()
In [16]: process.isError()
Out[16]: True
In [17]: process.stderr()
Traceback (most recent call last):
File "/tmp/tmpmd9xk4t6/openmm_script.py", line 16, in <module>
prmtop = AmberPrmtopFile('openmm.prm7')
File "/home/lester/.conda/envs/openbiosim/lib/python3.10/site-packages/openmm/app/amberprmtopfile.py", line 156, in __init__
top.setPeriodicBoxVectors(computePeriodicBoxVectors(*(box[1:4] + box[0:1]*3)))
File "/home/lester/.conda/envs/openbiosim/lib/python3.10/site-packages/openmm/app/internal/unitcell.py", line 60, in computePeriodicBoxVectors
cz = math.sqrt(c_length*c_length-cx*cx-cy*cy)
ValueError: math domain error
The box reduction is consistent, i.e. doesn't oscillate, but OpenMM
is unhappy with it. There is no issue with the other engines that we support, or using the other triclinic box types with OpenMM
. I'll see if there is a simple workaround.
Actually, this issue seems to be independent of the previous fix, i.e. I get the exact same errors if I don't bias the rounding. This is strange, since internally OpenMM
uses the same lattice reduction as us. I'll close this thread and open a new issue to debug.
Re-opening this since the rounding "fix" appears to cause issues with OpenMM for certain input. Internally, OpenMM applies it's own internal lattice reduction check and this is failing with a ~1e-7 rounding error. See #59 for details and this particular comment for a full explanation of the issue.
We are not yet sure if the issue would be a problem for all input of this type, or whether it's a result of applying the new reduction scheme to SOMD input files generated with the old reduction scheme. Will report back when we obtain inputs to tests with.
I can confirm that the issue is with the original input to Flare/prepareFEP. Will see if I can adjust the rounding scheme so that it works with their LEaP
output too.
@xiki-tempula, @msuruzhon: Just tagging you into this discussion.
It turns out that the previous fix, in which we biased the rounding to achieve stable box angles under repeat lattice reduction, caused issues for input used by other industrial partners. With their default triclinic space, OpenMM (and therefore SOMD) would crash on startup due to the internal check for reduced lattice vectors failing. See my comment here for a full explanation of the issue. Note that the check was failing due to a 1e-7 rounding error!
I've revisited the issue and have found that we can solve the issue by biasing the rounding in the opposite direction to the previous fix, i.e. towards left tilting boxes. With this change OpenMM is happy with the resulting lattice reduction and we still achieve stable box angles under repeated lattice reduction and conversion to/from different molecular formats. It is also possible to use a smaller bias, which makes the reduction more accurate than before.
With this change, solvating in a system with angles of (120, 120, 90)
will flip to (60, 60, 90)
, but will be stable thereafter. Note that (60, 60, 90)
is the default used by GROMACS, as described here, and is the value that would be generated if you use the internal BioSimSpace.Box
functionality, e.g.:
In [1]: import BioSimSpace as BSS
In [2]: mol = BSS.Parameters.openff_unconstrained_2_0_0("C").getMolecule()
In [3]: solvated = BSS.Solvent.tip3p(mol.toSystem(), box=3*[4*BSS.Units.Length.nanometer], angles=[120*BSS.Units.Angle.degree, 120*BSS.Units.
...: Angle.degree, 90*BSS.Units.Angle.degree])
In [4]: solvated.getBox()
Out[4]:
([40.0000 A, 40.0000 A, 40.0000 A],
[60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
With the change in rounding fix, I've run simulations with each engine that we support to make sure that the angles are preserved under NPT simulations and conversion to and from different formats. In doing so I've discovered that the bias needs to be sufficiently large (about 1e-8) to stop GROMACS and MDAnalysis flipping the angles. See below for some output from short NPT runs with AMBER(sander) and GROMACS.
In [1]: import BioSimSpace as BSS
In [2]: mol = BSS.Parameters.openff_unconstrained_2_0_0("C").getMolecule()
In [3]: box, angles = BSS.Box.rhombicDodecahedronSquare(4*BSS.Units.Length.nanometer)
In [4]: angles
Out[4]: [60.0000 degrees, 60.0000 degrees, 90.0000 degrees]
In [5]: solvated = BSS.Solvent.tip3p(mol.toSystem(), box=box, angles=angles)
In [6]: solvated.getBox()
Out[6]:
([40.0000 A, 40.0000 A, 40.0000 A],
[60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
In [7]: p = BSS.Protocol.Minimisation(steps=1000)
In [8]: process = BSS.Process.Gromacs(solvated, p)
In [9]: process.start()
Out[9]: BioSimSpace.Process.Gromacs(<BioSimSpace.System: nMolecules=1498>, BioSimSpace.Protocol.Minimisation(steps=1000, restraint=None, force_constant=10.0 kcal_per_mol/angstrom**2), exe='/usr/local/gromacs/bin/gmx', name='gromacs', work_dir='/tmp/tmppas4j9h5', seed=None)
In [10]: process.wait()
In [11]: minimised = process.getSystem()
In [12]: minimised.getBox()
Out[12]:
([40.0000 A, 40.0000 A, 40.0000 A],
[60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
In [13]: p = BSS.Protocol.Equilibration(pressure=BSS.Units.Pressure.atm, runtime=0.02*BSS.Units.Time.nanosecond)
In [14]: process = BSS.Process.Gromacs(minimised, p)
In [15]: process.start()
Out[15]: BioSimSpace.Process.Gromacs(<BioSimSpace.System: nMolecules=1498>, BioSimSpace.Protocol.Equilibration(timestep=2.0000 fs, runtime=0.0200 ns, temperature_start=300.0000 K, temperature_end=300.0000 K, pressure=1.0000 atm, report_interval=100, restart_interval=500, restraint=None, force_constant=10.0 kcal_per_mol/angstrom**2), exe='/usr/local/gromacs/bin/gmx', name='gromacs', work_dir='/tmp/tmpegduaq1q', seed=None)
In [16]: process.wait()
In [17]: equilibrated = process.getSystem()
In [18]: equilibrated.getBox()
Out[18]:
([40.1424 A, 40.1424 A, 40.1424 A],
[60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
In [19]: process.getTrajectory().getFrames()[-1].getBox()
Out[19]:
([40.1424 A, 40.1424 A, 40.1424 A],
[60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
In [20]: process = BSS.Process.Amber(minimised, p)
In [21]: process.start()
Out[21]: BioSimSpace.Process.Amber(<BioSimSpace.System: nMolecules=1498>, BioSimSpace.Protocol.Equilibration(timestep=2.0000 fs, runtime=0.0200 ns, temperature_start=300.0000 K, temperature_end=300.0000 K, pressure=1.0000 atm, report_interval=100, restart_interval=500, restraint=None, force_constant=10.0 kcal_per_mol/angstrom**2), exe='/home/lester/.conda/envs/openbiosim/bin/sander', name='amber', work_dir='/tmp/tmpmn93beu_', seed=None)
In [22]: process.wait()
In [23]: process.getSystem().getBox()
Out[23]:
([40.1605 A, 40.1605 A, 40.1605 A],
[60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
In [24]: process.getTrajectory().getFrames()[-1].getBox()
Out[24]:
([40.1605 A, 40.1605 A, 40.1605 A],
[60.0000 degrees, 60.0000 degrees, 90.0000 degrees])
Hopefully this is a satisfactory solution. I'm going to run some longer simulations just to check, then will submit a PR when I'm happy.
Thank you.
We are now happy with the code in the PR and have merged it into devel
. We think this has now fixed the issue ;-)
Please do reopen if there are any more problems.
Describe the bug Solvate a system with rhombic dodecahedron square box would give a non-integer degree
To Reproduce
gives 119.99998318630921
Expected behavior gives 120 or at least to the precision of double