michellab / Sire

Sire Molecular Simulations Framework
http://siremol.org
GNU General Public License v3.0

Sire::IO::AmberPrm topology file giving strange MD results #338

Closed AdeleHardie closed 3 years ago

AdeleHardie commented 3 years ago

I am running some simple production MD, and am seeing protein unfolding where there shouldn't be any. This is the code I run:

import BioSimSpace as BSS
from shutil import copyfile

system = BSS.IO.readMolecules(['system_bash.prm7', 'system_bash.rst7'])
protocol = BSS.Protocol.Production(runtime=5*BSS.Units.Time.nanosecond, restart_interval=2500)
process = BSS.Process.Amber(system, protocol, exe='/home/adele/software/amber20_20.04/amber20/bin/pmemd.cuda')

process.start()
process.wait()
copyfile(f'{process.workDir()}/amber.nc', 'production_BSS.nc')

This results in my protein starting to unfold (particularly near the N-terminus, highlighted in blue): [image] I have also seen complete unfolding after 100 ns previously (which was what led me to investigate): [video: Outlook-tvjmo5nz]

However, when I run the following:

import BioSimSpace as BSS
from shutil import copyfile

system = BSS.IO.readMolecules(['system_bash.prm7', 'system_bash.rst7'])
protocol = BSS.Protocol.Production(runtime=5*BSS.Units.Time.nanosecond, restart_interval=2500)
process = BSS.Process.Amber(system, protocol, exe='/home/adele/software/amber20_20.04/amber20/bin/pmemd.cuda')

copyfile('system_bash.prm7', f'{process.workDir()}/amber.prm7')
copyfile('system_bash.rst7', f'{process.workDir()}/amber.rst7')

process.start()
process.wait()
copyfile(f'{process.workDir()}/amber.nc', 'production_BSS.nc')

I get a stable protein, as expected and as I've seen for MD runs outside of BSS. Is it possible that the reading/writing of the system is resulting in a topology different enough to destabilise the protein on such a short timescale?

I'm including relevant files: file-diff.zip

(I also tried this with only copying over the original topology file and using the BSS-generated rst, which worked fine; this points me towards the topology being the issue.)
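For reference, the same round-trip topology can be generated without running any MD, so the Sire-written file can be diffed directly against the original (a minimal sketch; the output file names are illustrative):

import BioSimSpace as BSS

# Read the original AMBER files and write them straight back out, so that
# the regenerated topology can be compared against system_bash.prm7.
system = BSS.IO.readMolecules(['system_bash.prm7', 'system_bash.rst7'])
BSS.IO.saveMolecules('system_roundtrip', system, ['prm7', 'rst7'])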

I am using the dev version of BSS downloaded yesterday (02/02), but the release version downloaded today (03/02) gave the same results.

On import I get:

/home/adele/anaconda3/envs/BSS-env/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility. Expected 80 from C header, got 88 from PyObject
  return f(*args, **kwds)

but michellab/BioSimSpace#37 makes me think it shouldn't matter?

Might be unrelated (or a separate issue), but when I run:

protein = BSS.IO.readMolecules('open.pdb')
protein_parm_process = BSS.Parameters.ff14SB(protein.getMolecule(0))
protein_parm = protein_parm_process.getMolecule()
solvated = BSS.Solvent.tip3p(protein_parm.toSystem(), shell=10*BSS.Units.Length.angstrom, ion_conc=0.15, is_neutral=True)
minimisation_protocol = BSS.Protocol.Minimisation(steps=2000)
minimisation_process = BSS.Process.Amber(solvated, minimisation_protocol)
minimisation_process.start()
minimisation_process.wait()
minimised = minimisation_process.getSystem()

I get the following Sire issue:

The property of type SireMol::AtomCoords is incompatible with the layout with UID {d01e83b8-690a-45b5-9e27-73ebb2651b85}
IncompatibleError: Unable to update 'coordinates' for molecule index '1'

When I load the molecule fresh from the workDir I can continue. Additionally, putting a system parameterised/solvated in leap outside BSS through the same process does not raise an error. Let me know if you think this is related, or if this is a separate/non-issue.

jmichel80 commented 3 years ago

ah yes that's a huge difference in 1,4 EEL energies

Something wrong with the 1,4 scaling factors ?

lohedges commented 3 years ago

Thanks for the update, that's really helpful. I feel like we're getting somewhere.

I see that ParmEd agrees with the tLEaP files, so it's only the ones generated by Sire.IO.AmberPrm that are giving weird results. I also note that the ParmEd and tLEaP topology files give the same energies (within error) using the frame_0.rst7 coordinates with both sander and pmemd.cuda, i.e.

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1      -1.3060E+05     1.4601E+01     1.2988E+02     N        3128

 BOND    =      935.0701  ANGLE   =     2453.6861  DIHED      =     3859.7638
 VDWAALS =    14410.2600  EEL     =  -165042.1226  HBOND      =        0.0000
 1-4 VDW =     1087.9674  1-4 EEL =    11691.6519  RESTRAINT  =        0.0000

In contrast, for Sire.IO.AmberPrm we match with sander but have the following with pmemd.cuda:

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1      -1.4274E+05     1.4618E+01     1.3410E+02     N        3128

 BOND    =      935.0701  ANGLE   =     2453.6861  DIHED      =     3859.7643
 VDWAALS =    14410.2669  EEL     =  -165042.1627  HBOND      =        0.0000
 1-4 VDW =      941.7919  1-4 EEL =     -299.4411  RESTRAINT  =        0.0000

(Note that the pmemd.cuda files provided referred to frames from the stable trajectory.)

The fact that the Sire.IO.AmberPrm topology gives different energies at step zero implies that this isn't an error that is accumulated as the simulation progresses, rather an error that is present from the outset.

Pardon my ignorance, but are there terms in the topology file that are only used by pmemd, or is it just that the potential calculation is performed in a different way? I'm just trying to pinpoint which parts of the topology file to focus on. From the energies you would assume that there is something wrong with the 1-4 factors (as @jmichel80 suggests), but the energies for these terms are the same when using sander.

It would be good to know if this issue is unique to this system, or a more general problem.
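(For anyone wanting to reproduce the single-point comparisons above, something like the following should work — a rough sketch, where the &cntrl options are common single-point settings rather than the exact input used here:)

import subprocess

# Minimal AMBER input: a one-step minimisation, so that the NSTEP 1 record
# in the output reports the single-point energy of the input coordinates.
mdin = """single-point energy
&cntrl
  imin=1, maxcyc=1, ncyc=1, ntb=1, cut=8.0,
/
"""
with open('sp.in', 'w') as f:
    f.write(mdin)

for exe in ('sander', 'pmemd'):
    subprocess.run([exe, '-O', '-i', 'sp.in', '-o', f'sp_{exe}.out',
                    '-p', 'system.prm7', '-c', 'frame_0.rst7'], check=True)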

lohedges commented 3 years ago

I guess we need to check SCEE_SCALE_FACTOR and SCNB_SCALE_FACTOR terms. Since these apply to dihedrals, I wonder if there is an issue with regards to how these potential terms are accumulated on the GPU. I note this in the Sire code:

//get the NB14 scaling factor for this dihedral. This should be set
        //for only the first dihedral found that has this pair, otherwise
        //we would risk Amber double-counting this parameter

Perhaps this double counting will never occur with sander, but could be problematic with pmemd.

I think we also represent the dihedral terms slightly differently (although equivalently, mathematically) and perhaps this causes issues with the way the potential is computed on the GPU?

jmichel80 commented 3 years ago

Perhaps that would explain differences in the 3rd or 4th decimal place for DIHED, EEL, and VDWAALS, but I doubt it for the 1,4 terms; the difference is very large. A parsing error in pmemd is more likely. Check if the same behaviour occurs with pmemd.MPI.

Comparing the SCEE_SCALE_FACTOR and SCNB_SCALE_FACTOR entries, I see that system_BSS.prm7 has the first 19 entries set to 0.0, whereas system_leap.prm7 and system_parmed.prm7 only have the last 4 entries set to 0.0.
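(For reference, a quick way to pull these records out for a side-by-side comparison — a minimal sketch using plain text parsing, with the file names from the attached archive:)

def read_prm7_flag(filename, flag):
    """Return the values of a numeric %FLAG section of an AMBER prm7 file."""
    values, in_section = [], False
    with open(filename) as f:
        for line in f:
            if line.startswith('%FLAG'):
                in_section = line.split()[1] == flag
            elif in_section and not line.startswith('%'):  # skip %FORMAT/%COMMENT
                values.extend(float(x) for x in line.split())
    return values

for f in ('system_BSS.prm7', 'system_leap.prm7', 'system_parmed.prm7'):
    print(f, read_prm7_flag(f, 'SCEE_SCALE_FACTOR'))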

Is this related to issues we encountered a few years ago when debugging the Sire prm7 writer?

https://github.com/michellab/BioSimSpace/issues/47 https://github.com/michellab/Sire/issues/233

Perhaps we have to dig into the sander and pmemd sources to understand why they would handle the same input prm7 differently. Note this comment in the pmemd source code, prmtop_dat.F90:

!We found the SCEE scale factor data so read it in, there should be
!a value for each dihedral type. Note while there is one for each
!dihedral type not all are actually used in the calculation since
!1-4's are only done over unique dihedral terms. If multiple dihedrals
!for a given set of atom types exist, with different pn's for example
!then the last value of SCEE/SCNB should be used.

Perhaps pmemd makes implicit assumptions that sander doesn’t.

It would make sense that the issue is with multi-term dihedrals, as these are more frequently used for backbone torsions. Messing up the backbone torsions would explain the unfolding behaviour reported by Adele.

lohedges commented 3 years ago

Thanks for the insight, much appreciated. The ordering and number of terms will be different due to the way that we treat dihedrals. Since the parser originally worked in parallel, we chose to sort dihedral terms before writing to ensure consistent ordering. (This allows us to check files using diffs, and also to preserve the topology on read/write round trips.) I believe that there will also be a different number of entries due to the way that we handle multi-part dihedrals and impropers.

For the parallel part: I think we've already disabled parallel parsing due to a subtle issue discovered by Paolo that was never resolved. As a quick check, I can just turn off the sorting for dihedrals to see if that changes things. (If we aren't parsing in parallel then the ordering will be consistent anyway.)

Looking at the other thread you linked to (and the thread linked from that), it does seem that the issue could be related. Here is the correct issue discussion in the Sire repository. It seems that we have some self-consistency problems with our issue tracking and closing. (This is why I prefer to keep things in one thread in the repo to which it corresponds :-)) As such, I'm not sure that issue was ever suitably resolved. I'll go back through the discussion to see if there is any info that might give us more insight into this problem.

I might try to recruit @chryswoods tomorrow morning, if he has time.

Cheers.

lohedges commented 3 years ago

Just a note from the other issue thread regarding the difference in the number of terms in the POINTERS record:

Be careful chasing "POINTERS" - there will be differences because we obtain a different number of 1-4 parameters, dihedral parts and CLJ parameters when reading from a system. This is because there are different equivalent ways of working this out, and so we cannot preserve the exact information during a AmberPrm => System => AmberPrm conversion.

lohedges commented 3 years ago

I've just played around with the sorting and this looks like it is required, since I don't get consistent energies with sander if I disable it.

jmichel80 commented 3 years ago

I suggest also posting a query on the Amber mailing list; their developers may want to know why a prm7 input gives inconsistent energies between sander and pmemd. Also check pmemd on CPU, as the issue may be with the cuda kernel only.

lohedges commented 3 years ago

I've just checked regular CPU pmemd and the issue still persists. The Sire.IO.AmberPrm topology gives the same energies as with pmemd.cuda, so it's not related to the GPU.

lohedges commented 3 years ago

Looking above at the comment from the pmemd source, this seems to contrast with the comment in Sire:

pmemd

!We found the SCEE scale factor data so read it in, there should be
!a value for each dihedral type. Note while there is one for each
!dihedral type not all are actually used in the calculation since
!1-4's are only done over unique dihedral terms. If multiple dihedrals
!for a given set of atom types exist, with different pn's for example
!then the last value of SCEE/SCNB should be used.

Sire

//get the NB14 scaling factor for this dihedral. This should be set
        //for only the first dihedral found that has this pair, otherwise
        //we would risk Amber double-counting this parameter

Sire is using the first dihedral as its reference, pmemd is using the last. I'll try changing the Sire code (perhaps writing the term for all pairs as a start) and post an updated topology.

lohedges commented 3 years ago

Actually, I think Sire takes care to ensure that only unique dihedral terms are written to the file, so this shouldn't be an issue.

lohedges commented 3 years ago

I've had a look through the ParmEd code and they seem to have a fix for pmemd. In particular:

# pmemd likes to skip torsions with periodicities of 0, which may be
        # present as a way to hack entries into the 1-4 pairlist. See
        # https://github.com/ParmEd/ParmEd/pull/145 for discussion. The solution
        # here is to simply set that periodicity to 1.
        inst._cleanup_dihedrals_with_periodicity_zero()
...

def _cleanup_dihedrals_with_periodicity_zero(self):
        """
        For torsions with only a single term and a periodicity set to 0, make sure pmemd still
        properly recognizes the necessary exception parameters. update_dihedral_exclusions will
        make sure that if a dihedral has a type pn0 *and* ignore_end is set to False (which means
        that it is required to specify exclusions), then it is the *only* torsion between those
        atoms in the system. This allows us to scan through our dihedrals, look for significant
        terms that have pn==0, and simply add another dihedral with pn=1 and k=0 to ensure that
        pmemd will always get that exception correct
        """
        new_dihedrals = []
        for dih in self.dihedrals:
            if dih.ignore_end or dih.type.per != 0:
                continue
            # If we got here, ignore_end must be False and out periodicity must be 0. So add
            # another dihedral
            dt = DihedralType(0, 1, 0, dih.type.scee, dih.type.scnb, list=self.dihedral_types)
            self.dihedral_types.append(dt)
            new_dihedrals.append(
                Dihedral(dih.atom1, dih.atom2, dih.atom3, dih.atom4, improper=dih.improper,
                         ignore_end=False, type=dt)
            )
            # Now that we added the above dihedral, we can start ignoring the end-group interactions
            # on this dihedral
            dih.ignore_end = True
        if new_dihedrals:
            self.dihedrals.extend(new_dihedrals)

Perhaps this is our issue?

lohedges commented 3 years ago

We don't have any zero periodicity single-term dihedrals, so this isn't the problem.

lohedges commented 3 years ago

I've tried a different protein-ligand system (one of Dom's funnel examples) and the energies also differ for the same terms using pmemd, so this appears to be a general issue. In the workshop we will use these files with OpenMM, which parses things with ParmEd. I'm not sure whether there will be potential issues there too.

jmichel80 commented 3 years ago

We can establish this by computing single-point energies for the BSS, leap, and parmed topologies using OpenMM? Since the energies are consistent with sander, and Adele reports that GROMACS works fine with BSS-generated topologies, the issue may be confined to pmemd.

lohedges commented 3 years ago

Just testing this now. I'm also checking what happens if you read/write the Sire.IO.AmberPrm topology file with ParmEd and use that with pmemd, i.e. do the problematic terms get fixed?

lohedges commented 3 years ago

If I read/write the Sire.IO.AmberPrm topology with ParmEd, then the file is identical, other than the version stamp, i.e. the energies would agree when using pmemd. There are no warnings etc when parsing the file.

lohedges commented 3 years ago

Hmm, OpenMM reports different energies for all three topology files, even when performing the minimisation using a Verlet integrator with a ridiculously small time step:

tLEaP

#"Step" "Time (ps)" "Potential Energy (kJ/mole)" "Kinetic Energy (kJ/mole)" "Total Energy (kJ/mole)" "Temperature (K)"
1 1e-20 -655892.606114115 3263805358977.2705 3263804703084.6646 9454589266.097189 505.85285946764645 0

ParmEd

"Step" "Time (ps)" "Potential Energy (kJ/mole)" "Kinetic Energy (kJ/mole)" "Total Energy (kJ/mole)" "Temperature (K)"
1 1e-20 -682508.7949292829 3336335905771.491 3336335223262.6963 9664695707.432182 402.63291614831616 0

Sire

#"Step" "Time (ps)" "Potential Energy (kJ/mole)" "Kinetic Energy (kJ/mole)" "Total Energy (kJ/mole)" "Temperature (K)" "Box Volume (nm^3)" "Speed (ns/day)" "Time Remaining"
1 1e-20 -682937.7066492349 3408511368084.4966 3408510685146.79 9873773540.270222 402.63291614831616 0 

I don't know why OpenMM reports a box volume of zero in all cases, or why the temperature agrees for the Sire and ParmEd topologies but not for the tLEaP one.

jmichel80 commented 3 years ago

‘If I read/write the Sire.IO.AmberPrm topology with ParmEd, then the file is identical, other than the version stamp, i.e. the energies would agree when using pmemd. There are no warnings etc when parsing the file.’

—> is this a plausible temporary bugfix ?

lohedges commented 3 years ago

Okay, it looks like OpenMM defaults to using the box dimensions from the topology file (which is now deprecated by AMBER) and you need to explicitly override this when it is also set in the coordinate file, i.e.

if inpcrd.boxVectors is not None:
    simulation.context.setPeriodicBoxVectors(*inpcrd.boxVectors)

After adding this, I get the following energies:

tLEaP

#"Step" "Time (ps)" "Potential Energy (kJ/mole)" "Kinetic Energy (kJ/mole)" "Total Energy (kJ/mole)" "Temperature (K)" "Box Volume (nm^3)"
1 1e-20 -682557.4473130001 3339611920719.1265 3339611238161.679 9674185665.426857 402.63291614831616 0 

ParmEd

#"Step" "Time (ps)" "Potential Energy (kJ/mole)" "Kinetic Energy (kJ/mole)" "Total Energy (kJ/mole)" "Temperature (K)" "Box Volume (nm^3)" 
1 1e-20 -682789.2358909 3328770219475.9766 3328769536686.7407 9642779432.234129 402.63291614831616 0

Sire

#"Step" "Time (ps)" "Potential Energy (kJ/mole)" "Kinetic Energy (kJ/mole)" "Total Energy (kJ/mole)" "Temperature (K)" "Box Volume (nm^3)"
1 1e-20 -682578.6383103809 3322197876923.014 3322197194344.376 9623740674.551035 402.63291614831616 0

Again, the energies differ in all cases, although they are now closer.
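(For reference, an equivalent pure single-point check is sketched below — this assumes a recent OpenMM where the modules live under openmm rather than simtk.openmm, and the file names are illustrative:)

from openmm import VerletIntegrator
from openmm.app import PME, AmberInpcrdFile, AmberPrmtopFile, Simulation
from openmm.unit import kilojoule_per_mole, nanometer, picoseconds

prmtop = AmberPrmtopFile('system.prm7')
inpcrd = AmberInpcrdFile('system.rst7')
system = prmtop.createSystem(nonbondedMethod=PME, nonbondedCutoff=1*nanometer)
simulation = Simulation(prmtop.topology, system, VerletIntegrator(1e-8*picoseconds))
simulation.context.setPositions(inpcrd.positions)

# Override the (deprecated) box from the topology with the one from the
# coordinate file, as discussed above.
if inpcrd.boxVectors is not None:
    simulation.context.setPeriodicBoxVectors(*inpcrd.boxVectors)

state = simulation.context.getState(getEnergy=True)
print(state.getPotentialEnergy().in_units_of(kilojoule_per_mole))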

(I'm also not sure why the temperature and volume records are flipped in the log file. I've checked the state reporter and can't see anything wrong, so perhaps this is something weird within OpenMM?)

—> is this a plausible temporary bugfix ?

No, since the energies are wrong in both cases. Apologies if it wasn't clear: I was saying that ParmEd doesn't change the problematic Sire topology in any way, whereas it does make changes when reading/writing the original tLEaP topology, even though the pmemd energies are the same.

lohedges commented 3 years ago

Temperature and volume are correct, I was just misreading the number of terms. (The zero is from another reporter that I had truncated from the header.)

lohedges commented 3 years ago

Just to note: If we parse system_parmed.prm7 with Sire.IO.AmberPrm and write it back to file, then we generate exactly the same topology as we would do from reading/writing the original tLEaP files.

We have this situation...

(Here we are comparing the contents of the files written by the three programs, having read from the format to the left.)

tLEaP != tLEaP --> Sire

tLEaP != tLEaP --> ParmEd

tLEaP --> Sire == tLEaP --> ParmEd --> Sire

tLEaP --> Sire == tLEaP --> Sire --> ParmEd

It's interesting that when parsing the ParmEd output (which looks different to our topology and that of tLEaP), the file written back to disk agrees with our topology, i.e. the file diff is empty. Similarly, if we use ParmEd to parse our topology and write it back to file, then the file is unchanged. ParmEd is also self-consistent with respect to its own output, i.e. that is unchanged on repeated read/write operations.
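(The round-trip comparisons can be scripted along these lines — a minimal sketch; the file names are illustrative, and the %VERSION line is excluded from the diff since it embeds a timestamp:)

import parmed

def prm7_lines(path):
    """Read a prm7 file, dropping the %VERSION line (it embeds a timestamp)."""
    with open(path) as f:
        return [line for line in f if not line.startswith('%VERSION')]

# Round-trip the Sire-written topology through ParmEd and check for changes.
parmed.load_file('system_sire.prm7').save('system_sire_parmed.prm7', overwrite=True)
print(prm7_lines('system_sire.prm7') == prm7_lines('system_sire_parmed.prm7'))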

lohedges commented 3 years ago

Okay, we get energy differences with pmemd for a solvated alanine-dipeptide system. (Not sure why I didn't think of testing something simpler before.) At least this will give us something simpler to debug. Just seeing if I have a vacuum system anywhere, which would make things even easier.

lohedges commented 3 years ago

I just generated one myself using tLEaP. In vacuum, the pmemd energies agree.

lohedges commented 3 years ago

That said, I generated this with ff14SB, so I don't know if the issue is force-field specific.

AdeleHardie commented 3 years ago

To add some example systems: I have run 10 ns simulations with pmemd.cuda using Sire and tLeap topologies for FXa and BACE. Both systems were unstable when using the Sire topologies (images: BACE, FXa). In both images the green structure is the starting point, blue is the tLeap topology after 10 ns, and magenta is the Sire topology after 10 ns.

lohedges commented 3 years ago

I solvated the vacuum alanine-dipeptide in tLEaP using a TIP3PBOX (following this tutorial) and the resulting structure loaded and saved with Sire.IO.AmberPrm gives incorrect energies:

tLEaP

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1      -5.6859E+03     1.2214E+01     1.0417E+02     O          23

 BOND    =        0.0540  ANGLE   =        0.3620  DIHED      =        9.6440
 VDWAALS =      711.0740  EEL     =    -6461.0248  HBOND      =        0.0000
 1-4 VDW =        5.0157  1-4 EEL =       48.9355  RESTRAINT  =        0.0000

ParmEd

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1      -5.6859E+03     1.2214E+01     1.0417E+02     O          23

 BOND    =        0.0540  ANGLE   =        0.3620  DIHED      =        9.6440
 VDWAALS =      711.0740  EEL     =    -6461.0248  HBOND      =        0.0000
 1-4 VDW =        5.0157  1-4 EEL =       48.9355  RESTRAINT  =        0.0000

Sire

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1      -5.7492E+03     1.2213E+01     1.0417E+02     O          23

 BOND    =        0.0540  ANGLE   =        0.3620  DIHED      =        9.6440
 VDWAALS =      711.0740  EEL     =    -6461.0248  HBOND      =        0.0000
 1-4 VDW =        3.3048  1-4 EEL =      -12.6096  RESTRAINT  =        0.0000

I feel like we're getting closer, although I'm not sure what could be different in the treatment of water molecules between sander and pmemd. Note that I am doing all this directly with Sire, so it's not a result of BioSimSpace swapping the water topology behind the scenes. (We don't do this if it already matches the AMBER template anyway.)

jmichel80 commented 3 years ago

@lohedges the fact that the 1,4 terms are always less positive in Sire suggests that a number of 1,4 interactions are skipped or zeroed due to an incorrect scaling factor. You could hack parm/parm10.dat and parm/frcmod.ff14SB to remove all multi-dihedral terms (all the dihedral entries with a negative number in the 5th column). Then set up a new topology with tleap and see if that makes the 1,4 energies agree between codes. If the issue still persists then it must be an indexing error, though at first sight I don't see why it would happen after the addition of TIP3P water (since those molecules don't have 1,4 terms).

lohedges commented 3 years ago

I've just checked Sire 2019.1.0 for my own sanity and the issue is still present. It seems like it is something that has been present with the parser from the outset, rather than anything to do with recent fixes for incorrect POINTERS, etc.

I'll try your suggestions and report back if I make any progress.

lohedges commented 3 years ago

I've removed the multi-dihedral terms as suggested and it makes things worse. The 1-4 VDW is similar, but the 1-4 EEL term is even less positive than before. Both the original tLEaP and ParmEd topologies still give the same energies. (Different to before, but the same as each other.)

jmichel80 commented 3 years ago

I have run tests on alanine dipeptide in vacuum with the following inputs: vacuum.zip

and I observe inconsistencies in single point energies in vacuum with pmemd.

(To reproduce: extract the attached zip, edit COMPARE_SP to point to the paths for sander/pmemd, and execute the bash script.)

================= SANDER BSS single point vacuum ==========

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1      -1.3334E+01     5.4688E+00     1.8394E+01     N           7

 BOND    =        0.0206  ANGLE   =        0.3620  DIHED      =        9.6440
 VDWAALS =        2.8120  EEL     =      -80.1238  HBOND      =        0.0000
 1-4 VDW =        5.0157  1-4 EEL =       48.9355  RESTRAINT  =        0.0000
================= SANDER leap single point vacuum ==========

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1      -1.3334E+01     5.4688E+00     1.8394E+01     N           7

 BOND    =        0.0206  ANGLE   =        0.3620  DIHED      =        9.6440
 VDWAALS =        2.8120  EEL     =      -80.1238  HBOND      =        0.0000
 1-4 VDW =        5.0157  1-4 EEL =       48.9355  RESTRAINT  =        0.0000
================= PMEMD BSS single point vacuum ==========

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1      -6.0261E+01     1.3445E+01     4.1055E+01     HH32       21

 BOND    =        0.0206  ANGLE   =        0.3620  DIHED      =        9.6440
 VDWAALS =        2.8120  EEL     =      -63.7951  HBOND      =        0.0000
 1-4 VDW =        3.3048  1-4 EEL =      -12.6096  RESTRAINT  =        0.0000
================= PMEMD leap single point vacuum ==========

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1       2.9947E+00     1.3750E+01     4.3485E+01     HH32       21

 BOND    =        0.0206  ANGLE   =        0.3620  DIHED      =        9.6440
 VDWAALS =        2.8120  EEL     =      -63.7951  HBOND      =        0.0000
 1-4 VDW =        5.0157  1-4 EEL =       48.9355  RESTRAINT  =        0.0000

Note that the pmemd and sander EEL energies differ because pmemd seems to still use PME for a non-periodic input, whereas sander uses Coulombic interactions without a cutoff.

jmichel80 commented 3 years ago

An update: I deleted the %FLAG SCEE_SCALE_FACTOR and %FLAG SCNB_SCALE_FACTOR entries from both the leap and bss topologies and repeated the single-point energy calculations. I obtained identical results, with both codes indicating that they used default SCEE/SCNB values. Therefore the discrepancy in the calculation of the 1,4 non-bonded terms is not due to the 1,4 scaling factors.

lohedges commented 3 years ago

Hmm, interesting. I wonder what was different with our vacuum systems. Perhaps I had simply copied across the wrong file to the cluster in my debugging haze.

jmichel80 commented 3 years ago

Another update. I zeroed the charges on all atoms but the first 1,4 pair (ACE1:H1 and ACE1:O). This gives the same 1,4 EEL single-point energy between sander and pmemd with a BSS topology.

I also removed all negative indices in the 3rd entry of each dihedral in both topologies. This would cause over-counting of 1,4 interactions with multi-term dihedrals. It manifests as a change in the 1,4 pmemd energies for both topologies, but NOT for sander.

================= SANDER BSS single point vacuum ==========

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1      -1.3334E+01     5.4688E+00     1.8394E+01     N           7

 BOND    =        0.0206  ANGLE   =        0.3620  DIHED      =        9.6440
 VDWAALS =        2.8120  EEL     =      -80.1238  HBOND      =        0.0000
 1-4 VDW =        5.0157  1-4 EEL =       48.9355  RESTRAINT  =        0.0000
================= SANDER leap single point vacuum ==========

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1      -1.3334E+01     5.4688E+00     1.8394E+01     N           7

 BOND    =        0.0206  ANGLE   =        0.3620  DIHED      =        9.6440
 VDWAALS =        2.8120  EEL     =      -80.1238  HBOND      =        0.0000
 1-4 VDW =        5.0157  1-4 EEL =       48.9355  RESTRAINT  =        0.0000
================= PMEMD BSS single point vacuum ==========

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1      -6.0532E+01     1.4265E+01     4.1055E+01     HH32       21

 BOND    =        0.0206  ANGLE   =        0.3620  DIHED      =        9.6440
 VDWAALS =        2.8120  EEL     =      -63.7951  HBOND      =        0.0000
 1-4 VDW =        5.6818  1-4 EEL =      -15.2570  RESTRAINT  =        0.0000
================= PMEMD leap single point vacuum ==========

   NSTEP       ENERGY          RMS            GMAX         NAME    NUMBER
      1       4.4690E+01     1.5820E+01     4.3485E+01     HH32       21

 BOND    =        0.0206  ANGLE   =        0.3620  DIHED      =        9.6440
 VDWAALS =        2.8120  EEL     =      -63.7951  HBOND      =        0.0000
 1-4 VDW =        8.5638  1-4 EEL =       87.0830  RESTRAINT  =        0.0000  

What this indicates is that sander (at least the version in AmberTools20) uses its own internal logic to work out which 1,4 pairs to consider for non-bonded calculations, whereas pmemd relies on a particular layout of FLAG DIHEDRALS_INC_HYDROGEN and FLAG DIHEDRALS_WITHOUT_HYDROGEN to do so...

lohedges commented 3 years ago

Thanks, @jmichel80 this is really helpful. I feel like we're close to getting to the bottom of it. I'll check to see how those records are generated in Sire and also see how ParmEd writes them in case there are any clues as to the specific ordering that is required.

lohedges commented 3 years ago

Just to say that I re-tried removing the dihedral sorting and I still get the same incorrect energies with pmemd. (Before I got different incorrect energies, but I disabled sorting for all terms, rather than just the dihedrals.)

lohedges commented 3 years ago

This is the code that supposedly takes care of the double counting issue:

        auto ignore_14 = [](QVector<qint64> &dihs)
        {
            QSet<QPair<qint64, qint64> > seen_dihedrals;
            seen_dihedrals.reserve(dihs.count()/5);

            for (int i=0; i<dihs.count(); i+=5)
            {
                //NB14 must be computed only the first dihedral found that has this pair,
                //otherwise we would risk Amber double-counting this parameter
                const QPair<qint64, qint64> nb14pair(dihs[i], dihs[i+3]);

                if (seen_dihedrals.contains(nb14pair))
                {
                    dihs[i+2] = -dihs[i+2];
                }
                else if (dihs[i+2] > 0)
                {
                    seen_dihedrals.insert(nb14pair);
                }
            }
        };

        ignore_14(all_dihs_inc_h);
        ignore_14(all_dihs_exc_h);

Disabling this (by commenting out the last two lines) makes no difference to the sander results, but changes the 1-4 VDW and 1-4 EEL values. (Still incorrect, but different.)

I am assuming that pmemd requires the NB14 scaling to be set on a specific dihedral. The source code suggests the last entry when multiple dihedrals contain the same pair, but inverting the order of the loop makes no difference to the result. I then wondered whether it would fail when we sorted dihedrals, but that makes no difference either. I've also tried disabling parallelisation while parsing (in case that changed the order), but again no difference.

I'm still getting my head around the logic of the code above, e.g. why the 3rd atom index in the dihedral is negated if ignored. (In the prmtop format a negative third atom index signals that the 1-4 interaction for that dihedral should be skipped.) Perhaps there is something simple that is wrong here.

I was also wondering if, following the above, we then need to order DIHEDRALS_INC_HYDROGEN and DIHEDRALS_WITHOUT_HYDROGEN such that the terms that aren't ignored come last? I'll check the ordering of the original tLEaP files for your vacuum system to see if it throws up any clues.

I feel like we're close.

lohedges commented 3 years ago

Comparing the two files, it looks like all of the tLEaP entries are sorted, whereas ours aren't, even though sorting is turned on. Perhaps something funny is going on with the sorting, I'll try to reproduce the same ordering.

lohedges commented 3 years ago

Scratch that, looking at the files the wrong way round :man_facepalming:

jmichel80 commented 3 years ago

I also compared the 1,4 pairs list implied by the dihedral entries and found exactly the same pairs, albeit in a different order. Everything suggests the issue is the different ordering, perhaps causing an incorrect lookup of charges (DIHED is consistent, so the coordinates are consistent). We may be able to work this out by recompiling pmemd with verbose/debugging statements.
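(A sketch of how such an implied pair list can be extracted — this relies on the documented prmtop convention that dihedral atom entries are stored as 3*(index - 1), that a negative 3rd entry flags a skipped 1-4 interaction, and that a negative 4th entry marks an improper; the file names are illustrative:)

def implied_14_pairs(filename):
    """Return the set of 1-4 atom pairs implied by the dihedral records."""
    pairs = set()
    for flag in ('DIHEDRALS_INC_HYDROGEN', 'DIHEDRALS_WITHOUT_HYDROGEN'):
        vals, in_section = [], False
        with open(filename) as f:
            for line in f:
                if line.startswith('%FLAG'):
                    in_section = line.split()[1] == flag
                elif in_section and not line.startswith('%'):
                    vals.extend(int(x) for x in line.split())
        for i in range(0, len(vals), 5):  # entries are (i, j, k, l, type)
            a, _, k, l, _ = vals[i:i+5]
            if k >= 0 and l >= 0:  # 1-4 not skipped, and not an improper
                pairs.add((a // 3 + 1, abs(l) // 3 + 1))
    return pairs

print(implied_14_pairs('vac_bss.prm7') == implied_14_pairs('vac_leap.prm7'))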

lohedges commented 3 years ago

Looking at the files, the obvious difference is that the impropers always come after dihedrals with tLEaP, whereas the Sire output is interleaved. It might be a red herring, but at least it's something easy to test.

lohedges commented 3 years ago

Red herring it is.

jmichel80 commented 3 years ago

I found the problem by digging in pmemd's source code. The problem is that in the bss topology of the alanine dipeptide, the 3rd entry of the DIHEDRAL_PERIODICITY section is 0:

%FLAG DIHEDRAL_PERIODICITY
%FORMAT(5E16.8)
  2.00000000E+00  2.00000000E+00  0.00000000E+00  3.00000000E+00  3.00000000E+00
  1.00000000E+00  2.00000000E+00  2.00000000E+00  3.00000000E+00  3.00000000E+00
  1.00000000E+00  3.00000000E+00  1.00000000E+00  3.00000000E+00  2.00000000E+00
  2.00000000E+00  1.00000000E+00  2.00000000E+00

I don't know why we could have torsions with 0 periodicity.

This causes pmemd to disable every non-bonded pair with this parameter in subroutine calc_dihedral_parms.

Simply changing this entry to a positive number gives agreement of the single-point energies between the bss and leap topologies.

The torsional energies would still be consistent because the dihedral force constant is 0 for this entry.

@lohedges now we just need to figure out why BSS writes null dihedral parameters...
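(For anyone wanting to check their own topologies, a minimal scan for the offending record — the file name is illustrative:)

def zero_periodicities(filename):
    """Return the zero-based indices of DIHEDRAL_PERIODICITY entries equal to 0."""
    values, in_section = [], False
    with open(filename) as f:
        for line in f:
            if line.startswith('%FLAG'):
                in_section = line.split()[1] == 'DIHEDRAL_PERIODICITY'
            elif in_section and not line.startswith('%'):
                values.extend(float(x) for x in line.split())
    return [i for i, v in enumerate(values) if v == 0.0]

print(zero_periodicities('vac_bss.prm7'))  # [2] for the topology above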

lohedges commented 3 years ago

Fantastic! I'll take a look when I fire the computer up.


lohedges commented 3 years ago

I've compared the dihedral and improper terms from the Sire.MM.AmberParams object constructed on read to the individual Sire.MM.AmberDihedral objects that are reconstructed using Sire.CAS when we convert back to AMBER format from a Sire.System. I don't see any zero periodicities or other null dihedrals.

The issue you found from digging through the source code does seem to be the same one that was flagged by the ParmEd team, which I posted above. I'll copy it here for reference:

# pmemd likes to skip torsions with periodicities of 0, which may be
        # present as a way to hack entries into the 1-4 pairlist. See
        # https://github.com/ParmEd/ParmEd/pull/145 for discussion. The solution
        # here is to simply set that periodicity to 1.
        inst._cleanup_dihedrals_with_periodicity_zero()
...

def _cleanup_dihedrals_with_periodicity_zero(self):
        """
        For torsions with only a single term and a periodicity set to 0, make sure pmemd still
        properly recognizes the necessary exception parameters. update_dihedral_exclusions will
        make sure that if a dihedral has a type pn0 *and* ignore_end is set to False (which means
        that it is required to specify exclusions), then it is the *only* torsion between those
        atoms in the system. This allows us to scan through our dihedrals, look for significant
        terms that have pn==0, and simply add another dihedral with pn=1 and k=0 to ensure that
        pmemd will always get that exception correct
        """
        new_dihedrals = []
        for dih in self.dihedrals:
            if dih.ignore_end or dih.type.per != 0:
                continue
            # If we got here, ignore_end must be False and out periodicity must be 0. So add
            # another dihedral
            dt = DihedralType(0, 1, 0, dih.type.scee, dih.type.scnb, list=self.dihedral_types)
            self.dihedral_types.append(dt)
            new_dihedrals.append(
                Dihedral(dih.atom1, dih.atom2, dih.atom3, dih.atom4, improper=dih.improper,
                         ignore_end=False, type=dt)
            )
            # Now that we added the above dihedral, we can start ignoring the end-group interactions
            # on this dihedral
            dih.ignore_end = True
        if new_dihedrals:
            self.dihedrals.extend(new_dihedrals)

I'll try to figure out where the zero periodicity is being set in our code. I'll also try setting the periodicity of null terms to one to see if that fixes things.

lohedges commented 3 years ago

Okay, I've fixed it. The solution was to set a default periodicity of one in the Sire.MM.AmberDihPart constructor here. Although I had tested for the ParmEd issue previously, the Sire.MM.AmberDihedral constructor wasn't even getting to the part where I performed the check, since there were no terms, rather than a single term with zero periodicity. (By default we construct a Sire.MM.AmberDihedral with a single Sire.MM.AmberDihPart term with zero values for the force constant, periodicity, and phase.)

The energies now agree for all systems that I've tested with pmemd. I'll just check our unit test suite to see whether this has any knock-on effects, then push the update.

lohedges commented 3 years ago

This was a single character fix, i.e. 0 --> 1. Possibly the most tortuous debugging I have ever done, followed by the easiest fix.
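(For topologies already written with the old code, the same one-character repair can be applied to the file directly — a rough sketch that is only safe when, as here, the zero-periodicity terms have a zero force constant:)

def fix_zero_periodicities(infile, outfile):
    """Rewrite DIHEDRAL_PERIODICITY entries of 0.0 as 1.0, keeping 5E16.8 format."""
    out, in_section = [], False
    with open(infile) as f:
        for line in f:
            if line.startswith('%FLAG'):
                in_section = line.split()[1] == 'DIHEDRAL_PERIODICITY'
                out.append(line)
            elif in_section and not line.startswith('%'):
                vals = [1.0 if float(x) == 0.0 else float(x) for x in line.split()]
                out.append(''.join('%16.8E' % v for v in vals) + '\n')
            else:
                out.append(line)
    with open(outfile, 'w') as f:
        f.writelines(out)

fix_zero_periodicities('vac_bss.prm7', 'vac_bss_fixed.prm7')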

jmichel80 commented 3 years ago

A war story for your memoir !

jmichel80 commented 3 years ago

@AdeleLip could you repeat your test MD sims on PTP1B and BACE with the new code once available?

AdeleHardie commented 3 years ago

Yes, I'll do that now and update