choderalab / openmm

OpenMM is a toolkit for molecular simulation using high performance GPU code.
1 stars 0 forks source link

ff14ipq - energy validation fails due to a rogue dihedral #4

Closed rafwiewiora closed 8 years ago

rafwiewiora commented 8 years ago

@jchodera @swails

ff14ipq has been failing my energy validation on the Propers, by around 1.3 kJ/mol, which is about 5e-3 relative difference in this case (and I'm working with 1e-5 tolerance). This was only with villin, not ala_ala_ala.

I've tracked it down to just one problematic dihedral in proline. This is how I did it just to make sure this workflow didn't introduce any further bias:

parm = parmed.load_file('villin.top', 'villin.crd')
parm = parmed.amber.AmberParameterSet.from_structure(parm)
parm = parmed.omm.OpenMMParameterSet.from_parameterset(parm)
parm.write('test.xml')

And I caught this:

 <Proper type1="TN" type2="CX" type3="C" type4="N" periodicity1="4" phase1="0.0" k1="-1.10357184" periodicity2="3" phase2="3.14159265359" k2="-1.97618688" periodicity3="2" phase3="3.14159265359" k3="6.13868112" periodicity4="1" phase4="3.14159265359" k4="0.2600356"/>
<Proper type1="TN" type2="CX" type3="C" type4="N" periodicity1="1" phase1="3.141594" k1="0.2600356"/>

In the parm14ipq.dat (start line 2048):

N -CX-C -TN   1     0.14302    0.0 -4.0  Branched from N -CX-C -N , Fit by mdgx , Fit by mdgx , Fit by mdgx
TN-CX-C -N    1    -0.26376    0.0 -4.0  Branched from N -CX-C -N , Fit by mdgx 
TN-CX-C -TN   1     0.04031    0.0 -4.0  Branched from N -CX-C -N 
N -CX-C -TN   1     0.12378  180.0 -3.0  Branched from N -CX-C -N , Fit by mdgx , Fit by mdgx , Fit by mdgx
TN-CX-C -N    1    -0.47232  180.0 -3.0  Branched from N -CX-C -N , Fit by mdgx 
TN-CX-C -TN   1     0.06853  180.0 -3.0  Branched from N -CX-C -N 
N -CX-C -TN   1     0.98860  180.0 -2.0  Branched from N -CX-C -N , Fit by mdgx , Fit by mdgx , Fit by mdgx
TN-CX-C -N    1     1.46718  180.0 -2.0  Branched from N -CX-C -N , Fit by mdgx 
TN-CX-C -TN   1     0.19829  180.0 -2.0  Branched from N -CX-C -N 
N -CX-C -TN   1    -0.17826  180.0  1.0  Branched from N -CX-C -N , Fit by mdgx , Fit by mdgx , Fit by mdgx
TN-CX-C -N    1     0.06215  180.0  1.0  Branched from N -CX-C -N , Fit by mdgx , Fit by mdgx , Fit by mdgx
TN-CX-C -TN   1     1.46258  180.0  1.0  Branched from N -CX-C -N 

* So we've got - dihedral written in both directions with different params (N - CX - C - TN, TN - C - CX - N) - what is the meaning of this?

TN-CX-C -N    1     0.06215  180.0  1.0  Branched from N -CX-C -N , Fit by mdgx , Fit by mdgx , Fit by mdgx
swails commented 8 years ago

I'm not sure why ff14IPQ doesn't put the terms in order like every other parameter file does. I think tleap will read these correctly, but I'm not sure. Does the tleap-generated prmtop have 4 terms for dihedral types N-CX-C-TN? Or just 1? I know the guy that makes these files, so I can complain to him if leap is bungling this up. I'm pretty sure I know what ParmEd does (it loads in each of them as 4-term torsions, even though the type definitions are not consecutive like every other file). Maybe tleap doesn't do this...

So we've got - dihedral written in both directions with different params (N - CX - C - TN, TN - C - CX - N) - what is the meaning of this?

I don't see that in the text you posted... I see N-CX-C-TN and TN-CX-C-N (note these permute types 1 and 4 but not 2 and 3). Or is the other term printed somewhere else that you didn't report?

rafwiewiora commented 8 years ago

I don't see that in the text you posted... I see N-CX-C-TN and TN-CX-C-N (note these permute types 1 and 4 but not 2 and 3). Or is the other term printed somewhere else that you didn't report?

Oh yes, thanks! Too much staring at this here for me clearly. Ok so this is a smaller problem, good.

I'm not sure why ff14IPQ doesn't put the terms in order like every other parameter file does. I think tleap will read these correctly, but I'm not sure. Does the tleap-generated prmtop have 4 terms for dihedral types N-CX-C-TN? Or just 1? I know the guy that makes these files, so I can complain to him if leap is bungling this up. I'm pretty sure I know what ParmEd does (it loads in each of them as 4-term torsions, even though the type definitions are not consecutive like every other file). Maybe tleap doesn't do this...

Can you help me out look through the prmtop please? https://gist.github.com/rafwiewiora/388c7571f3fb12c28bee I don't quite understand the structure of it, particularly where parameters for which angle sit given this with H - without H separation.

But if I just ctrl+f it: 6.21500000E-02 (per 1) is there in force constants, but 1.46718 (per 2) is not. So looks like the other periodicities are skipped / overwritten by tleap.

know the guy that makes these files, so I can complain to him if leap is bungling this up. I'm pretty sure I know what ParmEd does (it loads in each of them as 4-term torsions, even though the type definitions are not consecutive like every other file).

Yeah, ParmEd waits for the positive periodicity number. So are you saying if tleap is not getting all 4 periodicities it is doing a wrong thing here and hence I should just ignore the energy fail here?

rafwiewiora commented 8 years ago

But if I just ctrl+f it: 6.21500000E-02 (per 1) is there in force constants, but 1.46718 (per 2) is not. So looks like the other periodicities are skipped / overwritten by tleap.

Same for the other constants (per 3, 4) - not there.

swails commented 8 years ago

Can you help me out look through the prmtop please?

Easiest thing to do that I can think of (and this is what I would do), is use ParmEd to dump an frcmod file and then compare the parameters in there to the parameters in the parm.dat files.

$ parmed -p <prmtop>
writeFrcmod test.frcmod

It should be immediately clear whether or not that prmtop has all 3 terms. But I suspect they may not be there based on what you've seen.

rafwiewiora commented 8 years ago

Cool!

Ok, I hope my eyes are not tricking me again, but this is weird, in frcmod:

TN-CX-C -N     1     0.06215000  180.000   1.0    SCEE=1.2 SCNB=2.0

Only this one for this.

BUT

N -CX-C -TN    1    -0.17826000  180.000  -1.0    SCEE=1.2 SCNB=2.0
N -CX-C -TN    1     0.98860000  180.000  -2.0    SCEE=1.2 SCNB=2.0
N -CX-C -TN    1     0.12378000  180.000  -3.0    SCEE=1.2 SCNB=2.0
N -CX-C -TN    1     0.14302000    0.000   4.0    SCEE=1.2 SCNB=2.0

All 4 here. But if you look at the fragment of the dat in my first post - both of these angles are mixed up together. Why are they behaving differently in tleap?

swails commented 8 years ago

It's been a long time since I've looked at this part of the tleap code (and I don't recall fully understanding it then, either), but I don't think that tleap really handles this situation too well. It seems that of the intermixed multi-term definitions, only the first one is getting all of its terms filled in. The others only get the last term assigned.

Clearly this is an assumption in tleap and there is no warning when the assumption is violated. I'll let Dave know.

swails commented 8 years ago

You can go ahead and punt on this. It's a bug in parm14ipq.dat.

rafwiewiora commented 8 years ago

Great, thanks! For now, I'm just going to increase the tolerance for 14ipq propers to 1e-2.

rafwiewiora commented 8 years ago

You can go ahead and punt on this. It's a bug in parm14ipq.dat.

Whoops didn't see this, ok - I'm not converting 14ipq for now until we get a clear answer on this. @jchodera this ok with you?

jchodera commented 8 years ago

Yeah, we will skip this for now.