michellab / BioSimSpace

Code and resources for the EPSRC BioSimSpace project.
https://biosimspace.org
GNU General Public License v3.0
77 stars 19 forks source link

Unexpected improper term #260

Closed kexul closed 2 years ago

kexul commented 2 years ago

Dear BioSimSpace developers: I'm turning a hydrogen atom into a methyl group as the following picture shown (the red box), but there is an improper term about the green-labeled atoms in the generated somd.pert file, is this an expected behavior?

企业微信截图_16396554213474

Part of the somd.pert file:

    improper
        atom0          N7
        atom1          C21
        atom2          N6
        atom3          H18
        initial_form   10.5000 2.0 3.141594
        final_form     1.1000 2.0 3.141594
    endimproper

Code & input file:

import BioSimSpace as BSS

mol1 = BSS.IO.readMolecules('TDA-INTER*').getMolecule(0)
mol2 = BSS.IO.readMolecules('TDA-028*').getMolecule(0)

merged = BSS.Align.merge(mol1, mol2)
solvated_system = BSS.Solvent.tip3p(merged, box=3*[5*BSS.Units.Length.nanometer], work_dir='solvation')

BSS.FreeEnergy.Relative(solvated_system, work_dir='free', engine='SOMD', setup_only=True)

test1.zip

lohedges commented 2 years ago

Yes, I believe so. Looking at the functions for the impropers, you can see that they are all the same except for one:

import BioSimSpace as BSS

m0 = BSS.IO.readMolecules("TDA-INTER_AC.*")[0]
m1 = BSS.IO.readMolecules("TDA-028_AC.*")[0]

p0 = m0._sire_object.property("improper").potentials()
p1 = m1._sire_object.property("improper").potentials()

f0 = [p.function() for p in p0]
f1 = [p.function() for p in p1]

f0
[1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 10.5 cos(2 phi - 3.14159) + 10.5]

f1
[1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1,
 1.1 cos(2 phi - 3.14159) + 1.1]

Note that the mapping of the entries in the two lists won't be correct, due to a potential difference in the order of the atoms in the two molecules. However, we can see that there are the same number of impropers in both of the molecules, and only one improper differs in its parameters, hence the single entry in the pert file.

kexul commented 2 years ago

Thanks, new lesson learned! I used to believe that only terms including perturbated atoms would appear in the .pert file

kexul commented 2 years ago

Hi @lohedges , there is a strange improper term when I use the following code,

    dihedral
        atom0          C5
        atom1          C6
        atom2          C7
        atom3          C8
        initial_form   0.1800 3.0 -0.000000 0.2000 1.0 3.141594 0.2500 2.0 3.141594
        final_form     0.1600 3.0 -0.000000
    enddihedral
mol1 = BSS.Parameters.gaff('C1C2CC3CC1CC(C2)C3').getMolecule()
mol2 = BSS.Parameters.gaff('C1CC2CCCC(C1)C2').getMolecule()
mapping = BSS.Align.matchAtoms(mol1, mol2)
merged = BSS.Align.merge(mol1, mol2, mapping=mapping, force=True)
solvated = BSS.Solvent.tip3p(merged, box=[5*BSS.Units.Length.nanometer]*3, is_neutral=True, work_dir='solvation')
protocol = BSS.Protocol.FreeEnergy(num_lam=16)
fep_solv = BSS.FreeEnergy.Relative(solvated, protocol, engine='somd', work_dir='solv', setup_only=True)
fep_vac = BSS.FreeEnergy.Relative(merged.toSystem(), protocol, engine='somd', work_dir='vac', setup_only=True)

I've tried running simulation with and without the unusual initial_form(just delete 0.2000 1.0 3.141594 0.2500 2.0 3.141594 manually), the energy output is the same.

Before delete:

Running minimisation.
Energy before the minimisation: 2170.13 kcal mol-1
Tolerance for minimisation: 1
Maximum number of minimisation iterations: 1000

==============================================================
Energy after the minimization: 60.5065 kcal mol-1
Energy minimization done.

After delete:

Running minimisation.
Energy before the minimisation: 2169.62 kcal mol-1
Tolerance for minimisation: 1
Maximum number of minimisation iterations: 1000

==============================================================
Sending anonymous Sire usage statistics to http://siremol.org.
For more information, see http://siremol.org/analytics
To disable, set the environment variable 'SIRE_DONT_PHONEHOME' to 1
To see the information sent, set the environment variable
SIRE_VERBOSE_PHONEHOME equal to 1. To silence this message, set
the environment variable SIRE_SILENT_PHONEHOME to 1.
==============================================================

Energy after the minimization: 60.0653 kcal mol-1
Energy minimization done.
jmichel80 commented 2 years ago

Track the GAFF atom types assigned to C5 C6 C7 C8 , they may well map to an improper with multiple periodicity terms. The single point energy tests many not be helpful, both forms could give a 0 potential energy contribution

lohedges commented 2 years ago

Yes, this is indeed the case:

import BioSimSpace as BSS

mol1 = BSS.Parameters.gaff('C1C2CC3CC1CC(C2)C3').getMolecule()
mol2 = BSS.Parameters.gaff('C1CC2CCCC(C1)C2').getMolecule()
mapping = BSS.Align.matchAtoms(mol1, mol2)

# Search for the atoms involved in the dihedral in mol1

idx1 = [mol1.search("atomname C5")[0].index(),
        mol1.search("atomname C6")[0].index(),
        mol1.search("atomname C7")[0].index(),
        mol1.search("atomname C6")[0].index()]

# Get the indices of the mapped atoms from mol2.
idx2 = [mapping[x] for x in idx1]

# Print the potential for the dihedral at lambda = 0
for p in mol1._sire_object.property("dihedral").potentials():
    if mol1._sire_object.atom(p.atom0()).index().value() in idx1 and \
       mol1._sire_object.atom(p.atom1()).index().value() in idx1 and \
       mol1._sire_object.atom(p.atom2()).index().value() in idx1 and \
       mol1._sire_object.atom(p.atom3()).index().value() in idx1:
       print(p.function())

0.2 cos(phi - 3.14159) + 0.25 cos(2 phi - 3.14159) + 0.18 cos(3 phi) + 0.63

# Print the potential for the dihedral at lambda = 1
for p in mol2._sire_object.property("dihedral").potentials():
    if mol2._sire_object.atom(p.atom0()).index().value() in idx2 and \
       mol2._sire_object.atom(p.atom1()).index().value() in idx2 and \
       mol2._sire_object.atom(p.atom2()).index().value() in idx2 and \
       mol2._sire_object.atom(p.atom3()).index().value() in idx2:
       print(p.function())

0.16 cos(3 phi) + 0.16
kexul commented 2 years ago

Ah, seems like a feature, not a bug, that's great to know!

ps: Wish all of you have a great holiday!

kexul commented 2 years ago

Got another error when saving the perturbable molecule created by the above code:

BSS.IO.savePerturbableSystem('merged', merged)
Problem creating the AmberParams object for molecule Merged_Molecule : 5. Errors include:
Have a 1-4 scaling factor (0.833333/0.5) between atoms H7:16 and H8:26 despite there being no physical dihedral between these two atoms. All 1-4 scaling factors MUST be associated with physical dihedrals. The shortest path is [  ]
Have a 1-4 scaling factor (0.833333/0.5) between atoms H8:17 and H8:26 despite there being no physical dihedral between these two atoms. All 1-4 scaling factors MUST be associated with physical dihedrals. The shortest path is [  ]
Have a 1-4 scaling factor (0.833333/0.5) between atoms H8:26 and H7:16 despite there being no physical dihedral between these two atoms. All 1-4 scaling factors MUST be associated with physical dihedrals. The shortest path is [  ]
Have a 1-4 scaling factor (0.833333/0.5) between atoms H8:26 and H8:17 despite there being no physical dihedral between these two atoms. All 1-4 scaling factors MUST be associated with physical dihedrals. The shortest path is [  ]
Thrown from FILE: /home/sireuser/Sire/corelib/src/libs/SireIO/moleculeparser.cpp, LINE: 1090, FUNCTION: QStringList pvt_write(const SireSystem::System&, const QStringList&, const QStringList&, const SireBase::PropertyMap&)
kexul commented 2 years ago

Got another error when saving the perturbable molecule created by the above code:

It's caused by some matches in the mapping, after manually deleting it, the saving could proceed.

mol1 = BSS.Parameters.gaff('C1C2CC3CC1CC(C2)C3').getMolecule()
mol2 = BSS.Parameters.gaff('C1CC2CCCC(C1)C2').getMolecule()
prematch = {5:0, 4:1, 0:7, 3:2, 2:8, 1:6, 9:3, 7:4, 8:5}
mapping = BSS.Align.matchAtoms(mol1, mol2, prematch=prematch)
if 6 in mapping:
    del mapping[6]
mol1 = BSS.Align.rmsdAlign(mol1, mol2, mapping)
merged = BSS.Align.merge(mol1, mol2, mapping=mapping, force=True)

BSS.IO.savePerturbableSystem('merged', merged)
lohedges commented 2 years ago

The error is being triggered here. The issue is that the connectivity property that is associated with the original AMBER format molecules prior to the merge is inconsistent with the bonding that after the merge. This is what the error is telling you when you try to perform the merge without the force=True option. If I change the Sire code so that it's forced to re-generate the connectivity, i.e to use the bonding in the merged molecule, then the merge completes without error.

//if (not has_connectivity)
{
    //have to use the connectivity that is implied by the bonds
    QMutexLocker lkr(&mutex);
    //if (not has_connectivity)
    {
        conn = this->connectivity();
        has_connectivity = true;
    }
}

//find the shortest bonded path between these two atoms
const auto path = conn.findPath(atm0,atm3,4);

if (path.count() != 4)
{
    // Could re-generate connectivity here and re-test the path count.

    QMutexLocker lkr(&mutex);
    errors.append( QObject::tr("Have a 1-4 scaling factor (%1/%2) "
        "between atoms %3:%4 and %5:%6 despite there being no physical "
        "dihedral between these two atoms. All 1-4 scaling factors MUST "
        "be associated with "
        "physical dihedrals. The shortest path is %7")
        .arg(s.coulomb()).arg(s.lj())
        .arg(molinfo.name(atm0).value()).arg(atm0.value())
        .arg(molinfo.name(atm3).value()).arg(atm3.value())
        .arg(Sire::toString(path)) );
    continue;
}

Whether the result is valid is another question. I'd need to perform more checks to ensure that this approach is okay. If it is, then we could add an additional conditional block inside the path length check that re-generates the connectivity in the event that the first check fails. (We'd probably also need to update the connectivity elsewhere, too.)

Thanks for all your help and testing this year. I'm off for the holidays now and will be unlikely to check things regularly until the New Year, Happy holidays to you too!

lohedges commented 2 years ago

Actually, the same error is still triggered, I was just running an incorrect version of my test script :blush: I think this means that the bonding (and connectivity) of the merged molecule doesn't make sense somehow. Will take a closer look when I'm back. Cheers!