michellab / BioSimSpace_examples

GNU General Public License v2.0
0 stars 0 forks source link

MCS mapping behavior #10

Open jmichel80 opened 6 years ago

jmichel80 commented 6 years ago

The script

BioSimSpace_examples/scripts/test_mapping.py

generates a mapping for methane <--> methanol.

For efficient implementation in most MD packages the mapping code should be modified such that one H in methane maps to the oxygen atom of methanol.

jmichel80 commented 6 years ago

Disabling the heavy atoms pre-match raises an exception on this example

Traceback (most recent call last): File "../test_mapping.py", line 12, in mapping = BSS.Align.matchAtoms(m0, m1, verbose=True, match_light=False) File "/home/julien/sire.app/lib/python3.5/site-packages/BioSimSpace-2018.1.1+360.g41af51d.dirty-py3.5.egg/BioSimSpace/Align/init.py", line 188, in matchAtoms File "/home/julien/sire.app/lib/python3.5/site-packages/BioSimSpace-2018.1.1+360.g41af51d.dirty-py3.5.egg/BioSimSpace/Align/init.py", line 377, in _score_rmsd UserWarning: Exception 'SireError::program_bug' thrown by the thread 'master'. Attempt to find the alignment matrix failed to produce a valid rotation matrix. Determinant should be 1. It is equal to 0. Thrown from FILE: /home/julien/software/devel/Sire/corelib/src/libs/SireMaths/align.cpp, LINE: 508, FUNCTION: SireMaths::Matrix SireMaths::kabasch(const QVector&, const QVector&)

jmichel80 commented 6 years ago

Replacing in findMCS.cpp all instances of

        put(user_match_0, nvert0, -2);

by put(user_match_0, nvert0, -1);

does not seem to change the output of

mapping = BSS.Align.matchAtoms(m0, m1, verbose=True)

jmichel80 commented 6 years ago

Correction) findMCS.cpp was incorrectly compiled. The above changes give the desired behaviour of matching one CH4 H to the CH3OH O

lohedges commented 6 years ago

This is now fixed in the feature-mapping branch of Sire, and in the latest commit in feature-align of BioSimSpace.

jmichel80 commented 6 years ago

The output of the merging could be improved by using the atom names of molecule0 in merged0 and of molecule1 in merged1.

This shows below the visualisation of aligned.pdb with vmd for the perturbation methane --> toluene. The carbon atom bonded to the methyl group is labelled.

image

This shows the visualisation of merged0.pdb coloured by the Element type. This is ok to visualise, and it is clear what the dummy atoms are.

image

This shows the visualisation of merged1.pdb

image

The atom labelled H has the correct element. The original label in mol1 was C but this has not been preserved in the mapping.

Ideally the desired behaviour (keeping atom names of non dummy atoms of mol0 in merged0, or atom names of non dummy atoms of mol1 in merged1) could be obtained with syntax like this:

map0 = {"name" : "name0", "coordinates" : "coordinates0", "charge" : "charge0", "element" : "element0" }

map1 = {"name" : "name1", "coordinates" : "coordinates1", "charge" : "charge1", "element" : "element1" }

merged = BSS.Align.merge(m0, m1, mapping=mapping) BSS.IO.saveMolecules('merged0.pdb', merged,'PDB', map=map0) BSS.IO.saveMolecules('merged1.pdb', merged, 'PDB', map=map1)

Unless I'm mistaken name is not currently used in the dictionary passed to BSS.IO.saveMolecule()

lohedges commented 6 years ago

Hi Julien,

You can use map whatever properties you want. To see what's available, type:

merged._sire_molecule.propertyKeys()

The problem is that the atom names are not properties, rather Sire.Mol.AtomName objects, so they cannot be changed using the map. (Each atom in the molecule as a single AtomName, but multiple properties.) To get the name of an atom in the molecule you can do:

from Sire.Mol import AtomIdx

print(merged._sire_molecule.atom(AtomIdx(0)).name())

Because of this, the names aren't correctly preserved by the merge and are inconsistent at lamda = 0 and lambda = 1. This shouldn't be a problem for Gromacs / SOMD since the atoms are named according to the ambertype property, i.e. ambertype0 and ambertype1, but it makes it a pain for visualising a PDB with VMD since the PDB parser just uses the AtomName.

I'll think about this a littler harder and see if I can come up with a solution. (And check that things are correct for the Gromacs case.)

lohedges commented 6 years ago

Just a quick update. Gromacs and SOMD are okay since we only care about the names of the atoms at lambda = 0. The "atomtype" and "ambertype" properties are what's used by the internal programs for the purposes of performing the perturbation. (The names could be anything).

To make any other SireIO parser able to write a merged molecule with consistent atom names at lambda = 1 I would need to make it aware of the additional merged molecule properties, i.e. extract the "atomname0" or "atomname1" property if the molecule is perturbable, rather using the default AtomName. The user could pass an optional { "lambda" : "1" } using the PropertyMap if they wanted to write the end state. This would be easy enough to do as a one-off for visualisation purposes, e.g., using the PDB2 parser, but might be a pain to implement for everything else.

lohedges commented 6 years ago

As a quick fix I've added two additional properties to the merged molecule that records the AtomName at lambda = 0 and lambda = 1. The PDB parser is aware of these properties, so you can add them to your map in order to recover the correct names at lambda = 1. (By default, the names at lambda = 0 are used.)

The following should now work, although you'll need to rebuild the latest feature-mapping branch of Sire for the changes to take effect. (You'll also need to pull the latest changes from feature-align of BioSimSpace.)

map0 = {"name" : "name0",
"coordinates" : "coordinates0",
"charge" : "charge0",
"element" : "element0" }

map1 = {"name" : "name1",
 "coordinates" : "coordinates1",
 "charge" : "charge1",
 "element" : "element1" }

merged = BSS.Align.merge(m0, m1, mapping=mapping)
BSS.IO.saveMolecules('merged0.pdb', merged,'PDB', map=map0)
BSS.IO.saveMolecules('merged1.pdb', merged, 'PDB', map=map1)

When you get a chance, could you give this a try and let me know if you run into any issues.

Cheers.