ISISNeutronMuon / MDANSE

MDANSE: Molecular Dynamics Analysis for Neutron Scattering Experiments
https://www.isis.stfc.ac.uk/Pages/MDANSEproject.aspx
GNU General Public License v3.0
19 stars 4 forks source link

Automatic molecule labels from MoleculeFinder #392

Closed MBartkowiakSTFC closed 3 months ago

MBartkowiakSTFC commented 3 months ago

Description of work Some analysis types (AreaPerMolecule, AngularCorrelation) require the molecules in the system to have labels in order to be selected. However, not every MD engine assigns labels to molecules.

Fixes

  1. MoleculeFinder now assigns InChI strings to molecules it detects.
  2. AngularCorrelation now automatically sets axis to be from the first atom of the molecule to the molecule centre of mass.
  3. AreaPerMolecule now selects unit cell vectors defining the wall surface area as one of the three possibilities: 'ab', 'bc' or 'ac'.
  4. New widgets have been added (MultipleCombosWidget, VectorWidget) to allow running OrderParameter in the future.

To test

  1. Run AngularCorrelation and AreaPerMolecule on a DL_POLY trajectory with labeled molecules.
  2. Create labels in a trajectory from some other engine (e.g. CP2K) by running MoleculeFinder on it.
  3. Run AngularCorrelation and AreaPerMolecule on the new trajectory.
ChiCheng45 commented 3 months ago

Looks good but a few points.

The angular correlation results can depend on whether it was run with a trajectory which had been modified with moleculefinder or not. This looks like it is because in some cases the coordinates are not contiguous so the results will be different. Maybe the atoms positions should be are changed to be contiguous during the calculation?

Would it be a good idea to set the axis in the angular correlation to something physical since at the moment the results will depend the atom ordering. Maybe we can change it to the molecules axis of rotation and we'd then have angular correlation results for the three molecular axes.

Molecular finder can crash when it is used with some bulk system like with the p1 cp2k system from MDANSE-Examples. I got the following error.

Traceback (most recent call last):
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE\Src\MDANSE\Framework\Jobs\IJob.py", line 313, in run
    self.initialize()
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE\Src\MDANSE\Framework\Jobs\MoleculeFinder.py", line 82, in initialize
    inchistring = moltester.identify_molecule()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE\Src\MDANSE\Chemistry\Structrures.py", line 55, in identify_molecule
    rdDetermineBonds.DetermineBonds(mol_object, charge=0)
Boost.Python.ArgumentError: Python argument types in
    rdkit.Chem.rdDetermineBonds.DetermineBonds(NoneType)
did not match C++ signature:
    DetermineBonds(class RDKit::ROMol {lvalue} mol, bool useHueckel=False, int charge=0, double covFactor=1.3, bool allowChargedFragments=True, bool embedChiral=True, bool useAtomMap=False, bool useVdw=False)  

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\xcb63893\AppData\Local\anaconda3\envs\MDANSE5\Lib\multiprocessing\process.py", line 314, in _bootstrap
    self.run()
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE_GUI\Src\MDANSE_GUI\Subprocess\Subprocess.py", line 46, in run
    self._job_instance.run(self._job_parameters)
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE\Src\MDANSE\Framework\Jobs\IJob.py", line 335, in run
    raise JobError(self, tb)
MDANSE.Framework.Jobs.IJob.JobError: Traceback (most recent call last):
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE\Src\MDANSE\Framework\Jobs\IJob.py", line 313, in run
    self.initialize()
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE\Src\MDANSE\Framework\Jobs\MoleculeFinder.py", line 82, in initialize
    inchistring = moltester.identify_molecule()
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\xcb63893\PycharmProjects\MDANSE\MDANSE\Src\MDANSE\Chemistry\Structrures.py", line 55, in identify_molecule
    rdDetermineBonds.DetermineBonds(mol_object, charge=0)
Boost.Python.ArgumentError: Python argument types in
    rdkit.Chem.rdDetermineBonds.DetermineBonds(NoneType)
did not match C++ signature:
    DetermineBonds(class RDKit::ROMol {lvalue} mol, bool useHueckel=False, int charge=0, double covFactor=1.3, bool allowChargedFragments=True, bool embedChiral=True, bool useAtomMap=False, bool useVdw=False)  
MBartkowiakSTFC commented 3 months ago

I managed to address two of the points you raised:

  1. Extra checks have been added to stop the MoleculeFinder from crashing on more tricky trajectories. CP2K P1 trajectory works now.
  2. AngularCorrelation now always uses contiguous coordinates. The current implementation is slow, but the results should now be independent of the way the trajectory was converted (i.e. with and without coordinate folding).

Also, I improved the docstring of MoleculeFinder.

ChiCheng45 commented 3 months ago

Looks good, works on my end. I've open a proposal for some enhancements to molecule related jobs in #399 which include the axis proposal from my comment above.