MolSSI / mmic

Molecular Mechanics Interoperable Components
https://mm-portal.netlify.app/mmic
BSD 3-Clause "New" or "Revised" License

A bug for reading data. (Tracked to compute() func in base_component.py) #4

Closed RlyehAD closed 2 years ago

RlyehAD commented 3 years ago

Hi everyone, while updating the mmic_optim_gmx component I ran into a bug involving the hash value of the Molecule object. It is triggered by from_file(), a built-in method of Molecule. After energy minimization with Gromacs (run through mmic_cmd), the resulting .gro file should be read back into the mmic schema. However, if mmic_parmed is used as the translator for this step, the following error is raised:

Traceback (most recent call last):
  File "/home/******/Project/MMIC/lab/01_Issue/to_file_error_01.py", line 17, in <module>
    error_repeat_compute.compute(inp)
  File "/home/******/miniconda3/envs/mmic/lib/python3.9/site-packages/mmic/components/base/base_component.py", line 74, in compute
    cls.output(**exec_output.dict())
  File "pydantic/main.py", line 406, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for error_output
molecule
  Model data inconsistent with stored hash code! (type=assertion_error)

For now, my fork of mmic_optim_gmx passes all the tests because I set the translator to mmic_mda for this step. I opened the issue here rather than in mmic_parmed because it appears to be tied to a user-defined execute() function.

Writing a .gro file from the mmic schema, running some Gromacs commands, and reading the resulting .gro file back into the mmic schema does not produce the error on its own. The script simple.py (which uses GromacsWrapper) demonstrates this. However, if the .gro file is converted inside a user-defined execute() function, the error appears. This can be reproduced by running to_file_error_01.py (please make sure to_file_error_02.py is in the same directory).

To show that the issue has nothing to do with Gromacs, the Gromacs-related code in to_file_error_02.py has been commented out, so the script only converts a .gro file to the mmic schema with Molecule.from_file(). The error still occurs.
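The traceback above ends at `cls.output(**exec_output.dict())` inside `compute()`, which suggests the check fails when the output model is rebuilt from its own dictionary, a step that a direct `from_file()` call never goes through. The sketch below is only a toy illustration of that pattern and not the mmic or mmelemental implementation: `digest`, `ToyOutput`, `ToyComponent`, and the lossy rounding in `dict()` are hypothetical names and assumptions chosen to make the effect visible.

```python
import hashlib
import json


def digest(fields: dict) -> str:
    """Hash the model fields (hypothetical scheme, chosen for illustration)."""
    return hashlib.sha1(json.dumps(fields, sort_keys=True).encode()).hexdigest()


class ToyOutput:
    """Toy stand-in for an output model that carries a stored hash."""

    def __init__(self, values, stored_hash=None):
        self.values = list(values)
        fresh = digest({"values": self.values})
        # Re-validation: a supplied hash must match the data it arrives with.
        if stored_hash is not None and stored_hash != fresh:
            raise AssertionError("Model data inconsistent with stored hash code!")
        self.stored_hash = fresh

    def dict(self):
        # Assumption: the export is lossy (values are rounded on the way out),
        # while the stored hash still describes the original values.
        return {
            "values": [round(v, 3) for v in self.values],
            "stored_hash": self.stored_hash,
        }


class ToyComponent:
    output = ToyOutput

    @classmethod
    def execute(cls, values):
        # Building the output once succeeds: no stored hash to compare against.
        return cls.output(values)

    @classmethod
    def compute(cls, values):
        exec_output = cls.execute(values)
        # Same shape as the call in the traceback: rebuild from the dict export.
        return cls.output(**exec_output.dict())


direct = ToyComponent.execute([0.12345, 1.0])  # direct construction works

try:
    ToyComponent.compute([0.12345, 1.0])
    compute_failed = False
except AssertionError as err:
    compute_failed = True
    message = str(err)
```

In this toy, the object is valid when first built and only fails once it is round-tripped through its dictionary form. Whether mmic_parmed produces a Molecule whose exported data differs from the data that was hashed is an open question here, not something the sketch establishes.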

Attachment: tofile_error_scripts.zip

To run the scripts, please make sure the relevant mmic components are installed, then run 'python to_file_error_01.py' in a terminal to reproduce the error. This does not require GromacsWrapper.

The code in simple.py, to_file_error_01.py, and to_file_error_02.py is posted here in case it is inconvenient to download the zip file.

simple.py

import tempfile

import gromacs
import mm_data
from mmelemental.models import Molecule

# Load a reference molecule and write it to a temporary .gro file with mmic_parmed.
mol = Molecule.from_file(mm_data.mols["water-mol.json"])
gro_file = tempfile.NamedTemporaryFile(suffix=".gro").name
box_gro_file = tempfile.NamedTemporaryFile(suffix=".gro").name
mol.to_file(gro_file, translator="mmic_parmed")

# Add a simulation box with Gromacs (through GromacsWrapper).
gromacs.editconf(f=gro_file, d=2, o=box_gro_file)

# Reading the Gromacs output back into the mmic schema works without error here.
mol = Molecule.from_file(box_gro_file, translator="mmic_parmed")
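A side note on the temporary paths used in these scripts: `tempfile.NamedTemporaryFile(suffix=".gro").name` only borrows a file name, and in CPython the underlying file is typically deleted as soon as the unreferenced object is collected. A minimal sketch of a more explicit alternative, assuming a scratch directory is acceptable, is shown below. The file names `input.gro` and `boxed.gro` and the placeholder write are illustrative stand-ins for `mol.to_file(...)` and the Gromacs step.

```python
import os
import tempfile

# Create a scratch directory that is removed when the block exits.
with tempfile.TemporaryDirectory() as scratch:
    gro_file = os.path.join(scratch, "input.gro")      # hypothetical file name
    box_gro_file = os.path.join(scratch, "boxed.gro")  # hypothetical file name

    # Stand-in for mol.to_file(gro_file, translator="mmic_parmed").
    with open(gro_file, "w") as handle:
        handle.write("placeholder\n")

    existed_inside = os.path.exists(gro_file)

# After the block, the directory and everything in it are gone.
cleaned_up = not os.path.exists(gro_file)
```

This does not change the behaviour being reported; it only makes the lifetime of the intermediate files explicit.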

to_file_error_01.py

import tempfile

import mm_data
from mmelemental.models import Molecule

import to_file_error_02
from to_file_error_02 import error_repeat_compute

# Write a reference molecule to a temporary .gro file with mmic_parmed.
mol = Molecule.from_file(mm_data.mols["water-mol.json"])
gro_file = tempfile.NamedTemporaryFile(suffix=".gro").name
mol.to_file(gro_file, translator="mmic_parmed")

# Pass the .gro path to the component; compute() raises the ValidationError.
inp = to_file_error_02.error_input(
    molecule=gro_file,
    schema_name="error",
    schema_version=1.0,
)

error_repeat_compute.compute(inp)

to_file_error_02.py

from typing import List, Optional, Tuple

from cmselemental.models.procedures import InputProc, OutputProc
from cmselemental.util.decorators import classproperty
from mmelemental.models import Molecule
from mmic.components.blueprints import GenericComponent
from pydantic import Field

# Note: tempfile and gromacs are only needed by the commented-out Gromacs step
# in execute(), so they are not imported here.

__all__ = ["error_repeat_compute"]

class error_input(InputProc):

    molecule: str = Field(None, description="molecule used for repeating the error")

class error_output(OutputProc):

    molecule: Molecule = Field(
        None, description="molecule used for repeating the error"
    )

class error_repeat_compute(GenericComponent):
    @classproperty
    def input(cls):
        return error_input

    @classproperty
    def output(cls):
        return error_output

    @classproperty
    def version(cls) -> str:
        """Finds program, extracts version, returns normalized version string.
        Returns
        -------
        str
            Return a valid, safe python version string.
        """
        return ""

    def execute(
        self,
        inputs: error_input,
        extra_outfiles: Optional[List[str]] = None,
        extra_commands: Optional[List[str]] = None,
        scratch_name: Optional[str] = None,
        timeout: Optional[int] = None,
    ) -> Tuple[bool, error_output]:

        if isinstance(inputs, dict):
            inputs = self.input(**inputs)

        mol_file = inputs.molecule
        # Gromacs step disabled to show that the error does not depend on Gromacs:
        # boxed_gro_file = tempfile.NamedTemporaryFile(suffix=".gro").name
        # gromacs.editconf(f=mol_file, d=2, o=boxed_gro_file)
        # mol = Molecule.from_file(boxed_gro_file, translator="mmic_parmed")
        mol = Molecule.from_file(mol_file, translator="mmic_parmed")

        return (
            True,
            error_output(
                molecule=mol, success=True, schema_name="error", schema_version=1.0
            ),
        )

anabiman commented 2 years ago

Thanks for the report. The bug is in mmic_parmed, so I have moved the issue here.