openmm / openmm-ml

High level API for using machine learning models in OpenMM simulations

Issue with StateDataReporter #11

Closed JMorado closed 3 years ago

JMorado commented 3 years ago

Hi,

Wonderful tool! I have been using it to run simulations of ligands in aqueous solution and so far so good.

There is one small issue, however, that has been preventing me from obtaining important data from the simulations. Whenever I include the following line to print properties of the system:

sim.reporters.append(mm.app.StateDataReporter(stdout, 1000, step=True, time=True, potentialEnergy=True, kineticEnergy=True, totalEnergy=True, temperature=True, volume=True, density=True))

I get this error:

`
Traceback (most recent call last):
  File "ani.py", line 52, in
    sim.step(10000)
  File "/home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/simtk/openmm/app/simulation.py", line 132, in step
    self._simulate(endStep=self.currentStep+steps)
  File "/home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/simtk/openmm/app/simulation.py", line 234, in _simulate
    self._generate_reports(wrapped, True)
  File "/home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/simtk/openmm/app/simulation.py", line 252, in _generate_reports
    state = self.context.getState(getPositions=getPositions, getVelocities=getVelocities, getForces=getForces,
  File "/home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/simtk/openmm/openmm.py", line 4888, in getState
    state = _openmm.Context_getState(self, types, enforcePeriodicBox, groups_mask)
simtk.openmm.OpenMMException: The autograd engine was called while holding the GIL. If you are using the C++ API, the autograd engine is an expensive operation that does not require the GIL to be held so you should release it with 'pybind11::gil_scoped_release no_gil;'. If you are not using the C++ API, please report a bug to the pytorch team.
Exception raised from execute at /tmp/pip-req-build-2handpz9/torch/csrc/autograd/python_engine.cpp:111 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::cxx11::basic_string<char, std::char_traits, std::allocator >) + 0x6a (0x2b9fba3dcdba in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const, char const, unsigned int, std::cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 0xd8 (0x2b9fba3d9338 in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #2: torch::autograd::python::PythonEngine::execute(std::vector<torch::autograd::Edge, std::allocator > const&, std::vector<at::Tensor, std::allocator > const&, bool, bool, bool, std::vector<torch::autograd::Edge, std::allocator > const&) + 0x122 (0x2ba019cb5452 in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/torch/lib/libtorch_python.so)
frame #3: + 0x2e900f2 (0x2b9f798150f2 in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #4: torch::autograd::backward(std::vector<at::Tensor, std::allocator > const&, std::vector<at::Tensor, std::allocator > const&, c10::optional, bool, std::vector<at::Tensor, std::allocator > const&) + 0x6a (0x2b9f79815afa in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #5: + 0x3497f56 (0x2b9f79e1cf56 in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #6: at::Tensor::_backward(c10::ArrayRef, c10::optional const&, c10::optional, bool) const + 0x1c9 (0x2b9f77ed9db9 in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #7: TorchPlugin::OpenCLCalcTorchForceKernel::execute(OpenMM::ContextImpl&, bool, bool) + 0x8c1 (0x2ba00ad71941 in /home/jm4g18/miniconda3/envs/torchml/lib/plugins/libOpenMMTorchOpenCL.so)
frame #8: OpenMM::ContextImpl::calcForcesAndEnergy(bool, bool, int) + 0xca (0x2b9f5c39ac8a in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/simtk/openmm/../../../../libOpenMM.so.7.5)
frame #9: OpenMM::Context::getState(int, bool, int) const + 0x15a (0x2b9f5c39978a in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/simtk/openmm/../../../../libOpenMM.so.7.5)
frame #10: + 0x169148 (0x2b9f5c168148 in /home/jm4g18/miniconda3/envs/torchml/lib/python3.9/site-packages/simtk/openmm/_openmm.cpython-39-x86_64-linux-gnu.so)

frame #34: __libc_start_main + 0xf5 (0x2b9f553c4555 in /lib64/libc.so.6)
`

Something seems to go wrong when using the getState method of Context, independently of the platform I use. Any idea of what is going on?

Thank you!

Best, João

JMorado commented 3 years ago

The problem is now fixed: the most recent OpenMM version, 7.5.1, releases the GIL when calling getState().

Thanks, João
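
For anyone hitting the same traceback, a script could check up front whether the installed OpenMM is at least the 7.5.1 release mentioned above. The following is only a minimal sketch: `parse_version` and `has_getstate_gil_fix` are hypothetical helper names, not part of OpenMM or openmm-ml, and the only assumption about the fix is the version number quoted in this thread.

```python
import re

# Version that, according to the comment above, releases the GIL in getState().
FIXED_IN = (7, 5, 1)


def parse_version(text):
    """Turn a version string such as '7.5.1' or '7.5' into a 3-tuple of ints.

    Missing fields are padded with zeros, so '7.5' becomes (7, 5, 0).
    """
    numbers = [int(n) for n in re.findall(r"\d+", text)[:3]]
    return tuple(numbers + [0] * (3 - len(numbers)))


def has_getstate_gil_fix(version_text):
    """Return True if the given OpenMM version is 7.5.1 or newer."""
    return parse_version(version_text) >= FIXED_IN


if __name__ == "__main__":
    # Hypothetical usage: pass in whatever version string your install reports.
    for candidate in ("7.5", "7.5.1", "7.6"):
        print(candidate, has_getstate_gil_fix(candidate))
```

In a real script, the version string could come from `Platform.getOpenMMVersion()`; the exact string it returns for a given build is worth checking before relying on this comparison.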