openmm / openmm-plumed

OpenMM plugin to interface with PLUMED

PLUMED files are not flushed correctly #92

Open stefdoerr opened 4 days ago

stefdoerr commented 4 days ago

This is not a critical bug for me, since I don't rely on the PLUMED output files much, but I can see users who do rely on them running into problems.

PLUMED practically never flushes its file buffers until the whole process finishes (meaning your Python session dies).

It is very consistent with XTC output files. It also happens with text files (like the ones we store the biases in), although there you can sometimes get it to work with the FLUSH STRIDE argument; even then it is hit or miss whether the file will contain all the lines.

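import mdtraj
from openmmplumed import PlumedForce

# `system` and `simulation` are an already set up OpenMM System and Simulation.
plumed = PlumedForce(
    """\
DUMPATOMS ATOMS=@mdatoms STRIDE=100 TYPE=xtc FILE=plumed.xtc
FLUSH STRIDE=100"""
)
system.addForce(plumed)

# Rebuild the context so that it picks up the newly added force.
simulation.context.reinitialize(preserveState=True)

# 10 x 100 steps with STRIDE=100 should produce 10 frames.
for i in range(10):
    simulation.step(100)

traj = mdtraj.load_xtc("plumed.xtc", top="structure.prmtop")
assert traj.n_frames == 10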

If you execute this you will get the following error:

(xdrfile error) Undocumented error 3
Traceback (most recent call last):
  File "/home/sdoerr/Work/pyacemd/debug_plumed/./test.py", line 47, in <module>
    traj = mdtraj.load_xtc("plumed.xtc", top="structure.prmtop")
  File "mdtraj/formats/xtc/xtc.pyx", line 167, in mdtraj.formats.xtc.load_xtc
  File "mdtraj/formats/xtc/xtc.pyx", line 174, in mdtraj.formats.xtc.load_xtc
  File "mdtraj/formats/xtc/xtc.pyx", line 338, in mdtraj.formats.xtc.XTCTrajectoryFile.read_as_traj
  File "mdtraj/formats/xtc/xtc.pyx", line 407, in mdtraj.formats.xtc.XTCTrajectoryFile.read
  File "mdtraj/formats/xtc/xtc.pyx", line 484, in mdtraj.formats.xtc.XTCTrajectoryFile._read
RuntimeError: XTC read error: Compressed 3d coordinate

After the error you can read the file fine if you open a new Python interpreter (after the first one dies), although I sometimes get 9 frames instead of 10, so I guess even ending the process doesn't flush correctly.
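Since the file becomes readable once the interpreter exits, one possible workaround is to run the simulation in a child process and read the output only after that process has ended. The sketch below is a stand-in, not PLUMED code: the child script simply writes ten lines to a hypothetical file `frames.txt` without ever flushing or closing it, and the names `CHILD_SCRIPT` and `run_in_child` are made up for illustration. Given the report above that even process exit sometimes leaves only 9 frames, this may not be fully reliable for PLUMED itself.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in for the simulation script (hypothetical): writes 10 "frames"
# and never flushes or closes the file explicitly.
CHILD_SCRIPT = """
import sys
f = open(sys.argv[1], "w")
for i in range(10):
    f.write(f"frame {i}\\n")
# no f.flush() / f.close(): the buffer is written out at interpreter exit
"""


def run_in_child(out_path):
    """Run the writer in a separate interpreter and wait for it to exit."""
    subprocess.run([sys.executable, "-c", CHILD_SCRIPT, out_path], check=True)


workdir = tempfile.mkdtemp()
out_path = os.path.join(workdir, "frames.txt")
run_in_child(out_path)

# Read the file only after the child process has terminated.
with open(out_path) as fh:
    lines = fh.read().splitlines()
print(len(lines))
```

The parent process never holds the output file open, so whatever the child's exit writes to disk is all the parent ever sees.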

Here is a reproducible example: plumed_file_flushing_bug.zip

peastman commented 4 days ago

This sounds like a bug in PLUMED, not the OpenMM plugin? We have no control over what it does with the files it opens.

stefdoerr commented 3 days ago

OK! I thought you might have an idea for forcing it to close them, for example a command that restarts the PLUMED executable. If not, that's fine; it doesn't really affect my use cases.

stefdoerr commented 2 days ago

Toni mentioned the plumed_finalize command to me which is called in the destructor of PlumedForceImpl https://github.com/openmm/openmm-plumed/blob/95bfd46d6499625de03ea2151aec42edeae5f662/openmmapi/src/PlumedForceImpl.cpp#L47

It makes sense that it needs to be called for everything to flush correctly.

I tried deleting the force object from Python, but it didn't work. I think the reasons are:

  1. There is no destructor in PlumedForce to call the corresponding destructor of PlumedForceImpl https://github.com/openmm/openmm-plumed/blob/95bfd46d6499625de03ea2151aec42edeae5f662/openmmapi/src/PlumedForce.cpp
  2. The python wrapper might also be missing the destructor call when del force is called https://github.com/openmm/openmm-plumed/blob/95bfd46d6499625de03ea2151aec42edeae5f662/python/plumedplugin.i

stefdoerr commented 2 days ago

Indeed this is the case. I made the following changes: [5 images]

Then in Python I called this to delete the PlumedForce after my simulation:

    forces = simulation.system.getForces()
    for i, force in enumerate(forces):
        if force.getName() == "PlumedForce":
            simulation.system.removeForce(i)
            del force
            break

Now it flushed fine. But once the Python garbage collector ran, I got a segmentation fault. I assume that OpenMM stores the PlumedForceImpl object somewhere and frees the force objects again at the end; since I had already deleted that object, I got a segfault.

peastman commented 2 days ago

A ForceImpl is owned by a Context, not by a Force. You can create many Contexts for a single System. When a Context is deleted, it deletes all its ForceImpls.

https://github.com/openmm/openmm/blob/78c1536838c63125b77e48b170f6a352ae07506c/openmmapi/src/ContextImpl.cpp#L193-L195
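The ownership rule described here can be sketched with a toy model. This is plain Python, not the OpenMM API: the classes `Force`, `ForceImpl`, and `Context` and the `close()` and `finalize()` methods are hypothetical stand-ins, with `finalize()` playing the role of `plumed_finalize` and an explicit `close()` standing in for Context deletion so the behavior is deterministic.

```python
class ForceImpl:
    """Toy stand-in for PlumedForceImpl: holds buffered output."""

    def __init__(self, sink):
        self.sink = sink          # list that plays the role of the file on disk
        self.buffer = []          # output that has not been flushed yet
        self.finalized = False

    def write(self, line):
        self.buffer.append(line)

    def finalize(self):
        # Plays the role of plumed_finalize in the real destructor.
        if not self.finalized:
            self.sink.extend(self.buffer)
            self.buffer.clear()
            self.finalized = True


class Force:
    """Toy stand-in for PlumedForce: a description only, owns no impl."""

    def __init__(self, sink):
        self.sink = sink

    def create_impl(self):
        return ForceImpl(self.sink)


class Context:
    """Toy stand-in for Context: creates and owns one impl per force."""

    def __init__(self, forces):
        self.impls = [f.create_impl() for f in forces]

    def step(self, n):
        for impl in self.impls:
            impl.write(f"step {n}")

    def close(self):
        # Mirrors "when a Context is deleted, it deletes all its ForceImpls".
        for impl in self.impls:
            impl.finalize()
        self.impls = []


disk = []
force = Force(disk)
ctx_a = Context([force])      # several contexts can share one force
ctx_b = Context([force])
ctx_a.step(1)
ctx_b.step(2)

del force                     # dropping the Force does not touch any impl
flushed_after_del_force = list(disk)

ctx_a.close()                 # closing a context flushes only its own impl
flushed_after_close_a = list(disk)
ctx_b.close()
flushed_after_close_b = list(disk)
```

In this model, deleting the force leaves `disk` empty, and output only appears as each context is closed. That matches the point that flushing depends on the Context going away, not the Force.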