mosdef-hub / mbuild

A hierarchical, component based molecule builder
https://mbuild.mosdef.org
Other
173 stars 81 forks source link

Loading the output (gsd file) of a HOOMD simulation in mbuild #759

Closed oakif closed 4 years ago

oakif commented 4 years ago

I'm creating a polymer in mbuild, running a simulation on it to obtain a relaxed conformation, and want to input it into mbuild again to create a macrostructure. The problem is that I can't find a good way to import a gsd file into mbuild.

mbuild seems to load only the following file formats:

arc dcd binpos xtc trr hdf5 h5 ncdf netcdf nc pdb.gz pdb lh5 crd mdcrd inpcrd restrt rst7 ncrst lammpstrj dtr stk gro xyz.gz xyz tng xml mol2 hoomdxml

Some of them don't have topology information, so a file containing this information is required which may have the following formats according to mbuild:

pdb pdb.gz h5 lh5 prmtop parm7 psf mol2 hoomdxml gro arc hdf5

HOOMD outputs as:

  1. gsd, which is not in this list
  2. dcd, but it doesn't contain topological information
  3. hoomdxml but this is deprecated

If I go with 1 (export from HOOMD as ssd), I could load and re-export it using OVITO, which supports exports to the following formats:

ca ca.gz gsd lammpstrj nc pov vtk xyz

However, of the formats supported for loading (lammpstrj and xyz), it seems like I need a topology file as well.

So that bring us to 2, which requires a topology file. I could possibly export the structure of the unrelaxed polymer from mbuild, but it seems to support exports only to the following:

hoomdxml gsd xyz lammps lmp par

Only hoomxml is the filetype accepted as a topology file by mbuild and contains topological information (I presume). However, when I try to export a hoomdxml using mbuild, I get the following error:

mb.formats.hoomdxml.write_hoomdxml(filename='systems/polymer.hoomdxml', structure=system)
/home/oakif/.conda/envs/hoomd-tf/lib/python3.6/site-packages/mbuild/utils/decorators.py:21: Warning: write_hoomdxml has breaking change. See PR#463 on github
  warn(printed_message, Warning)
Traceback (most recent call last):
  File "polymer_builder.py", line 263, in <module>
    mb.formats.hoomdxml.write_hoomdxml(filename='systems/polymer.hoomdxml', structure=system) # for topology
  File "/home/oakif/.conda/envs/hoomd-tf/lib/python3.6/site-packages/mbuild/utils/decorators.py", line 22, in wrapper
    fcn(*args, **kwargs)
  File "/home/oakif/.conda/envs/hoomd-tf/lib/python3.6/site-packages/mbuild/formats/hoomdxml.py", line 124, in write_hoomdxml
    _write_rigid_information(xml_file, rigid_bodies)
  File "/home/oakif/.conda/envs/hoomd-tf/lib/python3.6/site-packages/mbuild/formats/hoomdxml.py", line 322, in _write_rigid_information
    if not all(body is None for body in rigid_bodies):
TypeError: 'NoneType' object is not iterable
srun: error: bhd0040: task 0: Exited with exit code 1
srun: Terminating job step 14822791.0

I'm not sure what this is about, but I feel like there should be an easier way to achieve what I'm after rather than trying to work around this error.

As for method 3 (exporting as hoomdxml from HOOMD directly, to preserve positions and topology), unfortunately dumping as hoomdxml seems to be deprecated. I get the following error when trying anyway, so it seems like the module was removed:

Traceback (most recent call last):
  File "run_simulation.py", line 166, in <module>
    hoomd.deprecated.dump.xml(
AttributeError: module 'hoomd' has no attribute 'deprecated'

Some other things I tried/looked at:

oakif commented 4 years ago

Upon some further investigation, it seems like if I use VMD to export to a .gro file, it'll load in mbuild. However, VMD doesn't preserve topology. All the other exports from VMD that I've tested (dtr binpos crd laampstrj mol2 rst7 trr) require a topology file to be loaded as well.

So it seems like whatever I do, I need to capture the topology in a non-gsd file. I'm not sure how to do this.

My immediate idea would be to export my polymer from mbuild into a trajectory format which contains topological information, and then use that as my topology file after trying to import the HOOMD simulation. The issue is that none of the mbuild outputs (hoomdxml gsd xyz lammps lmp par) except hoomdxml can be used as a trajectory. And I'm getting the issue mentioned above when saving to hoomdxml. I'll try to investigate this further.

ahy3nz commented 4 years ago

Easiest to hardest:

1) Use mdtraj

traj = mdtraj.load('mygsd.gsd')
cmpd = mb.load(traj, frame=-1)

Additional kwargs here

2) Use mdanalysis to load your gsd file, then you'd have to figure out your own conversion from an mdanalysis topology

3) The manual conversion you mention could be a last resort if neither of these two libraries is exactly what you need

oakif commented 4 years ago

@ahy3nz Thanks a lot for the tips, Alex! I tried your first one, hoping that it would work. Unfortunately ~mbuild~ MDTraj seems to see the trajectory as an (unrecognizable) gsd file:

Traceback (most recent call last):
  File "polymer_builder.py", line 226, in <module>
    traj = md.load('systems/polymer_relaxed.gsd')
  File "/home/oakif/.conda/envs/hoomd-tf/lib/python3.6/site-packages/mdtraj/core/trajectory.py", line 409, in load
    'with extensions in %s' % (filename, extension, FormatRegistry.loaders.keys()))
OSError: Sorry, no loader for filename=systems/polymer_relaxed.gsd (extension=.gsd) was found. I can only load files with extensions in dict_keys(['.arc', '.dcd', '.binpos', '.xtc', '.trr', '.hdf5', '.h5', '.ncdf', '.netcdf', '.nc', '.pdb.gz', '.pdb', '.lh5', '.crd', '.mdcrd', '.inpcrd', '.restrt', '.rst7', '.ncrst', '.lammpstrj', '.dtr', '.stk', '.gro', '.xyz.gz', '.xyz', '.tng', '.xml', '.mol2', '.hoomdxml'])

So I guess I'm out of luck with MDTraj. I'll try out the other methods and see how they go! Maybe in the future we can work on a pull request to add in the gsd functionality. I think it shouldn't be too hard; I'll need to look into it a bit more though.

oakif commented 4 years ago

@ahy3nz Okay, I feel quite silly but I found your gsd.py and realized that my MDTraj package was out of date. I updated it and mbuild can now read gsds normally. Thanks for the help in figuring that one out!