mosdef-hub / mbuild

A hierarchical, component based molecule builder
https://mbuild.mosdef.org
Other
171 stars 79 forks source link

Remove deprecated formats, update reader and writer backends #1191

Open chrisjonesBSU opened 1 month ago

chrisjonesBSU commented 1 month ago

PR Summary:

This PR gets things started to remove the deprecated items talked about in #1187

Also, this will update the reader/writer backends to use GMSO where ever possible.

PR Checklist


chrisjonesBSU commented 1 month ago

The failing tests are in test_xyz.py. The behavior we expect in the test is not what we get when using the gmso backend workflow. If we load a single molecule with infer_hierarchy = True do we expect the particles of the created compound to be children (the way the test is currently written), or do we expect a flat molecule (what we get from gmso)?

Also, if I change infer_heirarchy=False when using conversion.py the behavior is different from using GMSO directly and passing infer_hierarchy=False to from_mbuild.

Here is a gist showing what I'm trying to say here.

CalCraven commented 4 weeks ago

To organize my thoughts, the four use cases are:

  1. Use mBuild to load xyz, infer_hierarchy=True
  2. Use mBuild to load xyz, infer_hierarchy=False
  3. Use GMSO to load xyz, infer_hierarchy=True
  4. Use GMSO to load xyz, infer_hierarchy=False

As @chrisjonesBSU has in his gist, 1-3 gives a layered compound with 1 child, which then itself has 8 children particles. Method 4 gives a flat compound with 8 children.

We should agree on preferred behavior. I actually think the current output from Method 4 is what I would expect for all compounds made from xyz files. Do you agree or disagree @chrisjonesBSU?

My reasoning is since there is no hierarchy present in an xyz file, I would expect a consistent flat compound. Infer_hierarchy therefore should not make a difference either way, as what happens in the current mBuild methods.

In order to obtain this behavior, I think we could either tweak the to_mbuild function so that any topologies without molecule or residue information give a single compound with multiple particles. To me, that seems like the preferred way to convert to a compound without extra hierarchy info. However, we need to test that, as it could screw any behavior from other readers if we modify the to_mbuild method, which every other conversion pathway also takes.

An optional workaround is to just set the molecule/residue information when reading in xyz files such that the to_mbuild can be kept the same, but gives us the preferred outcome. Since that info is not really part of the xyz file, I think manually setting it in the xyz reader in a generalized manner is a way we get consistent results. It would need to be set such that we get the flat compound when infer_hiearchy=True, so gotta dig into that bit of the to_mbuild function still.

chrisjonesBSU commented 1 day ago

Still need to remove lampsdata.py and lamps unit tests.