mosdef-hub / gmso

Flexible storage of chemical topology for molecular simulation
https://gmso.mosdef.org
MIT License
52 stars 33 forks source link

Issue with Molecule numbering #785

Closed CalCraven closed 8 months ago

CalCraven commented 10 months ago

I think we need to firm up the strictness of numbering molecules and residues in GMSO. I think we should specify an index of 1 or 0 as the starting index for molecule.number, residue.number. This will hopefully improve the usefulness of it in the writers.

# comparing residue numbering in mBuild
import foyer; import gmso; import mbuild as mb
cpd = mb.load("CCOC", smiles=True)
ff = foyer.Forcefield(name="oplsaa")
pmd2 = ff.apply(cpd)
pmd1 = cpd.to_parmed(infer_residues=True)
for pmd in [pmd1, pmd2]:
    print(pmd.residues[0].number)
    top = gmso.external.from_parmed(pmd)
    print(top.sites[0].molecule)
    print(top.sites[0].residue)
top = gmso.external.from_mbuild(cpd)
print(top.sites[0].molecule)
print(top.sites[0].residue)

>>>>>>OUTPUT
-1
Molecule(name='Compound', number=-1)
Residue(name='Compound', number=-1)
0
Molecule(name='RES', number=0)
Residue(name='RES', number=0)
Molecule(name='Compound', number=1)
Residue(name='Compound', number=1)

I think some of the other ways you can read in molecules could also cause differences.

CalCraven commented 10 months ago

As per conversation with @daico007, we think it's best to not enforce molecule or residue numbering at the read in stage of GMSO, but enforce it more strictly at the writer stage. Because of this, we will make a molecule_reindex utility that will be useful for writers to look through these tags and modify the topology to for a given writer.

The only thing we want to enforce is the parmed default case where all residues are given a value of -1, we want move that to be index 0 in GMSO to alighn with what it would look like if you forcefield using foyer and get the residues.

daico007 commented 8 months ago

Fixed in #787