openforcefield / openff-toolkit

The Open Forcefield Toolkit provides implementations of the SMIRNOFF format, parameterization engine, and other tools. Documentation available at http://open-forcefield-toolkit.readthedocs.io
http://openforcefield.org
MIT License
318 stars 93 forks source link

Atom names lost when adding `_SimpleMolecule`s to a topology #1927

Closed mattwthompson closed 3 months ago

mattwthompson commented 3 months ago

Describe the bug

Atom names (_SimpleAtom.name), and likely other values are, are lost when adding a _SimpleMolecule to a topology because this is lost in the round-trip between its dict representation.

To Reproduce

Take for granted that I have a _SimpleMolecule in memory with non-default atom names.

ipdb> topology.n_atoms
0
ipdb> molecule.atom(0).name
'CH3'
ipdb> topology.add_molecule(molecule)
1
ipdb> topology.atom(0).name
''
ipdb> import copy
ipdb> copy.deepcopy(molecule).atom(0).name
''
ipdb> molecule.to_dict()["atoms"][-1]
{'metadata': {'residue_name': 'NME', 'residue_number': '526', 'insertion_code': ' ', 'chain_id': 'A'}, 'atomic_number': 1}
ipdb> 'name' in molecule.to_dict()["atoms"][-1]
False

Output

Computing environment (please complete the following information):

Additional context

Under the hood, Topology.add_molecule calls _SimpleMolecule.__deepcopy__, which goes through the dict representation. There's probably more that's lost.

mattwthompson commented 3 months ago

I'll take a stab at this tomorrow.

j-wags commented 3 months ago

This seems like a clear bug and the fix is likely unambiguous. I'll be offline tomorrow, Monday, and Tuesday, so feel free to self-merge and cut a Toolkit release without waiting for my approval!