Open hmacdope opened 1 year ago
I would have thought that adding a duplicate bond should raise an exception.
(I don’t understand the rationale for the current behavior so I would consider changing to raising an error as a fix. Perhaps someone can explain why the current behavior is better.)
I know that at least Maestro (maybe also pymol) will let you add two connect entries if you have a double bond in a PDB file, raising an exception would probably kill off reading those types of files? edit: we could also pass that responsibility down to the reader itself tbh
That's a reader issue — it should create one bond with the correct bond order.
I agree with what we have discussed above that the reader should be responsible for inputting the correct bond orders.
To change this behaviour we would need to audit the readers for duplicate behaviour and then enforce in each reader which would be a rather large breaking change?
I am thinking that the Universe
representation of some files (ESP pdb files which seemed to fail often my tests) will change. With this in mind this likely needs to be 3.0?
Honestly I'd probably be ok with this being pre-3.0, it feels more like a bug / undefined behaviour thing than an API break (unless there's some specific reason for its implementation - was there anything specific from git blame)?
Nothing specific from git-blame AFAIK. It will take a bit of time to audit all the readers that add bonds so watch this space.
In the course of moving forward on #3777 the question of what to do with duplicate bond entries has come up.
The key problem is that this is a valid input for adding bonds:
Where you can have a bond of the same index but different auxiliary values (types, orders, guesses). Only the first bond entry is stored as far as I can tell in the process of de-duplication when the
TopologyGroup
is constructed.@richardjgowers and I discussed raising an error on duplicate entry in one of our meetings but this would be an API break from the current behaviour which is to remove duplicate entries preserving the first entry. This behaviour has a test
core/test_universe.py::TestAddTopologyObjects::test_add_reversed_duplicates
and about 50 other tests fail if you enforce duplication==error.Currently the approach taken in #3777 matches this behaviour but as far as I can see there is not clear documentation of this behaviour (which entry will be removed).
The docs currently have the rather cryptic "Only one of the two permutations is stored, typically the one with the lower atom number first." in the developer notes and the following in the
TopologyGroup
docstring "Duplicate entries are automatically removed upon creation and combination of different Dicts. This means a bond between atoms 1 and 2 will only ever appear once in a dict despite both atoms 1 and 2 having the bond in their bond attribute." At least that was all that I could see about duplicate bonds.My main points/questions are.
My instinct is to move forward with the current system to avoid and API break and potentially change behaviour if desired in 3.0?
Code to reproduce the behavior
Minimal example