MDAnalysis / mdanalysis

MDAnalysis is a Python library to analyze molecular dynamics simulations.
https://mdanalysis.org
Other
1.26k stars 641 forks source link

repr of TopologyObjects should show order of underlying atom IDs #4181

Closed orbeckst closed 11 months ago

orbeckst commented 1 year ago

Expected behavior

Representations of topology objects should list atom IDs in the order of the underlying AtomGroup.

Actual behavior

I found it quite confusing when I looked at a Dihedral object and the order of atoms was different from my carefully crafted order to define the dihedral:

>>> dih = u.atoms[[10, 0, 1, 3]].dihedral
>>> list(dih.atoms)
[<Atom 11: HG1 of type 3 of resname MET, resid 1 and segid 4AKE>,
 <Atom 1: N of type 56 of resname MET, resid 1 and segid 4AKE>,
 <Atom 2: HT1 of type 2 of resname MET, resid 1 and segid 4AKE>,
 <Atom 4: HT3 of type 2 of resname MET, resid 1 and segid 4AKE>]
>>> repr(dih)
'<Dihedral between: Atom 3, Atom 1, Atom 0, Atom 10>'

I’d much rather have the actual order output.

Relevant code

Why do we do something special to the order of atom indices in an TopologyObject when printing the repr? Namely in https://github.com/MDAnalysis/mdanalysis/blob/4f36e2a06eb2d04274dfe4510910986c2dd17681/package/MDAnalysis/core/topologyobjects.py#L118

indices = (self.indices if self.indices[0] < self.indices[-1] else self.indices[::-1])

why do we reverse the order?

Code to reproduce the behavior

>>> import MDAnalysis as mda
>>> from MDAnalysis.tests.datafiles import PSF, DCD,  GRO, PDB, TPR, XTC, TRR,  PRMncdf, NCDF
>>> u = mda.Universe(PSF, DCD)
>>> dih = u.atoms[[10, 0, 1, 3]].dihedral
>>> list(dih.atoms)
[<Atom 11: HG1 of type 3 of resname MET, resid 1 and segid 4AKE>,
 <Atom 1: N of type 56 of resname MET, resid 1 and segid 4AKE>,
 <Atom 2: HT1 of type 2 of resname MET, resid 1 and segid 4AKE>,
 <Atom 4: HT3 of type 2 of resname MET, resid 1 and segid 4AKE>]
>>> repr(dih)
'<Dihedral between: Atom 3, Atom 1, Atom 0, Atom 10>'

Current version of MDAnalysis

richardjgowers commented 1 year ago

@orbeckst the "forwards" and "backwards" of bonds/angles/torsions are equivalent, so there's a canonicalisation check to make sure that the first atom is less than the last. Internally we often use this check to deduplicate bonds.

WRT this repr, I think I've taken this a little too far and made the repr print out the "canonical" order of the atoms, rather than the true internally stored order of the data. Fixing it seems like a good idea

ztimol commented 12 months ago

Just want to confirm if the canonicalisation check is still required at all? If not we could remove

indices = (self.indices if self.indices[0] < self.indices[-1] else self.indices[::-1])

and simply return the representation in the order of self.indices.

richardjgowers commented 12 months ago

Yeah removing that is harmless and would solve the issue

On Sat, Jul 8, 2023 at 08:53, ztimol @.***> wrote:

Just want to confirm if the canonicalisation check is still required at all? If not we could remove

indices = (self.indices if self.indices[0] < self.indices[-1] else self.indices[::-1])

and simply return the representation in the order of self.indices.

— Reply to this email directly, view it on GitHub https://github.com/MDAnalysis/mdanalysis/issues/4181#issuecomment-1626900633, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACGSGB2PB7IGQK7QIQSMMO3XPEGXDANCNFSM6AAAAAAZWIHAY4 . You are receiving this because you commented.Message ID: @.***>