Open lilyminium opened 5 years ago
sigh
Thanks for documenting this. It is really helpful that you're taking a global view.
>>> gsd = mda.Universe(GSD)
>>> gsd.atoms.resnames
array([0, 1, 2, ..., 647, 647, 647], dtype=object)
>>> gsd.select_atoms("resname 0")
<AtomGroup with 0 atoms>
>>> gsd.select_atoms("resname 2")
<AtomGroup with 0 atoms>
^ A consequence of integer resnames.
Ew gross. So one option is to make some “reserved” topogyattrs statically typed.
On Wed, Oct 23, 2019 at 16:07, Lily Wang notifications@github.com wrote:
gsd = mda.Universe(GSD) gsd.atoms.resnames array([0, 1, 2, ..., 647, 647, 647], dtype=object) gsd.select_atoms("resname 0") <AtomGroup with 0 atoms> gsd.select_atoms("resname 2") <AtomGroup with 0 atoms>
^ A consequence of integer resnames.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/MDAnalysis/mdanalysis/issues/2362?email_source=notifications&email_token=ACGSGB4W4MMEQQEIFIH4ZULQQBSCBA5CNFSM4I7MLWZKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOECBYFAQ#issuecomment-545489538, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACGSGB2PUVNK6Y2OZIYAWSLQQBSCBANCNFSM4I7MLWZA .
This would be cool to address in 3.0 as it "breaks" things, but it needs fixing (ping #3800)
Consolidating attributes looks like a project to be put on the 3.0 roadmap as it is related to
With PR #3753 almost done, we will have a new guesser infrastructure. We should then clean-up our attributes.
And by "putting on the roadmap" I mean "We need to figure out how we will make it actually happen." — something for the next dev meeting.
TopologyAttrs are a bit of a mess right now, and the documentation is not up-to-date. This is a list of some inconsistencies I found.
atomiccharge
The GAMESS parser uses the TopologyAttr
atomiccharge
for the atomic number.This is technically correct, but I think most people would expect the
atomiccharge
to refer to a partial atomic charge. A more informational name might beatomicnumber
.atomnum
The Desmond DMS parser uses the TopologyAttr
atomnum
. The Desmond User Guide (section 17.1.1) asserts that 'anum' refers to the atomic number of a particle, but this seems implausible:element vs type
A non-exhaustive search indicates that only the prmtop TOPParser implements the
element
TopologyAttr.I looked at formats like GAMESS and XYZ that as far as I know explicitly state element, as well as formats like PDB that either read an element column or guess it from the name.
type
alternately refers to elements:or to force field / Autodock atom types:
My opinion is also that the 0 mass for the Hs and O should have raised a warning.
resid vs resnum
I can't see a difference between
resnum
andresid
.GROMACS indexes from 1 in user-readable files, as well.
chainIDs
The Desmond parser appears to be the only one to implement the
chainID
TopologyAttr.Comments in the code suggest that all other parsers turn chains into segments.
tempfactors vs bfactors
These parsers have tempfactors but not bfactors:
MMTFParser has bfactors but not tempfactors.
TPRParser has neither bfactors nor tempfactors, despite the documentation thinking it does.
This impacts on selection language. Selecting
same bfactor as *selection*
only works for MMTFs because selection language doesn't recognise the tempfactor attribute.ids
I don't understand the meaning of atom IDs for file formats like XYZ. TPR atom IDs also start from 0, despite GROMACS indexing from 1.
masses
Failing to guess masses gives masses of 0. This is done atom-by-atom, so users may only check the end of an array like the one below, not realising that they do not have appropriate masses elsewhere. This seems like a bad idea.
resnames
The GSDParser has numbers as
resnames
. Should this be the case?