Open lilyminium opened 3 years ago
I might be misunderstanding, but can't we already do something similar through select_atoms?
select_atoms
doesn't return ordered atoms, and I think that the current SMARTS matching will return all 6 atoms in the '[#6]1:[#6]:[#6:4]:[#6:3]:[#7:2]:[#6:1]:1'
ordered by index, rather than only the 4 tagged atoms in the order of tagging.
I see, do we have an ordering convention for select_atoms? In some ways I'd prefer we first make sure we fulfill one way of selection matching than start branching to other ones.
I'm not sure I'd have to use this feature enough to have a strong opinion for it. If the potential user base is large I'm happy for it to be implemented as long as it doesn't change expected defaults.
I'm not sure we ever give a promise on the ordering from select_atoms
, but our use of np.unique
sorts the indices as a side-effect.
Is your feature request related to a problem?
OpenForceField has a really nice function called
chemical_environment_matches
that returns only the tagged atoms in a SMARTS substructure search. e.g. If you passed'[#6]1:[#6]:[#6:4]:[#6:3]:[#7:2]:[#6:1]:1'
, it would search the entire substructure but only return the tagged atoms, and ordered by the tag number.Describe the solution you'd like
Instead of going
openff.Molecule.from_rdkit(atoms.convert_to("RDKIT")).chemical_environment_matches("my smarts query")
, it would be super handy to implement it directly in MDAnalysis. The way OpenFF does it is just by looking at theGetAtomMapNum()
s of the smarts molecule; it looks fairly straightforward. Their code is below (MIT license):Describe alternatives you've considered
Don't do that.
Additional context