MDAnalysis / mdanalysis

MDAnalysis is a Python library to analyze molecular dynamics simulations.
https://mdanalysis.org
Other
1.28k stars 643 forks source link

'and' keyword in selection is not commutative #1875

Open kain88-de opened 6 years ago

kain88-de commented 6 years ago

Expected behaviour

the selections 'not resid 62 and around 3 group sites' gives the same results as 'around 3 group sites and not resid 62'

Actual behaviour

The resid filter is only applied when it's named first

Code to reproduce the behaviour

In [1]: from MDAnalysisTests.datafiles import PDB

In [2]: u = mda.Universe(PDB)

In [3]: protein = u.select_atoms('protein')

In [4]: protein.select_atoms('name CA and around 3 (resid 60) and not resid 61').resids
Out[4]: array([59, 61], dtype=int32)

In [5]: protein.select_atoms('name CA and not resid 61 and around 3 (resid 60)').resids
Out[5]: array([59], dtype=int32)

Currently version of MDAnalysis:

(run python -c "import MDAnalysis as mda; print(mda.__version__)") 0.18.0

kain88-de commented 6 years ago

https://www.mdanalysis.org/docs/documentation_pages/selections.html#ordered-selections

is this connected?

richardjgowers commented 6 years ago

@kain88-de I think that only affects the ordering of selections, not the contents. I just noticed you reproduced using the test files, I'll look into this.

richardjgowers commented 6 years ago

@kain88-de I think it might be that the around token is eating the not resid 61

In [12]: p.select_atoms('around 3 (resid 60) and not resid 61').resids
Out[12]:
array([59, 59, 59, 59, 59, 61, 61, 61, 61, 62, 63, 63, 63, 63, 64, 64],
      dtype=int32)

In [13]: p.select_atoms('(around 3 resid 60) and not resid 61').resids
Out[13]: array([59, 59, 59, 59, 59, 62, 63, 63, 63, 63, 64, 64], dtype=int32)
kain88-de commented 6 years ago

does it make sense that the around eats the not resid 61? I found it quite surprising. I thought because of the braces this would not happen.

richardjgowers commented 6 years ago

So I guess here and #1900 we want the default behaviour to be that keywords which read the next thing to only read one extra thing, unless you explicitly use brackets


# This is desired behaviour
bonded THING and not OTHER == (bonded THING) & (not OTHER)

# This is current behaviour
bonded THING and not OTHER != bonded (THING and not OTHER)
kain88-de commented 6 years ago

yes that sounds correct.