LorenFrankLab / spyglass

Neuroscience data analysis framework for reproducible research built by Loren Frank Lab at UCSF
https://lorenfranklab.github.io/spyglass/
MIT License
92 stars 42 forks source link

Incorrect syntax in tutorial notebook 0 #72

Closed cristofer-holobetz closed 3 years ago

cristofer-holobetz commented 3 years ago

Cell 13 in 0_intro.ipynb says the following:

The following query returns all interval_list_name that is not 01_s1 or 04_r2

((IntervalList & {'nwb_file_name':nwb_file_name2}) - ({'interval_list_name':'01_s1'} and \ {'interval_list_name':'04_r2'})).fetch('interval_list_name')

Desired output:

array(['02_r1', '03_s2', 'pos 0 valid times', 'pos 1 valid times', 'pos 2 valid times', 'pos 3 valid times', 'raw data valid times'], dtype=object)

Actual output:

array(['01_s1', '02_r1', '03_s2', 'pos 0 valid times', 'pos 1 valid times', 'pos 2 valid times', 'pos 3 valid times', 'raw data valid times'], dtype=object)

n.b. that the actual output __still includes '01_s1'__

When we use python logical operators and and or we are subject to python's 'truthy' evaluation behavior. All numbers are truthy except for 0 (and 0.0 etc.). Similarly, all objects, including dictionaries, are truthy except for None. The statement dict1 and dict2 first checks to see if dict1 is truthy. In our case, it is, so the statement moves on to check the truthiness of dict2. It also finds that dict2 is truthy (because it is not of NoneType). The key to the unintended output is that the return value of and is the actual value of the last evaluated clause. So the output of dict1 and dict2 is dict2.

The end result of all this is that the given code:

((IntervalList & {'nwb_file_name':nwb_file_name2}) - ({'interval_list_name':'01_s1'} and \ {'interval_list_name':'04_r2'})).fetch('interval_list_name')

is equivalent to:

((IntervalList & {'nwb_file_name':nwb_file_name2}) - ({'interval_list_name':'04_r2'})).fetch('interval_list_name')

because the and statement collapses ({'interval_list_name':'01_s1'} and {'interval_list_name':'04_r2'}) into {'interval_list_name':'04_r2'}

I'm pretty sure this is the cause of the unintended behavior, because you can see the effects of short-circuiting if you replace the and with an or.

TL;DR I don't think python truth operators should be used in the way presented when using dictionaries as the conditions in a Datajoint query.

khl02007 commented 3 years ago

Thanks, should be fixed now