int-brain-lab / ONE

Open Neurophysiology Environment
MIT License

Remove support for dataset table without session index #88

Closed k1o0 closed 1 month ago

k1o0 commented 1 year ago

Currently One._check_filesystem accepts a datasets input that is a slice of the datasets table without the eid index level. This matters because, by default, indexing on a single level returns a slice without that level (cf. one._cache.datasets.xs(eid, level='eid', drop_level=False), which keeps it):

one._cache.datasets.loc[eid]
Out[20]: 
                                      file_size  ...                                           rel_path
id                                               ...                                                   
0033d1df-2f4c-4da9-a8f1-69a67bc803e9       2600  ...  raw_behavior_data/_iblmic_audioOnsetGoCue.time...
098b0067-8373-4d2a-8b20-777b7efc6607      28960  ...                  alf/_ibl_wheelMoves.intervals.npy
0a00137e-2de2-4055-877e-b37d99a9c3df       6120  ...                         alf/_ibl_trials.repNum.npy
[3 rows x 8 columns]
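To illustrate the difference, here is a minimal sketch on a toy table. The frame below is a made-up stand-in for one._cache.datasets (the eids, dataset ids and values are invented), and it assumes only that the real table has a two-level (eid, id) index as described above.

```python
import pandas as pd

# Toy stand-in for one._cache.datasets; all ids and values are invented
index = pd.MultiIndex.from_tuples(
    [("eid-a", "d1"), ("eid-a", "d2"), ("eid-b", "d3")],
    names=["eid", "id"],
)
datasets = pd.DataFrame(
    {
        "file_size": [2600, 28960, 6120],
        "rel_path": ["alf/x.npy", "alf/y.npy", "alf/z.npy"],
    },
    index=index,
)

# Plain .loc on the first level drops that level from the result
dropped = datasets.loc["eid-a"]

# xs with drop_level=False keeps the eid level
kept = datasets.xs("eid-a", level="eid", drop_level=False)

# A caller holding only `dropped` can re-attach the eid level if it knows the eid
restored = pd.concat({"eid-a": dropped}, names=["eid"])

print(list(dropped.index.names))
print(list(kept.index.names))
print(list(restored.index.names))
```

If the eid level were required, callers that currently pass the output of .loc[eid] would need to switch to something like the xs call or re-attach the level as in the last step.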

If the eid level were guaranteed, we could drop the session_path column (which is cumbersome to generate on Alyx and accounts for ~13% of the table size) and do joins instead (see https://github.com/int-brain-lab/ONE/issues/134). Joins might be slightly less performant, especially if the eid index is missing and has to be fetched again from the original table.
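
As a rough sketch of the join idea, the session path could be derived on demand from a sessions table keyed by eid instead of being stored per dataset. The sessions columns (subject, date, number) and the subject/date/number path layout used here are assumptions for illustration, not necessarily the actual cache schema.

```python
import pandas as pd

# Hypothetical sessions table indexed by eid; column names are assumed
sessions = pd.DataFrame(
    {
        "subject": ["mouse1", "mouse2"],
        "date": ["2021-01-01", "2021-01-02"],
        "number": [1, 2],
    },
    index=pd.Index(["eid-a", "eid-b"], name="eid"),
)

# Toy datasets table with an (eid, id) index and no session_path column
datasets = pd.DataFrame(
    {"rel_path": ["alf/x.npy", "alf/y.npy", "alf/z.npy"]},
    index=pd.MultiIndex.from_tuples(
        [("eid-a", "d1"), ("eid-a", "d2"), ("eid-b", "d3")],
        names=["eid", "id"],
    ),
)

# Build one path per session (assumed layout: subject/date/zero-padded number)
session_path = (
    sessions["subject"]
    + "/"
    + sessions["date"]
    + "/"
    + sessions["number"].map("{:03d}".format)
).rename("session_path")

# Join on the eid index level; this only works when that level is present
joined = datasets.join(session_path, on="eid")
print(joined)
```

The join relies on the eid level being part of the datasets index, which is why a slice that has lost that level would first need it restored or looked up again.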