int-brain-lab / ibllib

IBL core shared libraries
MIT License

Improve recording of datasets for release #773

Closed k1o0 closed 4 months ago

k1o0 commented 5 months ago

Currently, when users load data with alfio or the spike sorting loader, the ONE data tracking is bypassed, making it complicated for them to determine which datasets were used in an analysis. We should determine why the spike sorting loader isn't used. If it's a question of speed, this can be improved. Otherwise, perhaps recording can be done within the spike sorting loader or somehow within the alfio loader functions.

oliche commented 4 months ago

It turns out the loader does support the data tracking.

One minor comment: it would be good to have a better way to access the datasets dataframe instead of only the dataset IDs.

So far, dset_loaded = one._cache['datasets'].loc[one._cache['datasets'].index.isin(one._cache['_loaded_datasets'], 'id'), :] does the trick, but should this be a ONE method?
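
For illustration, here is a minimal sketch of what such a helper could look like. It only wraps the one-liner above and runs against a toy stand-in for the cache, not a real ONE instance. The function name loaded_datasets_frame, the toy IDs and the rel_path values are hypothetical; the cache layout (a datasets table whose index has an 'id' level, plus a '_loaded_datasets' collection of IDs) is assumed from the snippet above rather than taken from the ONE documentation.

```python
import pandas as pd


def loaded_datasets_frame(cache):
    """Return the rows of cache['datasets'] whose 'id' index level
    appears in cache['_loaded_datasets'].

    Hypothetical helper: `cache` is any mapping laid out like the
    one._cache object used in the snippet above (assumed layout).
    """
    datasets = cache['datasets']
    loaded_ids = cache['_loaded_datasets']
    # Boolean mask over the rows, matching on the 'id' level of the index
    mask = datasets.index.isin(loaded_ids, level='id')
    return datasets.loc[mask, :]


# Toy stand-in for the cache; all values below are made up
index = pd.MultiIndex.from_tuples(
    [('eid-1', 'dset-a'), ('eid-1', 'dset-b'), ('eid-2', 'dset-c')],
    names=('eid', 'id'),
)
toy_cache = {
    'datasets': pd.DataFrame(
        {'rel_path': ['alf/spikes.times.npy',
                      'alf/spikes.clusters.npy',
                      'alf/trials.table.pqt']},
        index=index,
    ),
    '_loaded_datasets': ['dset-a', 'dset-c'],
}

dset_loaded = loaded_datasets_frame(toy_cache)
print(dset_loaded)
```

With a real ONE instance the call would presumably be loaded_datasets_frame(one._cache), but that still reaches into a private attribute, which is the reason a public method might be preferable.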