AllenInstitute / ecephys_etl_pipelines

Pipelines and modules for processing extracellular electrophysiology data

Create Visual Behavior Ecephys project class (cloud-based) #30

Closed wbwakeman closed 2 years ago

wbwakeman commented 2 years ago

This is a project-level class that is used by public AllenSDK users to get metadata about all sessions. This allows them to filter and retrieve specific files of interest to them.

We need to create a Visual Behavior Ecephys project class. It will be analogous to the VBO cache class implemented in:

https://github.com/AllenInstitute/AllenSDK/blob/master/allensdk/brain_observatory/behavior/behavior_project_cache/behavior_project_cache.py (modern - Sprint 2021)

Implement a project class that provides methods to:

We are only supporting a from_s3 API. There will be no from_lims API for this data release.

Tasks

Instructions to create bucket https://github.com/AllenInstitute/informatics_data_release_tools/tree/main/deploy

Instructions to upload data https://github.com/AllenInstitute/informatics_data_release_tools

Validation criteria:

danielsf commented 2 years ago

There does appear to be an existing implementation of this class using just the from_lims and warehouse APIs

https://github.com/AllenInstitute/AllenSDK/blob/master/allensdk/brain_observatory/ecephys/ecephys_project_cache.py

danielsf commented 2 years ago

#32 should probably be completed before any code is written for this ticket.

danielsf commented 2 years ago

Metadata tables we need to be able to get are:

units.csv
channels.csv
probes.csv
sessions.csv
behavior_sessions.csv

These should all be such that they can be read in with pandas.read_csv(), yielding a DataFrame.
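As a rough sketch of that requirement, assuming the five CSV files have already been downloaded into one local directory: the helper name `load_metadata_tables` and the directory layout are hypothetical and not part of AllenSDK. Only `pandas.read_csv()` and the table names come from the list above.

```python
from pathlib import Path

import pandas as pd

# Table names taken from the list above; each is expected as <name>.csv.
METADATA_TABLES = ("units", "channels", "probes", "sessions", "behavior_sessions")


def load_metadata_tables(metadata_dir):
    """Read every metadata CSV in metadata_dir into a DataFrame.

    Hypothetical helper: assumes the files sit side by side in one
    local directory. Returns a dict mapping table name -> DataFrame.
    """
    metadata_dir = Path(metadata_dir)
    return {
        name: pd.read_csv(metadata_dir / f"{name}.csv")
        for name in METADATA_TABLES
    }
```

Returning a dict keyed by table name keeps the sketch independent of whatever accessor methods the project class ends up exposing.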

For now: NWB files will only be looked up from sessions.csv (because we aren't releasing behavior sessions yet). The column for file ID will be called file_id.
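A minimal sketch of that lookup, assuming sessions.csv has one row per session. Only the `file_id` column name comes from this ticket. The session identifier column (`ecephys_session_id` here), the function name, and the toy rows are assumptions for illustration.

```python
import io

import pandas as pd


def get_nwb_file_id(sessions, session_id, id_column="ecephys_session_id"):
    """Return the file_id for one session from the sessions table.

    id_column is an assumed name for the session identifier column;
    "file_id" is the column name given in this ticket.
    """
    matches = sessions.loc[sessions[id_column] == session_id, "file_id"]
    if len(matches) != 1:
        raise KeyError(
            f"expected exactly one row for session {session_id}, "
            f"found {len(matches)}"
        )
    return matches.iloc[0]


# Toy stand-in for sessions.csv; the values are made up.
toy_sessions_csv = io.StringIO(
    "ecephys_session_id,file_id\n"
    "1001,501\n"
    "1002,502\n"
)
toy_sessions = pd.read_csv(toy_sessions_csv)
toy_file_id = get_nwb_file_id(toy_sessions, 1002)
```

Raising on zero or multiple matches is a design choice for the sketch; it surfaces a malformed sessions.csv early rather than silently picking a row.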