AllenInstitute / AllenSDK

code for reading and processing Allen Institute for Brain Science data
https://allensdk.readthedocs.io/en/latest/

Add behavior performance information to the `behavior_session_table` cache/manifest summary table #2570

Closed DowntonCrabby closed 1 year ago

DowntonCrabby commented 2 years ago

Describe the use case that is addressed by this feature.
End users may wish to select sessions based on behavioral performance. For that reason, it would be very helpful to add some of the values returned by get_performance_metrics to the behavior_session_table cache/manifest summary table.
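To illustrate the use case, here is a minimal sketch of how an end user might filter sessions once count columns are present in the table. The DataFrame is a toy stand-in with made-up values, and `select_sessions` with its thresholds is a hypothetical helper, not part of AllenSDK.

```python
import pandas as pd

# Toy stand-in for behavior_session_table; all values are made up.
behavior_session_table = pd.DataFrame(
    {
        "behavior_session_id": [1001, 1002, 1003],
        "go_trial_count": [200, 180, 220],
        "hit_trial_count": [150, 60, 198],
        "engaged_trial_count": [320, 90, 400],
    }
).set_index("behavior_session_id")


def select_sessions(table, min_hit_fraction=0.7, min_engaged_trials=100):
    """Hypothetical helper: keep sessions that pass both thresholds."""
    # Fraction of go trials that ended in a hit.
    hit_fraction = table["hit_trial_count"] / table["go_trial_count"]
    mask = (hit_fraction >= min_hit_fraction) & (
        table["engaged_trial_count"] >= min_engaged_trials
    )
    return table[mask]


selected = select_sessions(behavior_session_table)
print(selected.index.tolist())
```

The thresholds here are arbitrary; the point is only that selection becomes a plain DataFrame filter when the counts live in the summary table.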

@corbennett to weigh in on whether he'd like the same thing for VBN (though I don't think that dataset has the performance_metrics table?)

Describe the solution you'd like
We would like to include the following information in the behavior_session_table:

Additional context
For any documentation updates needed, here are the data types and column definitions:

trial_count: int, the length of the trial dataframe (including all go, catch and aborted trials)
go_trial_count: int64, number of go trials
catch_trial_count: int64, number of catch trials
hit_trial_count: int64, number of trials with a hit behavior response
miss_trial_count: int64, number of trials with a miss behavior response
false_alarm_trial_count: int64, number of trials with a false alarm behavior response
correct_reject_trial_count: int64, number of trials with a correct reject behavior response
engaged_trial_count: int64, number of trials where the mouse was engaged (reward rate > 2 rewards/min)
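As a rough illustration of these definitions, the sketch below derives each count from a per-trial table. The toy `trials` DataFrame, its boolean column names (`go`, `catch`, `aborted`, `hit`, `miss`, `false_alarm`, `correct_reject`), the `reward_rate` column, the `catch_trial_count` key and the `summarize_trials` helper are all assumptions for the example; it is not the AllenSDK implementation.

```python
import pandas as pd

# Toy per-trial table; column names and values are assumptions.
trials = pd.DataFrame(
    {
        "go": [True, True, False, False, False, True],
        "catch": [False, False, True, True, False, False],
        "aborted": [False, False, False, False, True, False],
        "hit": [True, False, False, False, False, True],
        "miss": [False, True, False, False, False, False],
        "false_alarm": [False, False, True, False, False, False],
        "correct_reject": [False, False, False, True, False, False],
        "reward_rate": [3.0, 2.5, 2.5, 1.0, 1.0, 2.1],  # rewards/min
    }
)


def summarize_trials(trials, engaged_threshold=2.0):
    """Hypothetical helper: reduce a trials table to summary counts."""
    return {
        # Every row counts, including aborted trials.
        "trial_count": len(trials),
        "go_trial_count": int(trials["go"].sum()),
        "catch_trial_count": int(trials["catch"].sum()),
        "hit_trial_count": int(trials["hit"].sum()),
        "miss_trial_count": int(trials["miss"].sum()),
        "false_alarm_trial_count": int(trials["false_alarm"].sum()),
        "correct_reject_trial_count": int(trials["correct_reject"].sum()),
        # Engaged means reward rate strictly above the threshold.
        "engaged_trial_count": int(
            (trials["reward_rate"] > engaged_threshold).sum()
        ),
    }


metrics = summarize_trials(trials)
print(metrics)
```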

DowntonCrabby commented 2 years ago

tagging @matchings

aamster commented 1 year ago

@matchings please note that adding this will require that the pkl file be read for all sessions in order to process the trial data for each session. This adds a bottleneck to producing this table and slows it down. This isn't a problem when creating this table offline for external use, but I know your team uses the from_lims api. When @mattjdavis tried this on his machine it took 2 hours to load all the data. For on-the-fly from_lims api we can skip loading this data. Would this work? Otherwise, we'll need to work with LIMS to get this stored in the LIMS database.
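One way to picture the proposed split is a table builder that takes the expensive per-session loader as an optional argument. This is only a sketch: `build_session_table`, `load_trial_counts` and `fake_pkl_loader` are hypothetical names, and the fake loader stands in for reading a session's pkl file.

```python
import pandas as pd


def build_session_table(session_ids, load_trial_counts=None):
    """Hypothetical builder: add trial counts only when a loader is given."""
    rows = []
    for session_id in session_ids:
        row = {"behavior_session_id": session_id}
        if load_trial_counts is not None:
            # Slow path: one expensive read per session.
            row.update(load_trial_counts(session_id))
        rows.append(row)
    return pd.DataFrame(rows).set_index("behavior_session_id")


def fake_pkl_loader(session_id):
    """Stand-in for reading and processing a session's pkl file."""
    return {"trial_count": 100 + session_id, "hit_trial_count": 10 * session_id}


# Fast path (on-the-fly use): no per-session reads, no count columns.
fast_table = build_session_table([1, 2, 3])

# Slow path (offline release build): counts included.
full_table = build_session_table([1, 2, 3], load_trial_counts=fake_pkl_loader)
print(full_table)
```

With this shape, the offline build for external release pays the per-session cost once, while on-the-fly callers get the same table minus the count columns.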

matchings commented 1 year ago

@aamster I am ok with skipping this data for the from_lims api. It is primarily for external users, so as long as it is in the released metadata table, I am good with that. If we need these values for internal use, we can figure out a workaround. Loading from pkl every time is way too painful to be worth it in the from_lims version.