AllenInstitute / AllenSDK

code for reading and processing Allen Institute for Brain Science data
https://allensdk.readthedocs.io/en/latest/

Linking the VBN stim and trials tables #2563

Closed corbennett closed 1 year ago

corbennett commented 1 year ago

Describe the use case that is addressed by this feature.

Currently, it isn't obvious to users how to link the behavioral events in the trials table to the stimulus presentations in the stim table for the VBN data release (this likely also applies to the VBO data). Many analyses involve looking at visual responses to stimuli that are associated with some category of behavioral response. This requires using the trials table to identify which trials have the relevant behavior, then figuring out which stimuli in the stim table are associated with those trials (since precise stimulus times can only be found in the stim table). This process trips up many users.

Describe the solution you'd like

I'd like to add a column to the stim table that explicitly links each stimulus row to the behavioral trial that it occurred in. We can call this column 'behavior_trial_id'. This would allow users to easily merge these two tables.

The following code demonstrates how you might do this:

import numpy as np

trials = session.trials  # get trials table
stim_presentations = session.stimulus_presentations  # get stim presentations table
trial_start_times = trials['start_time']  # start times for every behavior trial
stim_presentation_starts = stim_presentations.start_time  # start times for every stim presentation

#Assign every stim to a trial based on when it was shown
stim_trial_assignments = np.searchsorted(trial_start_times, stim_presentation_starts) - 1
stim_presentations['behavior_trial_id'] = stim_trial_assignments

#Clean up result
#First don't assign trials to stims that occurred outside the behavior
stim_presentations.loc[stim_presentations.start_time>trials.iloc[-1]['stop_time'], 'behavior_trial_id'] = np.nan

#Second copy the trial assignments from the active block to the passive block
stim_presentations.loc[stim_presentations.stimulus_block==5, 'behavior_trial_id'] = stim_presentations[stim_presentations.active]['behavior_trial_id'].values

#Now users can merge the two tables with the following line:
stim_trials = stim_presentations.merge(trials, left_on='behavior_trial_id', right_index=True, how='left')
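
The assignment step can be sanity-checked on toy data without loading a session. The following is only a minimal sketch: the tables and times are made up, `assign_trials` is a hypothetical helper, and it passes `side="right"` to `np.searchsorted` (the snippet above uses the default `side="left"`) so that a stimulus starting exactly at a trial's start time is assigned to that trial rather than the previous one.

```python
import numpy as np
import pandas as pd

# Toy tables with made-up times (not real session data).
trials = pd.DataFrame(
    {"start_time": [0.0, 10.0, 20.0], "stop_time": [9.0, 19.0, 29.0]}
)
stim = pd.DataFrame({"start_time": [0.5, 3.0, 10.0, 12.5, 21.0, 35.0]})


def assign_trials(stim_starts, trial_starts, last_stop):
    """Hypothetical helper: position of the trial each stimulus falls in, or NaN."""
    # Index of the last trial whose start time is <= the stimulus start time.
    pos = np.searchsorted(trial_starts, stim_starts, side="right") - 1
    ids = pos.astype(float)
    # Stimuli before the first trial or after the last trial get no assignment.
    ids[(pos < 0) | (stim_starts > last_stop)] = np.nan
    return ids


stim["behavior_trial_id"] = assign_trials(
    stim["start_time"].to_numpy(),
    trials["start_time"].to_numpy(),
    trials["stop_time"].iloc[-1],
)

# Both tables have a start_time column, so give the trial columns a suffix.
merged = stim.merge(
    trials,
    left_on="behavior_trial_id",
    right_index=True,
    how="left",
    suffixes=("", "_trial"),
)
```

In this toy case the last stimulus starts after the final trial's `stop_time`, so it keeps a NaN id and its trial columns are empty after the left merge.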

Describe alternatives you've considered

We currently require users to figure this merge out themselves. I think this is the only other alternative.

Do you want to work on this issue? Yep. Happy to help on this.

morriscb commented 1 year ago

Hey Corbett, looking at the trials table, the index for that table is called trials_id, whereas you use behavior_trial_id in your code snippet above. Is there any reason you do not want to use trials_id in the stim_presentations table? Having the names match would make it clearer that these two tables can be merged on this value.

morriscb commented 1 year ago

Hey Corbett, how specific is this addition to the VBN data? I was able to get it running on ecephys data, but the code above can't be applied to VBO datasets. Right now the VBO data does not have active as a column, and the number of stimulus blocks appears to be smaller (2 instead of 5 on the ophys_experiment I looked at). Is there a way we can generalize this addition, or is it going to end up needing to be VBN specific?

matchings commented 1 year ago

@morriscb I would like for this to apply to the VBO data as well. Sorry if that complicates things...

However, I believe that resolving https://github.com/AllenInstitute/AllenSDK/issues/2527 would potentially solve this, because the stimulus blocks would have explicit names (instead of ambiguous / meaningless numbers), and the name of the block that we are trying to join with the trials table should be the same between VBN and VBO (it would be change_detection_behavior according to my proposed naming scheme in #2527).

corbennett commented 1 year ago

I agree that resolving https://github.com/AllenInstitute/AllenSDK/issues/2527 will help generalize this code. To be clear, though, there will likely still be some customization needed given the differences between VBO and VBN (e.g., handling the passive replay block in VBN, which isn't part of the VBO stim table).
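
A rough sketch of what a name-based version might look like, run on toy data. Several things here are assumptions rather than SDK behavior: the `stimulus_block_name` column, the `change_detection_passive` block name, and the `link_by_block_name` helper are all hypothetical. Only `change_detection_behavior` comes from the naming proposed in #2527, and `trials_id` follows the column name suggested earlier in this thread.

```python
import numpy as np
import pandas as pd

BEHAVIOR_BLOCK = "change_detection_behavior"  # name proposed in #2527
PASSIVE_BLOCK = "change_detection_passive"    # hypothetical name for the VBN replay block


def link_by_block_name(stim, trials, name_col="stimulus_block_name"):
    """Hypothetical helper: add a trials_id column keyed on block names."""
    stim = stim.copy()
    stim["trials_id"] = np.nan

    # Assign trials only to stimuli in the behavior block.
    behavior = (stim[name_col] == BEHAVIOR_BLOCK).to_numpy()
    starts = stim.loc[behavior, "start_time"].to_numpy()
    pos = np.searchsorted(trials["start_time"].to_numpy(), starts, side="right") - 1
    valid = (pos >= 0) & (starts <= trials["stop_time"].iloc[-1])
    ids = np.full(len(starts), np.nan)
    ids[valid] = trials.index.to_numpy()[pos[valid]]
    stim.loc[behavior, "trials_id"] = ids

    # VBN only: copy the assignments to the passive replay block when it exists
    # and has the same number of presentations as the behavior block.
    passive = (stim[name_col] == PASSIVE_BLOCK).to_numpy()
    if passive.any() and passive.sum() == behavior.sum():
        stim.loc[passive, "trials_id"] = ids
    return stim


# Toy tables with made-up times and block names (not real session data).
trials = pd.DataFrame({"start_time": [0.0, 10.0], "stop_time": [9.0, 19.0]})
vbn_like = pd.DataFrame({
    "start_time": [1.0, 11.0, 30.0, 40.0],
    "stimulus_block_name": [BEHAVIOR_BLOCK, BEHAVIOR_BLOCK, PASSIVE_BLOCK, PASSIVE_BLOCK],
})
vbo_like = pd.DataFrame({
    "start_time": [1.0, 11.0, 50.0],
    "stimulus_block_name": [BEHAVIOR_BLOCK, BEHAVIOR_BLOCK, "some_other_block"],
})

vbn_linked = link_by_block_name(vbn_like, trials)
vbo_linked = link_by_block_name(vbo_like, trials)
```

With named blocks, the same function covers both table layouts: the passive-replay copy simply does nothing when no such block is present.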

morriscb commented 1 year ago

This is already part of 2.14.1. The SDK calculates it on request for the stim table.