AllenInstitute / AllenSDK

code for reading and processing Allen Institute for Brain Science data
https://allensdk.readthedocs.io/en/latest/

explicit naming of VBN stimulus blocks #2527

Open matchings opened 2 years ago

matchings commented 2 years ago

Describe the use case that is addressed by this feature. Currently, the stimulus blocks in the Visual Behavior Neuropixels (VBN) dataset are identified only by numbers, [0,1,2,3,4,5]. This requires users to already know what each of these stimulus blocks means. It would be much better to have an explicit, intuitive naming scheme for these blocks so that users can parse the data without having to look up what the blocks mean.

Describe the solution you'd like A proposed naming scheme for the blocks could be: ['change_detection_behavior', 'gray_screen_10s', 'gabor_receptive_field_mapping', 'gray_screen_5min', 'full_field_flash', 'change_detection_passive_replay']. @corbennett should have final say on this.
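As a minimal sketch of the proposal, assuming the proposed names follow the existing block numbers 0 to 5 in the order listed (an assumption until the mapping is confirmed), the lookup could be a plain dictionary. `VBN_BLOCK_NAMES` and `name_for_block` are hypothetical names for illustration, not part of the AllenSDK.

```python
# Hypothetical mapping; assumes the proposed names are listed in block order 0-5.
VBN_BLOCK_NAMES = {
    0: "change_detection_behavior",
    1: "gray_screen_10s",
    2: "gabor_receptive_field_mapping",
    3: "gray_screen_5min",
    4: "full_field_flash",
    5: "change_detection_passive_replay",
}


def name_for_block(block: int) -> str:
    """Return the proposed name for a numeric VBN stimulus block."""
    try:
        return VBN_BLOCK_NAMES[block]
    except KeyError:
        # Fail loudly rather than silently inventing a name for an unknown block.
        raise ValueError(f"unknown VBN stimulus block: {block!r}") from None
```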

Describe alternatives you've considered An alternative could be to include metadata about the stimulus blocks as a separate attribute to provide the encoding of block number to what the stimulus and stimulus parameters are. Perhaps this should be implemented in addition to explicit naming of blocks for maximal informativeness.

Additional context Changing the stimulus_block naming would be disruptive to existing users of the VBN data, however the sooner we implement this, the fewer people we will disrupt. In any case, there should be clear notifications to users that something has changed, both in the changelog and potentially when loading the stimulus_presentations table as well.

This is a proposal for an improvement - @corbennett should be the final arbiter of whether and when this should be implemented.

matchings commented 2 years ago

This issue also now applies to VBO since the fingerprint stimulus has been added to the stimulus table (https://github.com/AllenInstitute/AllenSDK/issues/1190#event-7389764290).

Suggested values for stimulus_block for VBO are: ['initial_gray_screen_5min', 'change_detection_behavior', 'post_behavior_gray_screen_5min', 'natural_movie_one']

However, I should note that not all VBO sessions have the gray screen period in them, so these names will need to be applied dynamically depending on the session_type. Also, some sessions (session types beginning with OPHYS_2 and OPHYS_5) would be 'change_detection_passive' instead of 'change_detection_behavior'.
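A rough sketch of how the VBO names could be assembled per session, under stated assumptions: the function `vbo_block_names` is hypothetical, and the two boolean flags stand in for whatever check would actually establish that a session contains the gray screen periods or the movie. Only the OPHYS_2 / OPHYS_5 passive rule and the block names come from the comment above.

```python
from typing import List


def vbo_block_names(session_type: str,
                    has_gray_screens: bool,
                    has_movie: bool) -> List[str]:
    """Return proposed block names, in presentation order, for one VBO session.

    has_gray_screens and has_movie are assumed inputs; how to derive them
    from the session data is not decided in this thread.
    """
    # Session types beginning with OPHYS_2 or OPHYS_5 are passive viewing.
    passive = session_type.startswith(("OPHYS_2", "OPHYS_5"))
    task = "change_detection_passive" if passive else "change_detection_behavior"

    names = []
    if has_gray_screens:
        names.append("initial_gray_screen_5min")
    names.append(task)
    if has_gray_screens:
        names.append("post_behavior_gray_screen_5min")
    if has_movie:
        names.append("natural_movie_one")
    return names
```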

When this issue is being actively worked on, I / the science team can help work through what the expected stimulus_block names should be for each session_type.

matchings commented 1 year ago

@corbennett can you weigh in on my proposed stimulus block naming scheme here?

corbennett commented 1 year ago

I think the proposed block naming scheme sounds good. Obviously, we'll need to proceed cautiously here since there could be downstream dependencies and we'll be changing the data type for this column. We could potentially leave stimulus block but add a new column with the names. This would have the advantage of being backwards compatible with the current release (if a little clunky). @matchings what do you think?
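To illustrate the backwards-compatible option, here is a minimal sketch that keeps the numeric `stimulus_block` value and adds the name alongside it. It uses plain dictionaries to stand in for table rows; `add_block_names` and the `stimulus_block_name` key are illustrative assumptions, not existing SDK code.

```python
from typing import Any, Dict, List


def add_block_names(rows: List[Dict[str, Any]],
                    block_names: Dict[int, str]) -> List[Dict[str, Any]]:
    """Return copies of the rows with an extra 'stimulus_block_name' key.

    The original 'stimulus_block' value is left untouched, so code that
    relies on the numeric column keeps working.
    """
    return [
        {**row, "stimulus_block_name": block_names[row["stimulus_block"]]}
        for row in rows
    ]
```

The trade-off is the one noted above: existing users are not disrupted, at the cost of carrying two columns that encode overlapping information.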

matchings commented 1 year ago

@corbennett the recent release is so recent that I wouldn't worry about backwards compatibility. There are very few users at this point (at least compared to how much it's going to increase in the coming years). I think it's better to get it right now and make everything as straightforward and clear as possible before we get too far down the line and don't have a choice anymore.

corbennett commented 1 year ago

@matchings Ok. I agree it would be better to fix this now. Seems like in order for pika to make sense of this, we would need to provide a clear mapping of block to name for VBN and VBO. VBN is easy (and you've already done it above!). For VBO, could we write something that can infer the block from 1) the other columns of the stim table and 2) any other info already present in the BehaviorSession object?

Then I think this would be relatively straightforward.

morriscb commented 1 year ago

To be clear, my preference would be for these names to be something stored in LIMS that we could pull for the different stimulus blocks. Failing that, we would need to be able to determine the name of the session from information in the stimulus table itself or data that is associated with and stored in the BehaviorSession object. My understanding is that while VBN is pretty consistent, there isn't as clear a mapping for VBO as some stimulus blocks are not present in some sessions.

If we can't determine a clear mapping from the data available, I wouldn't be supportive of implementing the names.

morriscb commented 1 year ago

Hey folks, I've started poking around the stimulus presentations tables. For VBN, there is a column called stimulus_name. This also exists for VBO when pulling from LIMS and using the branch rc/2.16.1. Does this column as is satisfy the request on this ticket? As an example, here is the naming for VBN ecephys session 1052342277.

[Image: Screen Shot 2022-10-12 at 2 32 55 PM]

Additionally, here's an ophys session, behavior_session_id=954120560, from VBO pulled from LIMS using the rc/2.16.1 branch that has names as they currently stand for its different stimulus blocks.

[Image: Screen Shot 2022-10-12 at 2 40 42 PM]

matchings commented 1 year ago

@morriscb ah, seeing these values reminded me of issue https://github.com/AllenInstitute/AllenSDK/issues/1190 where @aamster added the "fingerprint stimulus" aka natural_movie_one to the VBO stimulus table. I am guessing maybe he added the same column to the VBN table as well?

but are the values being pulled from lims or are they hard coded in the SDK somewhere?

in any case, i think the existence of a stimulus_name column in both VBO and VBN satisfies half of this request, the other half is making sure that @corbennett and I agree on what the values of stimulus_name are, and that they are consistent and accurate across the two datasets.

For example, stimulus_block 0 is the same thing in VBO and VBN (change detection active behavior), but is named two different things in the tables you pulled above.

matchings commented 1 year ago

Actually, i am not sure if stimulus_name fully satisfies the requirement because stimulus name doesn't tell you what type of block something is.

For instance in the VBN example above, it says Natural_Images_Lum_Matched_set_ophys_G_2019 for VBN stimulus_block = 0 and stimulus_block = 5, which is technically correct because the same stimulus was shown, however what distinguishes these two blocks is that one is active behavior and one is passive viewing, which is important for the user to know. So i do think we need both stimulus_name and something that specifies the stimulus_block_type.

I am not sure how to determine the stimulus_block_type in an automated way though. I think there will always be some hard-coded assumptions, but they should be relatively limited. The main ones I can think of are: 1) the second instance of Natural_Images_Lum_Matched_set_ophys_G_2019 is always passive, and the first instance is active behavior; 2) VBO session types starting with OPHYS_2 and OPHYS_5 are always passive, otherwise it is active behavior.
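The first of these assumptions could be sketched as follows. This is a hedged illustration only: `label_change_detection_blocks` is a hypothetical helper, the input is assumed to be one stimulus name per block in presentation order, and the output labels reuse the names proposed earlier in this issue.

```python
from typing import Dict, List


def label_change_detection_blocks(block_stimulus_names: List[str],
                                  task_stimulus_name: str) -> Dict[int, str]:
    """Label blocks that show the task image set.

    Hard-coded assumption: the first block showing task_stimulus_name is
    active behavior and the second is the passive replay.
    """
    labels: Dict[int, str] = {}
    seen = 0
    for block, name in enumerate(block_stimulus_names):
        if name != task_stimulus_name:
            continue  # other block types are not handled by this rule
        seen += 1
        if seen == 1:
            labels[block] = "change_detection_behavior"
        elif seen == 2:
            labels[block] = "change_detection_passive_replay"
        else:
            # A third occurrence breaks the assumption, so refuse to guess.
            raise ValueError(f"unexpected repeat of {task_stimulus_name!r} in block {block}")
    return labels
```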

matchings commented 1 year ago

Also for the record, this also applies to behavior only sessions. They should have a stimulus_name and a stimulus_block_type. I don't think the mapping of stimulus_name to stimulus_block_type will be as difficult though, because the behavior sessions are more homogeneous.

morriscb commented 1 year ago

Hey all, I was able to add a new column to the stimulus presentations table for VBO. I've created a set of NWB files for every session type in VBO with a new column named stimulus_block_name, preserving the old column in case it's being used by folks somewhere. One thing not covered here is how to label the data marked TRAINING. Should they just follow the same mapping as the behavior session or should they have their own unique tag besides change_detection_behavior/passive? What about the TRAINING sessions with gratings listed as their stimuli? I've put a list of the TRAINING sessions for VBO below. For reference, while every other session currently has 3 blocks, the training sessions only have 1 block.

You can find the data on isilon at /allen/aibs/informatics/chris.morrison/PSB-44/behavior_session_id.nwb and here's a notebook to help with loading the files and checking them. Be sure to use the SDK branch rc/2.15.2 when reading the NWB files. Let me know what changes you would like to be made.

    "TRAINING_0_gratings_autorewards_15min",
    "TRAINING_1_gratings",
    "TRAINING_2_gratings_flashed",
    "TRAINING_3_images_A_10uL_reward",
    "TRAINING_3_images_B_10uL_reward",
    "TRAINING_3_images_G_10uL_reward",
    "TRAINING_4_images_A_handoff_lapsed",
    "TRAINING_4_images_A_handoff_ready",
    "TRAINING_4_images_A_training",
    "TRAINING_4_images_B_training",
    "TRAINING_4_images_G_training",
    "TRAINING_5_images_A_epilogue",
    "TRAINING_5_images_A_handoff_lapsed",
    "TRAINING_5_images_A_handoff_ready",
    "TRAINING_5_images_B_epilogue",
    "TRAINING_5_images_B_handoff_lapsed",
    "TRAINING_5_images_B_handoff_ready",
    "TRAINING_5_images_G_epilogue",
    "TRAINING_5_images_G_handoff_lapsed",
    "TRAINING_5_images_G_handoff_ready",

morriscb commented 1 year ago

Still waiting on clarification for this.

matchings commented 1 year ago

@morriscb sorry for the delay. If all of the training sessions only have 1 block, then change_detection_behavior is the appropriate value for stimulus_block_name.

However i thought that the training sessions that end in _epilogue have the 5 mins gray screen at beginning and end of the session plus the 5 minutes of natural movie presentation, which would need to be labeled with a unique stimulus_block_name, similar to the OPHYS sessions. Can you double check these ones to be sure?

morriscb commented 1 year ago

Thanks for the info, @matchings. I'm regenerating the stimulus tables now to double check. Will get back to ya.

morriscb commented 1 year ago

Update. You are correct, the _epilogue sessions and all sessions starting with TRAINING_5 have the behavior session, 5 minutes of gray screen after, and the natural movie, for a total of 3 blocks. This is the same as all of the sessions that start with OPHYS_. Currently the 5 minute gray screen block is not explicitly added to any of the sessions. So as it stands, TRAINING_0-4 have 1 block and all other sessions have 3 blocks.
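As a quick sanity check of that summary, something like the sketch below could be used when validating the regenerated tables. `expected_block_count` is a hypothetical helper, and it assumes the session type prefixes are exactly as in the TRAINING list above.

```python
def expected_block_count(session_type: str) -> int:
    """Return the number of stimulus blocks expected for a VBO session type.

    Assumes TRAINING_0 through TRAINING_4 have a single block and every
    other session type (TRAINING_5_*, OPHYS_*) has three.
    """
    # The trailing underscore avoids matching on a partial stage number.
    early_training = tuple(f"TRAINING_{stage}_" for stage in range(5))
    return 1 if session_type.startswith(early_training) else 3
```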