catalystneuro / ndx-holographic-stimulation

The extension of OptogeneticSeries to include 3D geometrical stimulation pattern
BSD 3-Clause "New" or "Revised" License

Take a more list-based approach to writing the stimulation data #11

Open CodyCBakerPhD opened 9 months ago

CodyCBakerPhD commented 9 months ago

This would supersede #4 by making it non-applicable.

cc: @bendichter @pauladkisson @weiglszonja

In the current state of #9 we represent the stimulation values following the classic NWB ogen representation: a sparsely populated matrix of shape number_of_timestamps x number_of_rois. With a larger number of stimulation times and ROIs, this results in mostly zero-padding, which is fairly inefficient.
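To make the zero-padding concrete, here is a rough NumPy sketch. The session sizes, the two-ROIs-per-timestamp targeting, and the `build_dense_matrix` helper are all hypothetical and chosen only for illustration; none of them are taken from #9.

```python
import numpy as np


def build_dense_matrix(targets_per_timestamp, number_of_rois, power=1.0):
    """Build a (number_of_timestamps x number_of_rois) matrix.

    Each row holds `power` at the ROIs targeted at that timestamp and zeros elsewhere.
    """
    number_of_timestamps = len(targets_per_timestamp)
    data = np.zeros((number_of_timestamps, number_of_rois))
    for row, roi_indices in enumerate(targets_per_timestamp):
        data[row, roi_indices] = power
    return data


# Hypothetical session: 1000 stimulation times, 50 ROIs, 2 ROIs targeted each time
targets = [[i % 50, (i + 1) % 50] for i in range(1000)]
dense = build_dense_matrix(targets, number_of_rois=50)

# Fraction of stored values that carry no information
zero_fraction = 1.0 - np.count_nonzero(dense) / dense.size
print(f"{zero_fraction:.0%} of the stored values are zero padding")
```

The fraction of padding grows as the number of ROIs grows relative to the number targeted at each time, which is the inefficiency described above.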

On a splinter branch starting from #9, I think we should try a more list-based representation of the core data fields, closer to how the sister extension does it but also extended to multiple pattern and ROI combinations.

My proposal is similar to how LabeledEvents works in ndx-events. At the end of the day, what we have is essentially a list of tuples of the following form: [(pattern_1, stimulus_time_1), (pattern_2, stimulus_time_2), ...]. Under the hood, each stimulus_time is an entry in the timestamps dataset, ordered so that the values are non-decreasing ($\geq$, so repeated times are allowed). An equal-length dataset holds the pattern indices, which index a small list of all the unique patterns spanned by the stimulation.
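As a minimal sketch of that encoding, assuming patterns can be represented by hashable labels (plain strings here), the tuples could be split into the three pieces like this. The function names `encode_stimulation` and `decode_stimulation` and the example pattern labels are hypothetical; this is not the ndx-events or extension API.

```python
import numpy as np


def encode_stimulation(events):
    """Encode [(pattern, stimulus_time), ...] as timestamps, pattern indices, and unique patterns."""
    # Sort by time; Python's sort is stable, so events with equal times keep their input order
    events = sorted(events, key=lambda event: event[1])

    unique_patterns = []   # small list of distinct patterns, in order of first appearance
    pattern_to_index = {}  # lookup from pattern to its position in unique_patterns
    pattern_indices = []   # one entry per event, same length as timestamps
    for pattern, _ in events:
        if pattern not in pattern_to_index:
            pattern_to_index[pattern] = len(unique_patterns)
            unique_patterns.append(pattern)
        pattern_indices.append(pattern_to_index[pattern])

    timestamps = np.array([time for _, time in events], dtype=float)
    return timestamps, np.array(pattern_indices, dtype=int), unique_patterns


def decode_stimulation(timestamps, pattern_indices, unique_patterns):
    """Rebuild the list of (pattern, stimulus_time) tuples from the three pieces."""
    return [(unique_patterns[index], float(time)) for index, time in zip(pattern_indices, timestamps)]


# Two patterns delivered at the same time, then each repeated later
events = [("spiral", 0.5), ("point", 0.5), ("spiral", 1.2), ("point", 2.0)]
timestamps, pattern_indices, unique_patterns = encode_stimulation(events)
round_trip = decode_stimulation(timestamps, pattern_indices, unique_patterns)
```

Storage then scales with the number of stimulation events plus the number of unique patterns, rather than with the product of timestamps and ROIs.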

I think it would be more intuitive for experimentalists, more consistent with the source data formats, and more performant for storage and access

This would mean we no longer inherit from TimeSeries: the data field would be ambiguous, since we have two main kinds of data (the repeated IDs and the unique set they index), and we would forbid rate anyway since exact timestamps are required. Nor would we continue trying to adapt OptogeneticSeries to our new shape.

The main downside is the difficulty of associating the new type with the file as a stimulus; see https://github.com/NeurodataWithoutBorders/nwb-schema/pull/559 for efforts to overturn that

CodyCBakerPhD commented 9 months ago

@alessandratrapani Note that this would still be completely compatible with the future expansion highlighted in #7, if new experiments ever request it (it is not necessary now, and possibly never will be). We would just add an extra entry to each tuple, like [(pattern_1, roi_targeted_by_stimulation_1, stimulus_time_1), (pattern_2, roi_targeted_by_stimulation_2, stimulus_time_2), ...], by adding an extra dataset for the ROI repetitions and their indexing, same as the proposed pattern setup.
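A sketch of how that extra entry might look, under the assumption that both patterns and ROIs can be represented by hashable labels. The `index_column` helper and the example labels are hypothetical and only illustrate the "extra dataset plus indexing" idea.

```python
import numpy as np


def index_column(values):
    """Return (indices, unique_values) such that unique_values[indices[k]] == values[k]."""
    unique_values = []
    lookup = {}
    indices = []
    for value in values:
        if value not in lookup:
            lookup[value] = len(unique_values)
            unique_values.append(value)
        indices.append(lookup[value])
    return np.array(indices, dtype=int), unique_values


# Tuples of (pattern, roi_targeted_by_stimulation, stimulus_time), already ordered by time
events = [
    ("pattern_a", "roi_3", 0.1),
    ("pattern_b", "roi_7", 0.1),
    ("pattern_a", "roi_7", 0.4),
]
patterns, rois, times = zip(*events)

timestamps = np.array(times, dtype=float)
pattern_indices, unique_patterns = index_column(patterns)
# The ROI column is handled exactly like the pattern column: one more index dataset
roi_indices, unique_rois = index_column(rois)
```

All three per-event datasets (timestamps, pattern_indices, roi_indices) have the same length, so adding the ROI column later would not change how the existing two are written.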