Add specification for an improved vesion of ndx-patterned_ogen

alessandratrapani commented 9 months ago

This PR is for an improved version of #1 to include some requests from Adesnik group:

Store the ROIs that have been targeted in targeted_rois, a field of OptogeneticStimulusTarget (Hologram has been replaced by OptogeneticStimulusTarget), for a roi_table_region referring to a PlaneSegmentation for targeted rois.
Remove from TemporalFocusing, SpiralScanning and OptogeneticStimulusPattern, all parameters referring to temporal pattern (num_sweeps,time_per_sweep,inter_sweep_interval). In this way they refer just to the geometry of the stimulus pattern. num_sweeps,time_per_sweep,inter_sweep_interval have been replaced with frequency and pulse_width additional columns in PatternedOptogeneticStimulusTable.
power,frequency and pulse_width can be defined either as scalar or as arrays. If power,frequency and pulse_width are defined as a scalar, it is assumed that all the ROIs defined in targets receive the same stimulus power,frequency and pulse_width. However, we can also define power,frequency and pulse_width as 1D arrays of dimension equal to the number of ROIs in targets, so we can define different power,frequency and pulse_width for each target.
- [x] create a .py for defining the classes
- [x] update ndx-patterned-ogen.extensions.yaml
- [x] implement testing
- [x] add validation for power,frequency and pulse_width dimension (follow-up PR: https://github.com/catalystneuro/ndx-patterned-ogen/pull/5)
- [x] add tutorial
- [x] add schema diagram
- [x] add set and get functions (follow-up PR: https://github.com/catalystneuro/ndx-patterned-ogen/pull/7)
- [x] change quantitative attributes to dataset and add unit as an attribute of the dataset (follow-up PR: https://github.com/catalystneuro/ndx-patterned-ogen/pull/6)
- [x] add documentation (follow-up PR: https://github.com/catalystneuro/ndx-patterned-ogen/pull/8)

alessandratrapani commented 9 months ago

So to summarize, I'm suggesting that OptogeneticStimulusTarget have a required field targeted_rois, which stores the actual targeted centroids, and an optional field segmented_rois, which is a DynamicTableRegion of a PlaneSegmentation that represents a matched post hoc segmentation.

Thank you for the suggestion. I have changed that accordingly!

My only concern here is what happens if the holographic stimulus has an irregular frequency or variable pulse width?

We can define a time series that describes the power signal and reference that in the table, as we discussed in the case the power varies during the stimulus onset. However, this is an extra feature we will implement in a future release if needed.

pauladkisson commented 9 months ago

We can define a time series that describes the power signal and reference that in the table, as we discussed in the case the power varies during the stimulus onset. However, this is an extra feature we will implement in a future release if needed.

Ok, seems like a reasonable compromise. Maybe we should mention it in the docs somewhere so that future users/devs will know about this decision.

alessandratrapani commented 9 months ago

Also looks like the Ruff style checks are failing (mostly unused imports).

Ok fix some of them but the ones in src/pynwb/ndx_patterned_ogen/init.py I cannot remove.

@pauladkisson if it's all good, I am going to merge this PR and open another one for the documentation

CodyCBakerPhD commented 9 months ago

So to summarize, I'm suggesting that OptogeneticStimulusTarget have a required field targeted_rois, which stores the actual targeted centroids, and an optional field segmented_rois, which is a DynamicTableRegion of a PlaneSegmentation that represents a matched post hoc segmentation.

Thank you for the suggestion. I have changed that accordingly!

This sounds like a good idea in the absence of full masks for region targeted; I would however recommend double checking if the approach sounds good with both Pinto and Histed (i.e., if they have ready access to these centroids as opposed to the target masks)

In particular, just reminding of the original proposed approach...

The original intent here was to have two PlaneSegmentation tables; one for post-hoc ROI (possibly cell) identification, by software such as Suite2P; the other for targeted ROIs. Additional columns on both tables can indicate if the ROI is a cell, and the two tables can be harmonized with the use of a global_roi_id field that matches ROI IDs from one table to the other.

This approach still seems the most generalizable to me, in that if someone does know more specific details (beyond just the centroid), they are able to attach them to the 'target' table.

I will point out that even if a full mask is not known (as in the case of Adesnik manual centroids), this is equivalent to a pixel_mask of length 1.

We can define a time series that describes the power signal and reference that in the table, as we discussed in the case the power varies during the stimulus onset. However, this is an extra feature we will implement in a future release if needed.

I would add a brief section about this to the README, with instructions to raise an issue for a future user to request this feature

CodyCBakerPhD commented 9 months ago

There was inconsistency in the doc fields of the spec; some were full sentences, others weren't. I adjusted all to be full sentences with punctuation.

I also fixed the ruff issues

I will point out that the current version of this extension (heavily reliant on table structures) would benefit quite a bit from an old feature request for HDMF; the concept of unit-valued columns (working title 'MeasurementData': https://github.com/hdmf-dev/hdmf/pull/688). Basically, all units at the moment are defined in the schema docs instead of explicitly as attributes on the object.

CodyCBakerPhD commented 9 months ago

Thanks @alessandratrapani, this is looking like a really strong extension!

Most of my suggestions at this point are nit-picky organizations for readability on the spec file... let me know if you just want me to go in and make those modifications myself (if you agree with them) otherwise, I made a couple suggestions that can be accepted via the batch commit functionality on GitHub

My last final major consideration of this design relates to the duck typing attitude towards volumetric vs. planar representations

Fields such as the generic sweep_size/sweep_mask and the size of the SpatialLightModulator can be either 2D or 3D; following the logic of the upcoming ndx-microscopy approach, it is more error-prone (and more complicated to program validation checks for) to have a single object that have variable dimension.

For re-use it also requires more understanding and education for users to know to look for, expect, and program cases and exceptions for (example: the first file a user finds with this extension has this field be 2D, then finds another future file where the same field is 3D; code built to meta-analyze structures of the first file would fail unexpectedly). It is more explicit to instead have two separate types, one of which is explicitly volumetric and only takes 3D, the other being explicitly 2D.

add validation for power,frequency and pulse_width dimension (maybe in a follow-up PR)

Definitely going to need this; could you get it started as a splinter branch using this PR as a base? No doubt it will also require some Inspector checks for files produced via MatNWB instead

pauladkisson commented 9 months ago

The original intent here was to have two PlaneSegmentation tables; one for post-hoc ROI (possibly cell) identification, by software such as Suite2P; the other for targeted ROIs. Additional columns on both tables can indicate if the ROI is a cell, and the two tables can be harmonized with the use of a global_roi_id field that matches ROI IDs from one table to the other.

I agree with this approach. It seems very likely that not everyone will target ROIs based on centroids. My main point from earlier is that post hoc segmentation-based ROIs cannot be the primary representation. Two PlaneSegmentation tables seems like a good generalizable approach.

alessandratrapani commented 9 months ago

This sounds like a good idea in the absence of full masks for region targeted; I would however recommend double checking if the approach sounds good with both Pinto and Histed (i.e., if they have ready access to these centroids as opposed to the target masks)

Yes, they do have centroids for targeted rois.

I will point out that even if a full mask is not known (as in the case of Adesnik manual centroids), this is equivalent to a pixel_mask of length 1.

Thank you so much for pointing this out! I didn't think about it. Then, I will define the targeted_rois as a DynamicTableRegion as well, pointing to a PlaneSegmentation for targeted ROIs. I will add in the description that they can be defined with a one-pixel mask. I do agree that having the global_roi_id field that matches ROI IDs from one table to the other, is the most consistent way.

I would add a brief section about this to the README, with instructions to raise an issue for a future user to request this feature

Ok, will do. Thanks for the suggestion!

alessandratrapani commented 9 months ago

My last final major consideration of this design relates to the duck typing attitude towards volumetric vs. planar representations

Fields such as the generic sweep_size/sweep_mask and the size of the SpatialLightModulator can be either 2D or 3D; following the logic of the upcoming ndx-microscopy approach, it is more error-prone (and more complicated to program validation checks for) to have a single object that have variable dimension.

Ok, then would it be ok to replace SpatialLightModulator with 2 different devices, one called 2DSpatialLightModulator and the other 3DSpatialLightModulator. Same for the OptogeneticStimulusPattern --> 2DOptogeneticStimulusPattern and 3DOptogeneticStimulusPattern?

alessandratrapani commented 9 months ago

Definitely going to need this; could you get it started as a splinter branch using this PR as a base? No doubt it will also require some Inspector checks for files produced via MatNWB instead

Sure

alessandratrapani commented 9 months ago

Thank you so much, @CodyCBakerPhD, for all of these corrections. I appreciate it a lot!

CodyCBakerPhD commented 9 months ago

Ok, then would it be ok to replace SpatialLightModulator with 2 different devices, one called 2DSpatialLightModulator and the other 3DSpatialLightModulator. Same for the OptogeneticStimulusPattern --> 2DOptogeneticStimulusPattern and 3DOptogeneticStimulusPattern?

Unfortunately, class names in Python can't start with numerical (only alphabetical) characters 😆 otherwise, that's the idea

Either

(i) SpatialLightModulator2D / SpatialLightModulator3D

or for more consistency with what we expect from ndx-microscopy

(ii) PlanarSpatialLightModulator (or anything similar for referring to a 2D plane instead of a 3D volume) / VolumetricSpatialLightModulator

catalystneuro / ndx-patterned-ogen

Add specification for an improved vesion of ndx-patterned_ogen #2