AllenInstitute / OpenScopeNWB


Idiosyncrasies and Problems with 2P Files on DANDI #132

Open rcpeene opened 1 year ago

rcpeene commented 1 year ago

Running my notebook "visualizing 2p responses to stimulus," I've run into a few problems with the analysis. They are probably workable, but these files are incompatible with the code written for the dendritic coupling NWB files.

ValueError                                Traceback (most recent call last)
Cell In[26], line 3
      1 stim_table = nwb.intervals["trials"]
      2 print(stim_table.colnames)
----> 3 stim_table[:]

File c:\Users\carter.peene\AppData\Local\Programs\Python\Python39\lib\site-packages\hdmf\common\table.py:854, in DynamicTable.__getitem__(self, key)
    853 def __getitem__(self, key):
--> 854     ret = self.get(key)
    855     if ret is None:
    856         raise KeyError(key)

File c:\Users\carter.peene\AppData\Local\Programs\Python\Python39\lib\site-packages\hdmf\common\table.py:912, in DynamicTable.get(self, key, default, df, index, **kwargs)
    908         return default
    909 else:
    910     # index by int, list, np.ndarray, or slice -->
    911     # return pandas Dataframe or lists consisting of one or more rows
--> 912     sel = self.__get_selection_as_dict(key, df, index, **kwargs)
    913     if df:
    914         # reformat objects to fit into a pandas DataFrame
    915         if np.isscalar(key):

File c:\Users\carter.peene\AppData\Local\Programs\Python\Python39\lib\site-packages\hdmf\common\table.py:963, in DynamicTable.__get_selection_as_dict(self, arg, df, index, exclude, **kwargs)
    961         raise IndexError(msg) from ve
...
File h5py\_objects.pyx:55, in h5py._objects.with_phil.wrapper()

File h5py\h5d.pyx:299, in h5py.h5d.DatasetID.get_space()

ValueError: Invalid dataset identifier (invalid dataset identifier)
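
For reference, a minimal sketch of the access pattern that hits this (the local file path is hypothetical; in the notebook, the full-table slice on the second-to-last line is what raises the ValueError above):

```python
# Sketch of the failing access pattern; the local file path is hypothetical.
from pynwb import NWBHDF5IO

with NWBHDF5IO("sub-XXX_ses-XXX.nwb", mode="r", load_namespaces=True) as io:
    nwb = io.read()
    stim_table = nwb.intervals["trials"]
    print(stim_table.colnames)
    df = stim_table[:]  # converts the trials TimeIntervals to a pandas DataFrame
    print(df.head())
```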
rcpeene commented 1 year ago

Worth noting, these don't appear to be a problem for the Differential encoding dataset.

jeromelecoq commented 1 year ago

Check the stimulus table instead of the trials table.

Key issue: maybe we need an update to the file.

jeromelecoq commented 1 year ago

This is relevant here:

stimulus | Group | Data pushed into the system (eg, video stimulus, sound, voltage, etc) and secondary representations of that data (eg, measurements of something used as a stimulus). This group should be made read-only after experiment complete and timestamps are corrected to common timebase. Stores both presented stimuli and stimulus templates, the latter in case the same stimulus is presented multiple times, or is pulled from an external stimulus library. Stimuli are here defined as any signal that is pushed into the system as part of the experiment (eg, sound, video, voltage, etc). Many different experiments can use the same stimuli, and stimuli can be re-used during an experiment. The stimulus group is organized so that one version of template stimuli can be stored and these be used multiple times. These templates can exist in the present file or can be linked to a remote library file. Name: stimulus

intervals | Group | Experimental intervals, whether that be logically distinct sub-experiments having a particular scientific goal, trials (see trials subgroup) during an experiment, or epochs (see epochs subgroup) deriving from analysis of data. Quantity: 0 or 1. Name: intervals
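
A minimal sketch of how those two groups surface in pynwb (hypothetical local path; the key names inside each group vary per dandiset):

```python
# Enumerate what a file exposes under /stimulus and /intervals.
from pynwb import NWBHDF5IO

with NWBHDF5IO("sub-XXX_ses-XXX.nwb", mode="r", load_namespaces=True) as io:
    nwb = io.read()
    print("presented stimuli: ", list(nwb.stimulus.keys()))          # /stimulus/presentation
    print("stimulus templates:", list(nwb.stimulus_template.keys())) # /stimulus/templates
    print("interval tables:   ", list(nwb.intervals.keys()))         # e.g. 'trials', 'epochs'
```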

rcpeene commented 1 year ago

Current summary of the files (a sketch of the resulting access paths follows the last list):

differential encoding
 - dff
    - uses "dff" key
 - missing projection images
 - lab_meta_data field is empty
 - intervals: trials table is not thorough; maybe it should include a column linking to the stimulus object?
 - stim names are much too long and unclear
    - stim time series doesn't include any epoch/stim info, just timestamps and indices
 - no eye tracking data
 - running data is stored unusually, with a path like nwb.processing["behavior"].data_interfaces["BehavioralTimeSeries"].time_series["running_velocity"]

periodic stimulation
 - dff
    - uses "DfOverF" key
    - uses "RoiResponseSeries" as the response series key instead of "traces"
    - ~can't access data! (invalid dataset identifier, closed HDF5 dataset)~
 - missing projection images
 - lab_meta_data field is empty
 - intervals: trials table is perhaps not thorough?
    - not displayable as a table for some reason
    - only has three columns (start_time, stop_time, stimulus); perhaps a more descriptive name for "stimulus" and more columns?
    - can't seem to access the data with indexing (invalid dataset identifier)
 - no stimulus object
 - no eye tracking data
 - running data is stored unusually, with a path like nwb.processing["behavior"].data_interfaces["BehavioralTimeSeries"].time_series["running_velocity"]

stimulus evoked differentiation
 - dff
    - uses "DfOverF" key
    - uses "imaging_plane_1" as the roi_response_series key instead of "traces"
    - can't access data! (invalid dataset identifier, closed HDF5 dataset)
       - there is an additional timestamp element (one more than dff); axes are swapped?
 - missing projection images
 - lab_meta_data field is empty
 - no stim table at all (not even "trials")
 - stimulus object and keys are good
 - no eye tracking data
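
To make the differences concrete, here is a sketch of the access paths implied by the summary above, assuming an already-read `nwb` object. Only the quoted key names and the running-velocity path come from the summary; the processing module name ("ophys") and the exact nesting are my assumptions and may differ in the actual files.

```python
# Assumes `nwb` is an already-read NWBFile; "ophys" module name and nesting are assumptions.

# differential encoding: dff interface keyed "dff" (response series presumably "traces")
dff_de = nwb.processing["ophys"].data_interfaces["dff"]

# periodic stimulation: dff keyed "DfOverF", response series named "RoiResponseSeries"
dff_ps = (nwb.processing["ophys"].data_interfaces["DfOverF"]
          .roi_response_series["RoiResponseSeries"])

# stimulus evoked differentiation: dff keyed "DfOverF", response series named "imaging_plane_1"
dff_sed = (nwb.processing["ophys"].data_interfaces["DfOverF"]
           .roi_response_series["imaging_plane_1"])

# running data (path taken verbatim from the summary above)
running = (nwb.processing["behavior"].data_interfaces["BehavioralTimeSeries"]
           .time_series["running_velocity"])
```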
rcpeene commented 1 year ago

The "closed HDF5 Dataset" is not an erroneous property of Jeromes files, but a bug with PyNWB and HDMF versions described here. Will discuss solutions to it. For now, reverting my HDMF version to 3.4

jeromelecoq commented 1 year ago

@rcpeene will adapt the response notebook to the differentiation dataset.

rcpeene commented 1 year ago

Problem discovered with dandiset 000036.
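
For reference, a sketch of listing the assets in that dandiset with the DANDI Python client, to identify which NWB files need re-checking (standard client usage as far as I know; the dandiset ID comes from the comment above):

```python
# List the assets in dandiset 000036 to see which NWB files need re-checking.
from dandi.dandiapi import DandiAPIClient

with DandiAPIClient() as client:
    dandiset = client.get_dandiset("000036")
    for asset in dandiset.get_assets():
        print(asset.path)
```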