NeurodataWithoutBorders / pynwb

A Python API for working with Neurodata stored in the NWB Format
https://pynwb.readthedocs.io

[Bug]: Unable to Construct SpikeEventSeries #1746

Open lawrence-mbf opened 1 year ago

lawrence-mbf commented 1 year ago

What happened?

Currently converting data to fit DANDI requirement through MatNWB: https://github.com/NeurodataWithoutBorders/matnwb/issues/527

The data reads fine in MatNWB but causes the NWB Inspector to throw an error.

The file itself is 1.3 GB so its download link is provided here: https://mbfbioscience-my.sharepoint.com/:u:/p/lawrence/EX0x5jQ9EelAmOXc_hm0eWQBf5GSkY6spXjBhdIUlHAeCg?e=f2g4cq

Steps to Reproduce

See linked issue for the full conversation.

Traceback

C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\spec\namespace.py:531: UserWarning: Ignoring cached namespace 'hdmf-common' version 1.5.0 because version 1.7.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\spec\namespace.py:531: UserWarning: Ignoring cached namespace 'hdmf-experimental' version 0.1.0 because version 0.4.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
Traceback (most recent call last):
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\build\objectmapper.py", line 1256, in construct
    obj = self.__new_container__(cls, builder.source, parent, builder.attributes.get(self.__spec.id_key()),
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\build\objectmapper.py", line 1269, in __new_container__
    obj.__init__(**kwargs)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\utils.py", line 644, in func_call
    return func(args[0], **pargs)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\pynwb\ecephys.py", line 125, in __init__
    raise ValueError('Must provide the same number of timestamps and spike events')
ValueError: Must provide the same number of timestamps and spike events

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\Scripts\nwbinspector.exe\__main__.py", line 7, in <module>
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\nwbinspector\nwbinspector.py", line 271, in inspect_all_cli
    messages = list(
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\nwbinspector\nwbinspector.py", line 403, in inspect_all
    nwbfile = robust_s3_read(io.read)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\nwbinspector\utils.py", line 173, in robust_s3_read
    raise exc
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\nwbinspector\utils.py", line 168, in robust_s3_read
    return command(*command_args, **command_kwargs)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\utils.py", line 644, in func_call
    return func(args[0], **pargs)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\pynwb\__init__.py", line 304, in read
    file = super().read(**kwargs)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\backends\hdf5\h5tools.py", line 477, in read
    return super().read(**kwargs)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\utils.py", line 644, in func_call
    return func(args[0], **pargs)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\backends\io.py", line 60, in read
    container = self.__manager.construct(f_builder)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\utils.py", line 644, in func_call
    return func(args[0], **pargs)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\build\manager.py", line 284, in construct
    result = self.__type_map.construct(builder, self, None)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\utils.py", line 644, in func_call
    return func(args[0], **pargs)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\build\manager.py", line 795, in construct
    return obj_mapper.construct(builder, build_manager, parent)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\utils.py", line 644, in func_call
    return func(args[0], **pargs)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\build\objectmapper.py", line 1226, in construct
    subspecs = self.__get_subspec_values(builder, self.spec, manager)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\build\objectmapper.py", line 1155, in __get_subspec_values
    self.__get_sub_builders(groups, spec.groups, manager, ret)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\build\objectmapper.py", line 1206, in __get_sub_builders
    ret.update(self.__get_subspec_values(sub_builder, subspec, manager))
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\build\objectmapper.py", line 1155, in __get_subspec_values
    self.__get_sub_builders(groups, spec.groups, manager, ret)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\build\objectmapper.py", line 1198, in __get_sub_builders
    sub_builder = self.__flatten(sub_builder, subspec, manager)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\build\objectmapper.py", line 1211, in __flatten
    tmp = [manager.construct(b) for b in sub_builder]
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\build\objectmapper.py", line 1211, in <listcomp>
    tmp = [manager.construct(b) for b in sub_builder]
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\utils.py", line 644, in func_call
    return func(args[0], **pargs)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\build\manager.py", line 280, in construct
    result = self.__type_map.construct(builder, self, parent)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\utils.py", line 644, in func_call
    return func(args[0], **pargs)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\build\manager.py", line 795, in construct
    return obj_mapper.construct(builder, build_manager, parent)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\utils.py", line 644, in func_call
    return func(args[0], **pargs)
  File "C:\Users\lawrence\AppData\Local\miniconda3\envs\nwb-inspector\lib\site-packages\hdmf\build\objectmapper.py", line 1260, in construct
    raise ConstructError(builder, msg) from ex
hdmf.build.errors.ConstructError: (root/analysis/unit1SS GroupBuilder {'attributes': {'comments': 'cellType: PCSimpleSpikes', 'description': 'single unit1SS with cell type information and approximate depth', 'namespace': 'core', 'neurodata_type': 'SpikeEventSeries', 'object_id': 'f2d8979b-5db4-484a-8cf6-cff98b1f1ecf'}, 'groups': {}, 'datasets': {'control': root/analysis/unit1SS/control DatasetBuilder {'attributes': {}, 'data': <Closed HDF5 dataset>}, 'control_description': root/analysis/unit1SS/control_description DatasetBuilder {'attributes': {}, 'data': <StrDataset for Closed HDF5 dataset>}, 'data': root/analysis/unit1SS/data DatasetBuilder {'attributes': {'conversion': 1.0, 'offset': 0.0, 'resolution': -1.0, 'unit': 'volts'}, 'data': <Closed HDF5 dataset>}, 'electrodes': root/analysis/unit1SS/electrodes DatasetBuilder {'attributes': {'description': 'Electrodes involved with these spike events', 'namespace': 'hdmf-common', 'neurodata_type': 'DynamicTableRegion', 'object_id': '925752e2-dfdd-4187-86a5-fdbae4eee2f7', 'table': root/general/extracellular_ephys/electrodes GroupBuilder {'attributes': {'colnames': array(['x', 'y', 'z', 'imp', 'location', 'filtering', 'group',
       'group_name'], dtype=object), 'description': 'Electrodes', 'namespace': 'hdmf-common', 'neurodata_type': 'DynamicTable', 'object_id': '17481734-d416-4c94-bea5-7aca144b0753'}, 'groups': {}, 'datasets': {'filtering': root/general/extracellular_ephys/electrodes/filtering DatasetBuilder {'attributes': {'description': 'description of hardware filtering', 'namespace': 'hdmf-common', 'neurodata_type': 'VectorData', 'object_id': '17bfbc8c-fc2b-4c25-a914-1f0b984fdb71'}, 'data': <StrDataset for Closed HDF5 dataset>}, 'group': root/general/extracellular_ephys/electrodes/group DatasetBuilder {'attributes': {'description': 'a reference to the ElectrodeGroup this electrode is a part of', 'namespace': 'hdmf-common', 'neurodata_type': 'VectorData', 'object_id': 'b8cfba55-042f-4b4d-a0cb-bf0907fc22ec'}, 'data': <hdmf.backends.hdf5.h5_utils.BuilderH5ReferenceDataset object at 0x000002687FF8A3B0>}, 'group_name': root/general/extracellular_ephys/electrodes/group_name DatasetBuilder {'attributes': {'description': 'the name of the ElectrodeGroup this electrode is a part of', 'namespace': 'hdmf-common', 'neurodata_type': 'VectorData', 'object_id': '127c63f2-84ee-408f-8e9f-fa0b89251ff6'}, 'data': <StrDataset for Closed HDF5 dataset>}, 'id': root/general/extracellular_ephys/electrodes/id DatasetBuilder {'attributes': {'namespace': 'hdmf-common', 'neurodata_type': 'ElementIdentifiers', 'object_id': 'b535ca2d-853b-4c10-a204-809647262a66'}, 'data': <Closed HDF5 dataset>}, 'imp': root/general/extracellular_ephys/electrodes/imp DatasetBuilder {'attributes': {'description': 'the impedance of the channel', 'namespace': 'hdmf-common', 'neurodata_type': 'VectorData', 'object_id': '8274ab50-00e6-4051-86c0-b1f6f433f342'}, 'data': <Closed HDF5 dataset>}, 'location': root/general/extracellular_ephys/electrodes/location DatasetBuilder {'attributes': {'description': 'the location of channel within the subject e.g. 
brain region', 'namespace': 'hdmf-common', 'neurodata_type': 'VectorData', 'object_id': 'b443f320-d04f-435a-8a1d-f2fa415e0745'}, 'data': <StrDataset for Closed HDF5 dataset>}, 'x': root/general/extracellular_ephys/electrodes/x DatasetBuilder {'attributes': {'description': 'the x coordinate of the channel location', 'namespace': 'hdmf-common', 'neurodata_type': 'VectorData', 'object_id': '09ce6507-54d3-4225-87da-47210292b911'}, 'data': <Closed HDF5 dataset>}, 'y': root/general/extracellular_ephys/electrodes/y DatasetBuilder {'attributes': {'description': 'the y coordinate of the channel location', 'namespace': 'hdmf-common', 'neurodata_type': 'VectorData', 'object_id': 'f0df90f3-b78d-4aa4-af00-13924088691c'}, 'data': <Closed HDF5 dataset>}, 'z': root/general/extracellular_ephys/electrodes/z DatasetBuilder {'attributes': {'description': 'the z coordinate of the channel location', 'namespace': 'hdmf-common', 'neurodata_type': 'VectorData', 'object_id': '323e2385-4252-4ebf-a48b-48e621e43239'}, 'data': <Closed HDF5 dataset>}}, 'links': {}}}, 'data': <Closed HDF5 dataset>}, 'timestamps': root/analysis/unit1SS/timestamps DatasetBuilder {'attributes': {'interval': 1, 'unit': 'second'}, 'data': <Closed HDF5 dataset>}}, 'links': {}}, 'Could not construct SpikeEventSeries object due to: Must provide the same number of timestamps and spike events')

Operating System

Windows

Python Executable

Conda

Python Version

3.10

Package Versions

issue_environment.txt

lawrence-mbf commented 1 year ago

Note that if the spike event data is transposed on write, NWB Inspector will not throw an error but will emit this critical message:

UserWarning: SpikeEventSeries 'unit1SS': The second dimension of data does not match the length of electrodes, but instead the first does. Data is oriented incorrectly and should be transposed.

oruebel commented 1 year ago

@lawrence-mbf could you clarify how this file is being created? From the traceback, it looks like PyNWB cannot open the file because it fails the error check in SpikeEventSeries.__init__ here: https://github.com/NeurodataWithoutBorders/pynwb/blob/f0b3317b1dcd87bc89671145105db2141f23f65d/src/pynwb/ecephys.py#L122-L125
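For readers following along, the check referenced above amounts to comparing the first-axis length of data with the length of timestamps. The following is a minimal sketch of that idea and not the PyNWB source: `check_spike_event_lengths` is a hypothetical helper, the nested lists stand in for the HDF5 datasets, and the real constructor may apply the check under additional conditions.

```python
def check_spike_event_lengths(data, timestamps):
    """Raise ValueError when the first axis of data and timestamps differ.

    Hypothetical stand-in for the length check in SpikeEventSeries.__init__.
    """
    if len(data) != len(timestamps):
        raise ValueError('Must provide the same number of timestamps and spike events')


# Three spike events with four waveform samples each: the first axis is events.
events_first = [[0.1, 0.2, 0.3, 0.4] for _ in range(3)]
timestamps = [0.5, 1.0, 1.5]
check_spike_event_lengths(events_first, timestamps)  # passes silently

# The same values with the axes swapped: the first axis is now samples.
samples_first = [list(column) for column in zip(*events_first)]
try:
    check_spike_event_lengths(samples_first, timestamps)
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```

With made-up values like these, the swapped layout fails only because the first axis no longer counts events, which is one way a dimension-order difference between writer and reader could surface as this error.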

lawrence-mbf commented 1 year ago

This was constructed by a MatNWB script. However, it doesn't appear to satisfy the nwbinspector whichever way the data is transposed, despite the nwbinspector instructions.

I looked through the generated data manually but there doesn't seem to be anything that jumps out as wrong.

CodyCBakerPhD commented 1 year ago

However, it doesn't appear to satisfy the nwbinspector whichever way the data is transposed, despite the nwbinspector instructions.

I downloaded the file myself to check it out; I can confirm that the electrodes seem to be the second dimension of those fields in the /analysis group of the file.
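The orientation warning quoted earlier in the thread describes a comparison between the shape of data and the number of referenced electrodes. Below is a rough sketch of that comparison, assuming 2-D data. `orientation_hint`, its return values, and the example shapes are made up for illustration and are not taken from the file or from PyNWB.

```python
def orientation_hint(data_shape, n_electrodes):
    """Classify 2-D data by which axis matches the electrode count.

    Hypothetical helper: returns "ok" when the second axis matches,
    "transpose" when only the first axis matches, and "mismatch" otherwise.
    Data that is not 2-D is not checked and is reported as "ok".
    """
    if len(data_shape) != 2 or data_shape[1] == n_electrodes:
        return "ok"
    if data_shape[0] == n_electrodes:
        return "transpose"
    return "mismatch"


# Illustrative shapes only.
hints = {
    "second axis matches": orientation_hint((1000, 4), 4),
    "first axis matches": orientation_hint((4, 1000), 4),
    "neither matches": orientation_hint((1000, 32), 4),
}
```

Note that a check like this looks only at the electrode axis, while the constructor error compares the first axis against timestamps, so a file can satisfy one and still fail the other.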

Just to clarify, it's NWBHDF5IO.read() that is complaining. This can also be checked with the more fundamental python -m pynwb.validate BAYLORNL23_20190320-2.6.0.2.newWT.nwb, which gives

TimeSeriesReferenceVectorData (intervals/trials/timeseries): incorrect type - expected '['int32', 'int32', 'object']', got '['object', 'int64', 'int64']'
VectorIndex/description (intervals/trials/timeseries_index.description): incorrect type - expected 'text', got 'Empty'
VectorIndex/description (units/spike_times_index.description): incorrect type - expected 'text', got 'Empty'
Units/waveforms (units/waveforms): incorrect type - expected 'numeric', got 'object'
Units/waveforms (units/waveforms): incorrect shape - expected '[None, None]', got '(9,)'
VectorData/description (units/trials_index.description): incorrect type - expected 'text', got 'Empty'

Hope that extra info helps solve the problem.

oruebel commented 1 year ago

Thanks @CodyCBakerPhD. In terms of read, I suspect that the bad data type for TimeSeriesReferenceVectorData and the units/waveforms are likely the biggest issues.
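One way to read the TimeSeriesReferenceVectorData message is to line up the expected and reported field types position by position. The sketch below does only that, using the two lists copied from the validator output above. `diff_fields` is a hypothetical helper, and the conclusion that the fields were written in a different order is an assumption, not something the validator states.

```python
# Field types copied from the validator message in this thread.
expected = ['int32', 'int32', 'object']
got = ['object', 'int64', 'int64']


def diff_fields(expected_types, actual_types):
    """Return (position, expected, actual) for every position that differs."""
    return [
        (i, e, a)
        for i, (e, a) in enumerate(zip(expected_types, actual_types))
        if e != a
    ]


def kinds(types):
    """Strip the bit width so 'int32' and 'int64' both become 'int'."""
    return sorted(t.rstrip('0123456789') for t in types)


mismatches = diff_fields(expected, got)
same_kinds_different_order = kinds(expected) == kinds(got)
```

Here every position differs, yet both lists contain two integers and one object, which would be consistent with both a field-order difference and an integer-width difference in the written compound type.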

lawrence-mbf commented 1 year ago

Is the solution then to attempt to fix these other problems and assume the data is correct? We're primarily focused on the /analysis datasets currently but these other issues do crop up on the nwbinspector as well.

Below is a link to the NWB file with non-transposed datasets:

https://mbfbioscience-my.sharepoint.com/:u:/p/lawrence/Ed-AyKzKz49GhEGly4jbRIcB8TJ1RDNfU0RIjxOx1Y21bw?e=0pxroI

This file emits the warnings as mentioned in https://github.com/NeurodataWithoutBorders/pynwb/issues/1746#issuecomment-1656093127

CodyCBakerPhD commented 1 year ago

The .new.nwb (non-transposed) file has the following validation errors:

python -m pynwb.validate BAYLORNL23_20190320-2.6.0.2.new.nwb

> Validating BAYLORNL23_20190320-2.6.0.2.new.nwb against cached namespace information using namespace 'core'.
 - found the following errors:
TimeSeriesReferenceVectorData (intervals/trials/timeseries): incorrect type - expected '['int32', 'int32', 'object']', got '['object', 'int64', 'int64']'
VectorIndex/description (intervals/trials/timeseries_index.description): incorrect type - expected 'text', got 'Empty'
VectorIndex/description (units/spike_times_index.description): incorrect type - expected 'text', got 'Empty'
Units/waveforms (units/waveforms): incorrect type - expected 'numeric', got 'object'
Units/waveforms (units/waveforms): incorrect shape - expected '[None, None]', got '(9,)'
VectorData/description (units/trials_index.description): incorrect type - expected 'text', got 'Empty'

The validator doesn't seem to check for transposition. Once the file is valid, we can try simply reading it with PyNWB; it is that read step that triggers the constructor error or warning.