NeurodataWithoutBorders / matnwb

A Matlab interface for reading and writing NWB files
BSD 2-Clause "Simplified" License
49 stars 32 forks source link

[Bug]: Unable to read an NWB file generated from OpenEphys #492

Open himahuja opened 1 year ago

himahuja commented 1 year ago

What happened?

Using NWB version 2.6.0 release to open an NWB file with schema version 2.5.0, and unable to read it because of some unidentified fields. Also, the file could be read using PyNWB and viewed in HDFViewer.

Steps to Reproduce

nwbRead('experiment1.nwb')

Error Message

Error using types.util.checkUnset (line 13)
Unexpected properties
{channel_conversion_description num_samples sync
sync_interval sync_unit}.

Your schema version may be incompatible with the
file.  Consider checking the schema version of the
file with `util.getSchemaVersion(filename)` and
comparing with the YAML namespace version present in
nwb-schema/core/nwb.namespace.yaml

Error in types.core.ElectricalSeries (line 40)
            types.util.checkUnset(obj,
            unique(varargin(1:2:end)));

Error in io.parseGroup (line 85)
    parsed = eval([Type.typename '(kwargs{:})']);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group,
    Blacklist);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group,
    Blacklist);

Error in nwbRead (line 59)
nwb = io.parseGroup(filename, h5info(filename),
Blacklist);

Operating System

Windows

Matlab Version

2020a

Code of Conduct

lawrence-mbf commented 1 year ago

There doesn't seem to be a channel_conversion_description under ElectricalSeries in either schema. Can you ensure that this is definitely the ElectricalSeries as defined in core or some other extension that is including the ElectricalSeries class?

MatNWB cannot represent classes that deviate from the schema spec without an extension class.

If you can, could you provide a file representative of these issues?

rly commented 1 year ago

@himahuja how did you generate this file?

himahuja commented 1 year ago

I checked the open ephys documentation regarding their NWB plugin, and I can find channel conversion field in their documentation.

Each spikes group is an NWB SpikeEventSeries containing the following datasets:

data: array with dimensions S spikes x N channels x M samples containing the spike waveforms. The channel_conversion attribute stores the “bitVolts” value required to convert these values into microvolts (headstage channels) or volts (ADC channels).

https://open-ephys.github.io/gui-docs/User-Manual/Recording-data/NWB-format.html

I am also allowing access for the Box file for the data here. It's 20G.

oruebel commented 1 year ago

I checked the open ephys documentation regarding their NWB plugin, and I can find channel conversion field in their documentation.

@jsiegle @medengineer this question is related to the OpenEphys NWB plugin. Do you have any thoughts or comments?

lawrence-mbf commented 1 year ago

This is interesting.

The specific error appears to be an unexpected description attribute inside the channel_conversion dataset. This is not defined in the 2.5.0 version of the schema (the only attribute should be axis).

oruebel commented 1 year ago

The OpenEphys plugin implements it's own write in C++ (i.e., it doesn't use PyNWB or MatNWB) so I think this could be some inconsistencies that may need to be addressed.

jsiegle commented 1 year ago

We've checked these files with the NWB validator and confirmed they can be read using PyNWB, but we haven't tested MatNWB yet. Looking at the source code, it's clear that we've forgotten to include the neurodata_type attribute for the channel_conversion dataset. We will add that, make sure that fixes the MatNWB loading issue, and put out a new release of the plugin.

oruebel commented 1 year ago

We've checked these files with the NWB validator and confirmed they can be read using PyNWB, but we haven't tested MatNWB yet. Looking at the source code, it's clear that we've forgotten to include the neurodata_type attribute

channel_conversion was added as a dataset in ElectricalSeries in NWB v2.2. The dataset is part of ElectricalSeries but does not have it's own type. I.e, I believe only ElectricalSeries should have the namespace and neurodata_type attribute, i.e., instead of adding neurodata_type attribute to channel_conversion you may need remove the namespace attribute from channel_conversion.

Normally, the validator should warn for extra attributes, but since namespace and neurodata_type are special, reserved attributes I think they might be ignored by the validator. This may be an issue for PyNWB validator to make extra neurodata_type and namespace attributes are being detected for objects that do not have an assigned neurodata_type in the schema. @rly what do you think?

jsiegle commented 1 year ago

We removed the extra attributes in the channel_conversion dataset, which fixed original error. But then the loading failed due to the presence of additional custom attributes.

We haven't embedded the schema in these files, will doing so fix these errors?

Alternatively, is it possible to make matnwb behave similarly to pynwb in terms of permissible file contents?

cc'ing @medengineer, who has been doing this testing.