NeurodataWithoutBorders / matnwb

A Matlab interface for reading and writing NWB files
BSD 2-Clause "Simplified" License
49 stars 32 forks source link

validation requires string array for experimenter, so fails if only one experimenter provided #340

Closed emilyasterjones closed 1 year ago

emilyasterjones commented 3 years ago

The schema states general_experimenter should be a text field, but validation fails unless it's a string array. Related to this issue in pyNWB.

nwb = NwbFile(...
    'general_experimenter', 'Emily Aery Jones',...
    %other required info here
    );

leads to

[root/general/experimenter (general/experimenter): incorrect shape - expected an array of shape '[None]',
 got non-array data 'Emily Aery Jones']

A workaround:

nwb = NwbFile(...
    'general_experimenter', {'Emily Aery Jones',''},...
    %other required info here
    );
emilyasterjones commented 3 years ago

Another example: if you run the MatNWB tutorial, you get the following related errors:

[root/general/experimenter (general/experimenter): incorrect shape - expected an array of shape '[None]',
 got non-array data 'My Name',
 root/general/related_publications (general/related_publications): incorrect shape - expected an array of shape '[None]',
 got non-array data 'DOI:10.1016/j.neuron.2016.12.011']
bendichter commented 3 years ago

reproduced:

nwb = NwbFile( ...
    'session_description', 'mouse in open exploration',...
    'identifier', 'Mouse5_Day3', ...
    'session_start_time', datetime(2018, 4, 25, 2, 30, 3), ...
    'general_experimenter', 'My Name' ...
);

nwbExport(nwb, 'experimenter_test.nwb');
from pynwb import NWBHDF5IO, validate

io = NWBHDF5IO('/Users/bendichter/dev/matnwb/experimenter_test.nwb','r')

validate(io)

[root/general/experimenter (general/experimenter): incorrect shape - expected an array of shape '[None]', got non-array data 'My Name']

bendichter commented 3 years ago

related to #338

oruebel commented 3 years ago

The schema states general_experimenter should be a text field

I'm not sure if this is really an error in MatNWB itself or whether this is an error in the tutorials. According to the schema experimenter and related_publications should be arrays of strings. These fields used to be single strings but were changed to arrays of strings in version 2.1 of the schema.

emilyasterjones commented 3 years ago

Either way, it doesn't work if you only have a single entry. This:

nwb = NwbFile(...
    'general_experimenter', {'Emily Aery Jones'},...
    %other required info here
    );

still returns the error. You have to have at least 2 entries in that cell array or else it converts it to a string.

cechava commented 3 years ago

@bendichter digging into this. The issue seems to be with MATLAB's HDF5 C library. Up until the point the general_experimenter data is passed into the H5ML.hdf5lib2 MEX function, the variable remains a cell array. This occurs in line 100 of MATLAB's H5D.write function.

One potential work-around on the MatNWB side might be to wrap the cell array with another cell array when the length is 1, before passing it off to the H5D write function. I haven't yet tested this. Any other ideas?

EDIT: I tested the above idea in a quick way and it doesn't work. You still end up with a string.

cechava commented 3 years ago

I'm wondering if it might not be easiest to modify the pynwb function to accept string or arrays. Albeit, it's not the most elegant solution.

oruebel commented 3 years ago

I'm wondering if it might not be easiest to modify the pynwb function to accept string or arrays

For the case of general/experimenter I believe PyNWB does accept both string or array of strings, however, this is only because that field used to be just a string in NWB 2.0 and was changed to an array of strings in NWB 2.1, so there is extra logic for backward compatibility for this field. However, for other array of string datasets, PyNWB may not accept a scalar string on read if the schema specifies it as an array of strings. Either way, even if made sure that PyNWB can work around this issue on read, the file would still not pass validation. Because of this, I believe we should aim to fix this here.

lawrence-mbf commented 2 years ago

With the merging of #346 validation appears to not error unless I'm doing something wrong here.

This is using the following snippet for file creation:

nwb = NwbFile( ...
    'session_description', 'mouse in open exploration',...
    'identifier', 'Mouse5_Day3', ...
    'session_start_time', datetime(2018, 4, 25, 2, 30, 3), ...
    'general_experimenter', {'My Name'} ...
);

nwbExport(nwb, 'experimenter_test.nwb');

and the following to read/validate:

from pynwb import NWBHDF5IO, validate

io = NWBHDF5IO('./experimenter_test.nwb','r')

validate(io)
lawrence-mbf commented 1 year ago

Closed as probably fixed but has no confirmation.