Closed emilyasterjones closed 1 year ago
Another example: if you run the MatNWB tutorial, you get the following related errors:
[root/general/experimenter (general/experimenter): incorrect shape - expected an array of shape '[None]',
got non-array data 'My Name',
root/general/related_publications (general/related_publications): incorrect shape - expected an array of shape '[None]',
got non-array data 'DOI:10.1016/j.neuron.2016.12.011']
reproduced:
nwb = NwbFile( ...
'session_description', 'mouse in open exploration',...
'identifier', 'Mouse5_Day3', ...
'session_start_time', datetime(2018, 4, 25, 2, 30, 3), ...
'general_experimenter', 'My Name' ...
);
nwbExport(nwb, 'experimenter_test.nwb');
from pynwb import NWBHDF5IO, validate
io = NWBHDF5IO('/Users/bendichter/dev/matnwb/experimenter_test.nwb','r')
validate(io)
[root/general/experimenter (general/experimenter): incorrect shape - expected an array of shape '[None]', got non-array data 'My Name']
related to #338
The schema states general_experimenter should be a text field
I'm not sure if this is really an error in MatNWB itself or whether this is an error in the tutorials. According to the schema experimenter
and related_publications
should be arrays of strings. These fields used to be single strings but were changed to arrays of strings in version 2.1 of the schema.
/general
group: https://nwb-schema.readthedocs.io/en/latest/format.html?highlight=experimenter#groups-general Either way, it doesn't work if you only have a single entry. This:
nwb = NwbFile(...
'general_experimenter', {'Emily Aery Jones'},...
%other required info here
);
still returns the error. You have to have at least 2 entries in that cell array or else it converts it to a string.
@bendichter digging into this. The issue seems to be with MATLAB's HDF5 C library. Up until the point the general_experimenter
data is passed into the H5ML.hdf5lib2 MEX function, the variable remains a cell array. This occurs in line 100 of MATLAB's H5D.write function.
One potential work-around on the MatNWB side might be to wrap the cell array with another cell array when the length is 1, before passing it off to the H5D write function. I haven't yet tested this. Any other ideas?
EDIT: I tested the above idea in a quick way and it doesn't work. You still end up with a string.
I'm wondering if it might not be easiest to modify the pynwb function to accept string or arrays. Albeit, it's not the most elegant solution.
I'm wondering if it might not be easiest to modify the pynwb function to accept string or arrays
For the case of general/experimenter
I believe PyNWB does accept both string or array of strings, however, this is only because that field used to be just a string in NWB 2.0 and was changed to an array of strings in NWB 2.1, so there is extra logic for backward compatibility for this field. However, for other array of string datasets, PyNWB may not accept a scalar string on read if the schema specifies it as an array of strings. Either way, even if made sure that PyNWB can work around this issue on read, the file would still not pass validation. Because of this, I believe we should aim to fix this here.
With the merging of #346 validation appears to not error unless I'm doing something wrong here.
This is using the following snippet for file creation:
nwb = NwbFile( ...
'session_description', 'mouse in open exploration',...
'identifier', 'Mouse5_Day3', ...
'session_start_time', datetime(2018, 4, 25, 2, 30, 3), ...
'general_experimenter', {'My Name'} ...
);
nwbExport(nwb, 'experimenter_test.nwb');
and the following to read/validate:
from pynwb import NWBHDF5IO, validate
io = NWBHDF5IO('./experimenter_test.nwb','r')
validate(io)
Closed as probably fixed but has no confirmation.
The schema states general_experimenter should be a text field, but validation fails unless it's a string array. Related to this issue in pyNWB.
leads to
A workaround: