NeurodataWithoutBorders / matnwb

A Matlab interface for reading and writing NWB files
BSD 2-Clause "Simplified" License

cell array with single value saved as scalar #338

Closed (bendichter closed this issue 2 years ago)

bendichter commented 3 years ago

When a string attribute is set to a cell array with a single element, e.g. {'spike_times'}, MatNWB saves it as a scalar string rather than a one-element array, which causes a validation error with pynwb.validate.

This bug has appeared in DynamicTable.colnames, which should be an array of strings. If only a single column is provided, this value is converted to a scalar string and causes a validation error.

This is preventing us from uploading some data generated from MatNWB to DANDI.

spikes = {[1., 2., 3.], [2., 3., 4., 5.]};

[spike_times_vector, spike_times_index] = util.create_indexed_column(spikes);

nwb = NwbFile( ...
    'session_description', 'mouse in open exploration',...
    'identifier', 'Mouse5_Day3', ...
    'session_start_time', datetime(2018, 4, 25, 2, 30, 3) ...
);

nwb.units = types.core.Units( ...
    'colnames', {'spike_times'}, ...
    'description', 'units table', ...
    'id', types.hdmf_common.ElementIdentifiers( ...
        'data', int16(0:length(spikes) - 1) ...
    ), ...
    'spike_times', spike_times_vector, ...
    'spike_times_index', spike_times_index ...
);

nwbExport(nwb, 'colnames_demo.nwb')

from pynwb import validate, NWBHDF5IO

io = NWBHDF5IO('/Users/bendichter/dev/matnwb/colnames_demo.nwb','r')
validate(io)
[Units/colnames (units.colnames): incorrect shape - expected an array of shape '[None]', got non-array data 'spike_times',
 ElementIdentifiers (units/id): incorrect type - expected 'int', got 'int8',
 Units/id (units/id): incorrect type - expected 'int', got 'int8']

@ln-vidrio do you think you could direct us to where in the code we should look to fix this?

bendichter commented 3 years ago

@cechava, we could use your help with this. Would you be able to take a stab at it?

lawrence-mbf commented 2 years ago

@bendichter So we want to sharply distinguish between "scalar" and "non-scalar" strings by whether the value is wrapped in a cell array? That is, a character array being exported should still be written as a scalar string, with symmetric behavior on read?

bendichter commented 2 years ago

@ln-vidrio yes, that's right.
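
For readers following along, the rule agreed on above can be sketched as a small model. This is not MatNWB or PyNWB code: the function names `export_shape` and `import_value` are hypothetical, a MATLAB cell array of strings is modeled as a Python list of `str`, and a MATLAB character array is modeled as a plain `str`. The shapes follow the validator's convention, where `'[None]'` means a 1-D array of any length.

```python
# Hypothetical model of the scalar vs. non-scalar string rule discussed above.
# Not part of MatNWB or PyNWB; names and types are assumptions for illustration.

def export_shape(value):
    """Return the shape a string value should be written with.

    str          (models a MATLAB char array)  -> ()      scalar string
    list of str  (models a MATLAB cell array)  -> (n,)    1-D array, even if n == 1
    """
    if isinstance(value, str):
        return ()
    if isinstance(value, list) and all(isinstance(v, str) for v in value):
        return (len(value),)
    raise TypeError("expected a str or a list of str")


def import_value(strings, shape):
    """Symmetric read: rebuild the original value from its strings and shape."""
    if shape == ():
        return strings[0]   # scalar string comes back as a plain str
    return list(strings)    # 1-D array comes back as a list, even with one element


print(export_shape(['spike_times']))  # (1,)
print(export_shape('spike_times'))    # ()
```

Under this model, a single-column `colnames` of `{'spike_times'}` keeps shape `(1,)` on export and comes back as a one-element list on read, which is what the `'[None]'` shape check expects.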