NeurodataWithoutBorders / matnwb

A Matlab interface for reading and writing NWB files
BSD 2-Clause "Simplified" License
50 stars 32 forks source link

Dash in column name triggers error in nwbRead #198

Closed lvsltz closed 4 years ago

lvsltz commented 4 years ago

I am testing the ndx-icephys-meta extension with matlab. The basic example in the README.md generates a file that can be read with matnwb. But adding a column with a name that contains a dash triggers an error (below). An underscore is fine. I checked the other issues and it seems that dashes appeared before. If I'm not doing something wrong and this is indeed a bug, until it's fixed should I avoid using a specific set of characters?

To test, you can replace the line (A) ir_index = ... with:

ir_table = nwbfile.get_intracellular_recordings()
ir_table.add_column(name='meta-data', description='new metadata', category = 'stimuli')
d = dict()
d['meta-data'] = 2
ir_index = nwbfile.add_intracellular_recording(electrode=electrode,
                                               stimulus=stimulus,
                                               response=response, 
                                               stimulus_metadata=d)

For convenience, I am also attaching the NWB file: test_icephys_file.h5.zip

Versions: pynwb 1.3.0, hdmf 1.6.1, ndx-icephys-meta@8735ec1

Matlab error:

>> nwb=nwbRead('test_icephys_file.h5');

Error using types.ndx_icephys_meta.IntracellularStimuliTable (line 24)
Unmatched parameter name 'meta-data' must be a string scalar or character vector that can
represent a field name.

Error in io.parseGroup (line 78)
    parsed = eval([typename '(kwargs{:})']);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);

Error in nwbRead (line 33)
nwb = io.parseGroup(filename, h5info(filename), Blacklist);
lawrence-mbf commented 4 years ago

Yes, the default inputParser uses MATLAB's variable rules for defining variable names. https://www.mathworks.com/help/matlab/matlab_prog/variable-names.html

I will provide a workaround though. Give me a minute

lawrence-mbf commented 4 years ago

@lvsltz Let me know if enh/allow-dash-in-dynamic-props resolves the problem.

lvsltz commented 4 years ago

Thanks! It works indeed this way but I wonder if skipping invalid columns is the right action... instead of a warning+rename.

In any case, I am still struggling to find where the valid columns go (as I am rather new to matnwb) so perhaps I'll provide more feedback once I manage to read a real file, not just a test one. Please let me know if there is any documentation on how to read files with extensions generated with pynwb.

lawrence-mbf commented 4 years ago

The columns are actually still assigned. I've only defaulted the "invalid" columns before running parse on them.

The reason why this workaround works is because the columns will eventually be assigned to a mapping down the line (Anon and Set classes are containers.Map backed). These objects do not need to conform to MATLAB property name conventions. If the generated class doesn't actually have a relevant Anon or Set object, then it will still throw an error.

Where the columns go in a DynamicTable depends on the column's type. If it's a VectorData type, then it will go to the Dynamic Table's vectordata property. If a VectorIndex, then it will be placed in the vectorindex. colnames is the full registry of columns by name.

If the extensions were cached (https://pynwb.readthedocs.io/en/latest/tutorials/general/extensions.html#caching-extensions-to-file), then all you need to do is call nwbRead() on them. If not, the instructions for generateExtension can be found here: https://github.com/NeurodataWithoutBorders/matnwb#step-3-generate-the-api

lvsltz commented 4 years ago

I think the specs are now cached by default in pynwb. So I don't have a problem reading a file, for example the one attached in the OP. What puzzles me is that although the code is generated in +types, none of the new tables actually appear in nwb after nwbRead: general_intracellular_ephys_experimental_conditions, general_intracellular_ephys_intracellular_recordings, general_intracellular_ephys_repetitions etc.

Please let me know if I should open this issue on pynwb or a new one here. It's really a rather different topic than the original question :) @oruebel

nwb = 

  NwbFile with properties:

                                nwb_version: '2.2.2'
                                acquisition: [1×1 types.untyped.Set]
                                   analysis: [0×1 types.untyped.Set]
                           file_create_date: [1×1 types.untyped.DataStub]
                                    general: [0×1 types.untyped.Set]
                    general_data_collection: []
                            general_devices: [1×1 types.untyped.Set]
             general_experiment_description: []
                       general_experimenter: []
                general_extracellular_ephys: [0×1 types.untyped.Set]
     general_extracellular_ephys_electrodes: []
                        general_institution: []
                general_intracellular_ephys: [1×1 types.untyped.Set]
      general_intracellular_ephys_filtering: []
    general_intracellular_ephys_sweep_table: []
                           general_keywords: []
                                general_lab: []
                              general_notes: []
                       general_optogenetics: [0×1 types.untyped.Set]
                     general_optophysiology: [0×1 types.untyped.Set]
                       general_pharmacology: []
                           general_protocol: []
               general_related_publications: []
                         general_session_id: []
                             general_slices: []
                      general_source_script: []
            general_source_script_file_name: []
                           general_stimulus: []
                            general_subject: []
                            general_surgery: []
                              general_virus: []
                                 identifier: 'EXAMPLE_ID'
                                  intervals: [0×1 types.untyped.Set]
                           intervals_epochs: []
                    intervals_invalid_times: []
                           intervals_trials: []
                                 processing: [0×1 types.untyped.Set]
                                    scratch: [0×1 types.untyped.Set]
                        session_description: 'my first recording'
                         session_start_time: 2020-04-07T09:45:12.738717+02:00
                      stimulus_presentation: [1×1 types.untyped.Set]
                         stimulus_templates: [0×1 types.untyped.Set]
                  timestamps_reference_time: 2020-04-07T09:45:12.738717+02:00
                                      units: []
lawrence-mbf commented 4 years ago

My guess is that the fix did drop data somehow. I will look deeper into this.

In the meantime, since experimental_conditions, intracellular_recordings, and repetitions are not part of the regular schema, they will be under the general_intracellular_ephys property as a mapping. In most cases, if you don't see the direct property reference as you've noted, it's usually in the property with the name of the parent directory.

lawrence-mbf commented 4 years ago

I apologize I see the problem. You inherited from NWBFile using your own file representation. By default NwbFile.m is a subclass only of NWBFile which is why these properties were dropped. Let me see if I can adjust for that.

lawrence-mbf commented 4 years ago

In the meantime, the hack is to open NwbFile.m in an editor and change line 1:

classdef NwbFile < types.core.NWBFile

to


classdef NWBFile < types.ndx_icephys_meta.ICEphysFile
lawrence-mbf commented 4 years ago

Let me know if this #201 branch works for you. Basically it's a macro that just edits the file inline and provides a "secondary" NwbFile.m called ICEphysFile.m which is correctly subclassed.

oruebel commented 4 years ago

Thanks for testing things in MatNWB. I hope to be able to merge the extensions with NWB proper in the coming weeks. It will probably take 2-3weeks for this to get finished. Let me know in case there are any blocking issues with this on the MatNWB side. The extension introduces hierarchical tables (i.e., DynamicTables with DynamicTableReference columns) and AlignedDynamicTables (i.e,. a DynamicTable which contains additional tables). While these types effectively just compose existing types, I would like to make sure that this is all working in MatNWB before pushing the extension into NWB proper.

lvsltz commented 4 years ago

I confirm I can properly read the example file. Many thanks for the quick replies!

I still can't read my own data, because of other errors, but I'll open other issues with that, when I figure out what's going on.

lawrence-mbf commented 4 years ago

Feel free to do so!