NeurodataWithoutBorders / matnwb

A Matlab interface for reading and writing NWB files
BSD 2-Clause "Simplified" License

Embed schema in NWB file #231

Closed bahanonu closed 4 years ago

bahanonu commented 4 years ago

Is there a way to embed the NWB schema within the NWB file as it is saved, so that the file can be opened more seamlessly with newer versions? nwbRead contains the following lines, which indicate that it can read embedded specifications and generate the appropriate MATLAB classes: https://github.com/NeurodataWithoutBorders/matnwb/blob/master/nwbRead.m#L39-L47. I am particularly interested in using this for ophys raw and processed data (using the MatNWB API), though I assume any solution should work generally.

For example, I have a file created with NWB 2.0.2 (checked with util.getSchemaVersion), but I am running NWB 2.2.5. If I try to read the file:

nwb = nwbRead('path_to_file.nwb');

I get the following error:

Error using types.util.checkUnset (line 13)
Unexpected properties {help}.

Your schema version may be incompatible with the file.  Consider checking the
schema version of the file with `util.getSchemaVersion(filename)` and comparing
with the YAML namespace version present in nwb-schema/core/nwb.namespace.yaml
Error in types.core.ImageSeries (line 38)
            types.util.checkUnset(obj, unique(varargin(1:2:end)));
Error in io.parseGroup (line 85)
    parsed = eval([Type.typename '(kwargs{:})']);
Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);
Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);
Error in nwbRead (line 33)
nwb = io.parseGroup(filename, h5info(filename), Blacklist); 

I would like this to be seamless, so that future users do not have to generate or find the schema that was used to create a file before reading it.

This appears similar to the issue at https://github.com/NeurodataWithoutBorders/matnwb/issues/224.

lawrence-mbf commented 4 years ago

Can you please clarify what files you are reading to run into this error?

The error you're seeing most likely comes from a file that does not have an embedded specification. It could also be a MATLAB path conflict in which you are picking up previously generated classes.

util.getSchemaVersion only reads the nwb_version string; it does not check whether an embedded spec actually exists in the file.

bahanonu commented 4 years ago

Thanks, but my question was specifically about embedding the specification/schema; the file is just an example to motivate why. Where can we tell the MatNWB API to embed the specification before calling nwbExport?

re: the example file: @bendichter Can you give @ln-vidrio the source of Sue_2x_3000_40_-46.nwb and Sue_2x_3000_40_-46_CNMF_estimates.nwb files you sent me?

This is not a path issue: I only point to a single matnwb directory and normally clear and reset paths to avoid that problem. I assumed that nwb_version should match the embedded specification (if present), hence my use of util.getSchemaVersion. These files have neither the .specloc attribute that nwbRead looks for nor a /specifications group, in contrast to, say, the Steinmetz 2019 NWB files (https://figshare.com/articles/steinmetz/11274968).

lawrence-mbf commented 4 years ago

Specification embedding should be automatic in MatNWB, so if you used MatNWB to create these NWB files then this may be a bug, and I would like to see either the source or the files. The NwbFile's export() function embeds a specification automatically and does not require a separate API call: https://github.com/NeurodataWithoutBorders/matnwb/blob/5f203ad664d4068182975554eaaefd6cc94ba19b/NwbFile.m#L57

nwb_version predates the embedded spec feature so it does not necessarily indicate that an embedded schema exists. A file without .specloc is a file without an embedded spec and there is no currently supported method for customizing embedded spec locations.
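
A check for an embedded spec can be sketched with MATLAB's built-in h5info alone. This is a minimal sketch, not part of MatNWB: the function name hasEmbeddedSpec is hypothetical, and it assumes that a root-level .specloc attribute is what marks an embedded spec (the attribute nwbRead looks for). The presence of a /specifications group is only printed as extra information.

```matlab
function tf = hasEmbeddedSpec(filename)
% hasEmbeddedSpec  Hypothetical helper, not part of MatNWB.
% Returns true if the root group of the HDF5 file has a '.specloc' attribute.

info = h5info(filename);  % metadata for the root group '/'

% Collect root attribute names, guarding against a file with none.
attrNames = {};
if ~isempty(info.Attributes)
    attrNames = {info.Attributes.Name};
end
tf = any(strcmp(attrNames, '.specloc'));

% Collect top-level group names (h5info reports them as full paths).
groupNames = {};
if ~isempty(info.Groups)
    groupNames = {info.Groups.Name};
end
hasSpecGroup = any(strcmp(groupNames, '/specifications'));

fprintf('.specloc attribute: %d, /specifications group: %d\n', tf, hasSpecGroup);
end
```

Calling hasEmbeddedSpec('path_to_file.nwb') before nwbRead would show whether the file can supply its own schema or whether matching classes must already be on the MATLAB path.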

If the source (either an older version of MatNWB or an older version of pynwb) did not embed a specification then the only way to add that back in is to rewrite the file with a version of MatNWB that does. Note that this has no bearing on the NWB schema version that is used for embedding.
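
Rewriting a file so that it gains an embedded spec might look roughly like the following. This is a sketch under stated assumptions, not a verified procedure: it assumes the generated classes currently on the MATLAB path match the schema version of the old file so that nwbRead succeeds, and the output file name is hypothetical.

```matlab
% Assumption: the MATLAB path holds generated classes that match the schema
% version of the old file; otherwise nwbRead fails as in the trace above.
inFile  = 'path_to_file.nwb';
outFile = 'path_to_file_with_spec.nwb';  % hypothetical output name

nwb = nwbRead(inFile);    % load the file without an embedded spec
nwbExport(nwb, outFile);  % export embeds the spec of the classes in use
```

Writing to a new file rather than overwriting the original keeps the source data intact if the export fails partway.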

bahanonu commented 4 years ago

@ln-vidrio Thanks for the additional explanations. See pull request at https://github.com/NeurodataWithoutBorders/matnwb/pull/232 that deals with a potential bug with specification embedding.