NeurodataWithoutBorders / nwb-schema

Data format specification schema for the NWB neurophysiology data format
http://nwb-schema.readthedocs.io
Other
52 stars 16 forks source link

unreleased versions of nwb-schema #420

Open bendichter opened 4 years ago

bendichter commented 4 years ago

I am trying to read this file using matnwb. It does not work with the current schema, due to backward-incompatible changes to the schema since the file was written.

>> nwbRead('/Users/bendichter/Downloads/EC9_B53.nwb')
Error using types.util.checkUnset (line 13)
Properties {help} are not valid property names.

Error in types.core.DynamicTableRegion (line 26)
            types.util.checkUnset(obj,
            unique(varargin(1:2:end)));

Error in io.parseDataset (line 76)
    parsed = eval([typename '(kwargs{:})']);

Error in io.parseGroup (line 22)
    dataset = io.parseDataset(filename, datasetInfo,
    fullPath, Blacklist);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);

Error in io.parseGroup (line 38)
    subg = io.parseGroup(filename, group, Blacklist);

Error in nwbRead (line 33)
nwb = io.parseGroup(filename, h5info(filename),
Blacklist);

I would like to try reading the file with the schema that was used to generate the file. The file records that it was written using nwb version 2.0.2, but there is no 2.0.2 release on the nwb-schema release page. Without that, it's hard for me to find the commit that corresponds to that version number.

rly commented 4 years ago

Do you know whether the file was written with PyNWB or MatNWB? And which part of the file says that it is version 2.0.2? The version in nwb.namespace.yaml, the version in nwb.file.yaml, or both? There were some unfortunate inconsistencies with file versioning last year before version 2.1.0.

There is a brief mention of 2.0.2 here: https://nwb-schema.readthedocs.io/en/latest/format_release_notes.html#june-2019

This was when we updated the schema first in pynwb then migrated it to nwb-schema. In PyNWB, the version was updated from 2.0.1/2.0b to 2.0.2 here: https://github.com/NeurodataWithoutBorders/pynwb/commit/d2c835cf62f164d4aaf2a0662e6507a0a13faa62

In PyNWB, the version in nwb.namespace.yaml was updated from 2.0.2 to 2.1.0 here: https://github.com/NeurodataWithoutBorders/pynwb/commit/2c2831f318a98ffd1bf8989ec35897360c73b09c but I had forgotten to update the version in nwb.file.yaml.

The schema was copied from PyNWB to nwb-schema, and the version in nwb.namespace.yaml was updated from 2.0.1 to 2.1.0 and the version in nwb.file.yaml was updated from 2.0b to 2.0.2 here: https://github.com/NeurodataWithoutBorders/nwb-schema/commit/dba02ff7521fab07e0f2f1c8356eada8d6b52c20

Immediately afterward, nwb-schema became a git submodule of PyNWB and the version in nwb.file.yaml was fixed to be 2.1.0 here: https://github.com/NeurodataWithoutBorders/nwb-schema/commit/b8a3821b8d28aacc82735acc01812441ff3df0e3

So I suspect that the file was generated using the dev branch of PyNWB sometime between June 11 and August 20, 2019...

oruebel commented 4 years ago

Ryan, thanks for chasing this down. This seems to raise the question whether users should be allowed to create production files from the dev branch. I.e., maybe the schema version on the dev branch should always have some form of stamp to indicate that this is a development version not a stable version.

rly commented 4 years ago

@oruebel That would help with debugging but won't stop users from creating production files from the dev branch. I don't know about the particular situation regarding the file that @bendichter has, but one concern is that naive users clone the pynwb repo and create production files from there as opposed to using the pip release, or developers accidentally create production files from the dev branch. One thing we could do is to warn whenever the version of pynwb used to create a file does not match x.y.z. That warning could get frustrating for users/developers though.

oruebel commented 4 years ago

That warning could get frustrating for users/developers though.

One compromise would be to have that warning be generated once on the first import of PyNWB. I don't think that should be too annoying and should hopefully also not mess with tests since the import happens outside of the test cases.

bendichter commented 4 years ago

/nwb_version is the official place for the nwb version, and that is the place I am referring to. The schema is not cached, so neither nwb.namespace.yaml nor nwb.file.yaml were present in the file. If the schema were cached, I would not need to rely on a released schema in the first place and would generate the classes from the cached schema. I'm not sure how the location present relates to the schema version locations in the schema.

Thanks for the links. I'll try reading with those states of the schema and see if I can make progress.

I think Oliver is referring to always using a released version of nwb-schema on the pynwb dev branch which I agree with and which I believe we already do (aside from this instance we are discussing).

rly commented 4 years ago

Ah, right, the schema was not cached. The /nwb_version attribute comes from nwb.file.yaml (it is a fixed attribute in the definition of NWBFile). So, the file must have been generated using the dev branch of PyNWB sometime between June 11 and August 20, 2019.

I agree - we should always have a released version of nwb-schema on the dev branch. So, yes, then a warning would not be necessary.