Closed: yarikoptic closed this issue 1 year ago
Ironically I cannot reproduce this.
The error that you have may indicate another path issue. Would it be possible to check that `generateCore` is simply never called before reading an NWB file? For your test case, could you also ensure that the `+types` module in the MatNWB installation directory only contains `+types/+util` and `+types/+untyped`?
The error I got speaks to a possible bug in the data generation.
It has both.

edit 1: a more correct `grep` for the same statement:

```shell
❯ grep -e '+types/+util' matnwb-TJq1ZBO-ls.txt | head -n 1
391704 4 drwxr-xr-x 3 dandi dandi 4096 Feb 10 17:36 ./matnwb/+types/+util
❯ grep -e '+types/+untyped' matnwb-TJq1ZBO-ls.txt | head -n 1
391678 4 drwxr-xr-x 3 dandi dandi 4096 Feb 10 17:36 ./matnwb/+types/+untyped
```
was it using my reproducer?
No, this is also on a Windows machine anyway. To sum up what I did:

- `+types` only contains `+types/+util` and `+types/+untyped`. This is equivalent to just cloning a new repository on `master`.
- Ran `nwb = nwbRead('<file location>');` while the working directory is the MatNWB install directory. No path modifications are made here.

> does matnwb de-reference paths and do something path related later on?
MatNWB only adds to the path using `addpath` at most. These are to use the YAML parsing JAR (using `javaaddpath`) and to use the ISO8601 external MathWorks package (using `addpath`). These do not appear to be relevant to your issue though.
Looking at the file listing, the following directories should be deleted in `./matnwb/+types`:

- `+core`
- `+hdmf_common`
- `+hdmf_experimental`
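A minimal sketch of such a cleanup in shell (the function name `clear_generated` and the checkout path are hypothetical; it assumes anything under `+types` other than the bundled `+util` and `+untyped` packages was generated):

```shell
# Remove generated class packages from a MatNWB checkout, keeping only
# the +util and +untyped packages that ship with the repository itself.
clear_generated() {
    root="$1"
    for d in "$root"/+types/+*/; do
        case "$(basename "$d")" in
            +util|+untyped) ;;      # bundled packages: keep
            *) rm -rf "$d" ;;       # generated namespaces (+core, +hdmf_*, ...): delete
        esac
    done
}

clear_generated ./matnwb
```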
are you suggesting to delete them manually or just stating that something in `generateCore` should be fixed?
FWIW, theoretically the reproducer should work even on Windows if you happen to install git-annex and datalad in an msys2 env... will try some time later to see if there are some hidden gotchas.
> are you suggesting to delete them manually...
Yes. This issue only came about because `generateCore` was called without the custom `./out` directory. I wonder if I should just add a script that clears the namespace for the user. I'd imagine it's not comfortable for people to modify their installation directories like that.
FWIW, `savedir` could perhaps default to something like `~/.cache/matnwb/{version_id_of_some_kind}`. AFAIK we have specified `savedir`, the `out` folder, only because we were instructed to do so. If it is not necessary/recommended for testing basic functionality of `nwbRead`, I would be happy to remove that from our test. With that in mind I have tried without `savedir` and got the same error as you:
```shell
(dandisets-2) dandi@drogon:/tmp/matnwb-TJq1ZBO/matnwb$ MATLABPATH=$PWD:$PWD/../out time matlab -nodesktop -batch "f='../000022/sub-744912845/sub-744912845_ses-766640955.nwb'; disp(util.getSchemaVersion(f)); nwb = nwbRead(f)"
2.2.2
Error using assert
Unexpected properties {unit}.

Your schema version may be incompatible with the file. Consider checking the
schema version of the file with `util.getSchemaVersion(filename)` and comparing
with the YAML namespace version present in nwb-schema/core/nwb.namespace.yaml

Error in types.util.checkUnset (line 13)
assert(isempty(dropped),...

Error in types.hdmf_common.VectorData (line 26)
types.util.checkUnset(obj, unique(varargin(1:2:end)));

Error in io.parseDataset (line 81)
parsed = eval([Type.typename '(kwargs{:})']);

Error in io.parseGroup (line 22)
dataset = io.parseDataset(filename, datasetInfo, fullPath, Blacklist);

Error in io.parseGroup (line 38)
subg = io.parseGroup(filename, group, Blacklist);

Error in nwbRead (line 59)
nwb = io.parseGroup(filename, h5info(filename), Blacklist);

Command exited with non-zero status 1
```
so I guess you got a different error because you called it differently.

as for this error, it seems other Windows users encounter it too: https://github.com/NeurodataWithoutBorders/matnwb/issues/492, and I got it in https://github.com/NeurodataWithoutBorders/matnwb/issues/490.
Do you see the same error and can you reproduce it if you do specify `savedir`?
> ideally users should not call any script manually
Correct, you do not need `generateCore` if you're reading files. This issue only comes out of calling `generateCore`, then using `savedir` when calling `nwbRead`. Fixing this issue requires either a script to run or manually deleting `+types/+core`, `+types/+hdmf_*`, or other extensions which were erroneously generated.
> AFAIK we have specified `savedir`, the `out` folder, only because we were instructed to do so...
That's fine. In fact, for this specific purpose of testing multiple files with a single installation, you should continue using `savedir` and point it to a directory that gets cleaned up after every read. This ensures that the installation directory is not polluted with extraneous or inaccurate files and properly emulates a fresh clone of the repository.
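A sketch of that per-read cleanup in MATLAB (the file list is a placeholder; `nwbRead`'s `'savedir'` option is as discussed in this thread):

```matlab
% Read each file with a throwaway savedir so generated classes never
% accumulate in, or shadow, the installation directory.
files = {'a.nwb', 'b.nwb'};        % placeholder list of test files
for i = 1:numel(files)
    scratch = tempname;             % fresh scratch directory per read
    mkdir(scratch);
    cleaner = onCleanup(@() rmdir(scratch, 's'));
    nwb = nwbRead(files{i}, 'savedir', scratch);
    % ... inspect nwb here ...
    clear cleaner nwb               % removes scratch before the next read
end
```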
> Correct, you do not need `generateCore` if you're reading files.
well -- the documentation doesn't say that it is "optional" -- https://github.com/NeurodataWithoutBorders/matnwb#step-2-generate-the-api pretty much states to do that. (The next step, for extensions, is listed as optional, so we do not do it.)
> That's fine. In fact, for this specific purpose of testing multiple files with a single installation, you should continue using `savedir` and point to a directory that gets cleaned up after every read.
ok, we can keep running `generateCore` + "continue using `savedir`" and ensure we clean it (the `savedir`) up after each run. But I am still confused about the

> To fix this issue requires either a script to run or to manually delete +types/+core, +types/+hdmf_*, or other extensions which are erroneously generated.

what "script to run"? shouldn't it be fixed within matnwb? Note that, as the reproducer shows, we ran for that file in a completely clean fresh installation, so there is no "pollution" of any kind.
> the documentation doesn't say that it is "optional"
Yeah, that should be clarified. You only need to run `generateCore` and `generateExtension` if you're writing files. For reading files it's unnecessary anyway.
> ok, we can keep running `generateCore` + "continue using `savedir`"
I advise not doing that, frankly, as `generateCore` is what "pollutes" the namespace with NWB schema 2.6.0, which might not be what the file was written in. That's the primary cause of the issue, since that's just how MATLAB path priority works.
Sorry, I need to complete the thought and actually offer something constructive: just don't use `generateCore` if you're going to use `nwbRead(..., 'savedir', ...)`.
Sorry for being a pain, I am just trying to arrive at some logical conclusion of how to instruct users "generally" on how to use matnwb to load or write .nwb files.
> Just don't use `generateCore` if you're going to use `nwbRead(..., 'savedir', ...)`.
That would boil down to an overall distinction of two "modes" (with or without `generateCore`) of how matnwb's `nwbRead` should be installed/used (not using/using `savedir`), correct?
Most users should be fine with just `nwbRead` without `savedir`, which works just fine with or without `generateCore` (assuming a properly embedded schema). I'd suggest only using `savedir` for advanced automation over a lot of NWB files with different schema versions (i.e. our test suite CI). I guess the fundamental question is "where do you want your class files to be?" `generateCore` and `nwbRead` both write class files, and if you're not consistent about the location you'll run into these kinds of MATLAB path issues. If you don't care, don't use `savedir`.
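In other words (a sketch; the file path and scratch directory are placeholders), the two modes look like:

```matlab
% Mode 1 (most users): plain read; no generateCore call, no savedir.
% nwbRead generates any needed class files itself.
nwb = nwbRead('myfile.nwb');

% Mode 2 (automation/CI over many schema versions): savedir only,
% without calling generateCore beforehand, so nothing in the install
% directory can shadow the classes generated from the embedded schema.
scratch = tempname; mkdir(scratch);
nwb = nwbRead('myfile.nwb', 'savedir', scratch);
rmdir(scratch, 's');               % clean up between reads
```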
In the case of your CI, I think my concern was about false negatives where the install directory doesn't really represent a clean `git clone` MatNWB installation.
> In the case of your CI, I think my concern was about false negatives where the install directory doesn't really represent a clean `git clone` MatNWB installation.
we do it clean (see here) -- remove the prior one if it exists, download the tarball of matnwb from GitHub, extract it, and do `generateCore()` -- everything as the README.md teaches us.
> I'd suggest only using `savedir` for advanced automation over a lot of NWB files with different schema versions (i.e. our test suite CI).
That is exactly us - automation over a lot of NWB files from different datasets, hence different schema versions! And how should we then take advantage of `savedir` - should we just not do `generateCore` and point `savedir` to the same location (how does it then work for different versions)?
anyways, back to the current issue: if I do not `generateCore` (or, to the same effect, just run `git clean -dfx` in an existing matnwb checkout) and do not use `savedir`:

so the file has 2.2.2, and there seems to be a 2.2.2 available within matnwb:
```shell
(dandisets-2) dandi@drogon:/tmp/matnwb-TJq1ZBO/matnwb$ ls nwb-schema/
2.0.2  2.1.0  2.2.0  2.2.1  2.2.2  2.2.3  2.2.4  2.2.5  2.3.0  2.4.0  2.5.0  2.6.0
```
also, IIRC there should be a copy of the schema within the nwb file ... how do we open/load such a file?
> so the file has 2.2.2 and there seems to be 2.2.2 available within matnwb:
This is not relevant in the case of `nwbRead` because:
> also IIRC there should be a copy of schema within nwb file ... how do we open/load such a file?
`nwbRead` actually internally calls the equivalent of `generateCore`, pointing at the embedded schema and all extensions. This is why these pathing issues can easily occur if you're using `savedir`.
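For the "copy of the schema within the nwb file" point: the cached specifications live under the `/specifications` group of the HDF5 file, which is what this internal generation step reads. A sketch of inspecting them outside MATLAB, assuming the `h5py` package is installed (the helper name is hypothetical):

```python
import h5py

def embedded_namespaces(path):
    """List namespaces (and their cached versions) stored under
    /specifications in an NWB file; {} if nothing is embedded."""
    with h5py.File(path, "r") as f:
        specs = f.get("specifications")
        if specs is None:
            return {}
        # Each namespace group holds one subgroup per cached schema version.
        return {name: sorted(specs[name].keys()) for name in specs}
```

For a file like the one above, this would be expected to show a `core` entry at 2.2.2 plus any embedded extension namespaces.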
Looking at the file again, I have to apologize: I thought I had read this file successfully before, though I had not. In fact, this error is tied to the old issue #43, where an anonymous `VectorData` defined by the `Units` table adds fields. Still thinking about fixes there, but a cheap workaround may be possible to allow ignoring this class of error.
if you think it is largely a duplicate of #43 -- feel free to close. I've subscribed to #43 as well for now.
> so the file has 2.2.2 and there seems to be 2.2.2 available within matnwb:
>
> This is not relevant in case of `nwbRead` because:

> also IIRC there should be a copy of schema within nwb file ... how do we open/load such a file?
>
> `nwbRead` actually internally calls the equivalent of `generateCore` pointing to the embedded schema and all extensions.
then the error message suggesting some incompatibility ("Your schema version may be incompatible with the file.") is even more confusing, and maybe this message should be made more specific/adequate to the different kinds of problems.
Issue is now a duplicate of #238.
What happened?

originally mentioned in https://github.com/NeurodataWithoutBorders/matnwb/issues/490#issuecomment-1422662198

in the fresh run of https://github.com/dandi/dandisets-healthstatus I saw the error in the logs

```shell
Asset: sub-744912845/sub-744912845_ses-766640955.nwb
Output:
Cannot define property 'strain' in class 'types.ndx_aibs_ecephys.EcephysSpecimen' because the property has already been defined in the superclass 'types.core.Subject'.
Error in io.parseGroup (line 85)
parsed = eval([Type.typename '(kwargs{:})']);
Error in io.parseGroup (line 38)
subg = io.parseGroup(filename, group, Blacklist);
Error in io.parseGroup (line 38)
subg = io.parseGroup(filename, group, Blacklist);
Error in nwbRead (line 59)
nwb = io.parseGroup(filename, h5info(filename), Blacklist);
```

which then reproduced against master on that sample file
Steps to Reproduce
Error Message
No response
Operating System
Linux
Matlab Version
R2022b Update 1 (9.13.0.2080170) 64-bit (glnxa64)
Code of Conduct