NeurodataWithoutBorders / matnwb

A Matlab interface for reading and writing NWB files
BSD 2-Clause "Simplified" License

[Bug]: File generated with MATNWB 2.4.0 issue #603

Open SPMMER opened 1 month ago

SPMMER commented 1 month ago

What happened?

We created our NWB files with MatNWB 2.4.0 and are now trying to validate them with NWBInspector. It throws the following error:

Steps to Reproduce

(NWBinspector) PS C:\Users\Stefan\PycharmProjects\NWBinspector> nwbinspector C:\Users\Stefan\Desktop\NWBData_Test\
C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\spec\namespace.py:535: UserWarning: Ignoring cached namespace 'hdmf-common' version 1.5.0 because version 1.8.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\spec\namespace.py:535: UserWarning: Ignoring cached namespace 'core' version 2.4.0 because version 2.7.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\spec\namespace.py:535: UserWarning: Ignoring cached namespace 'hdmf-experimental' version 0.1.0 because version 0.5.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
Traceback (most recent call last):
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\Scripts\nwbinspector.exe\__main__.py", line 7, in <module>
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\nwbinspector\_inspection_cli.py", line 121, in _inspect_all_cli
    messages = list(
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\nwbinspector\_inspection.py", line 139, in inspect_all
    with pynwb.NWBHDF5IO(path=nwbfile_path, mode="r", load_namespaces=True, driver=driver) as io:
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\utils.py", line 668, in func_call
    return func(args[0], **pargs)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\pynwb\__init__.py", line 309, in __init__
    super().load_namespaces(tm, path, file=file_obj, driver=driver, aws_region=aws_region)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\utils.py", line 668, in func_call
    return func(args[0], **pargs)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\backends\hdf5\h5tools.py", line 188, in load_namespaces
    return cls.__load_namespaces(namespace_catalog, namespaces, open_file_obj)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\backends\hdf5\h5tools.py", line 222, in __load_namespaces
    d.update(namespace_catalog.load_namespaces(cls.__ns_spec_path, reader=reader))
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\utils.py", line 668, in func_call
    return func(args[0], **pargs)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\build\manager.py", line 482, in load_namespaces
    deps = self.__ns_catalog.load_namespaces(**kwargs)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\utils.py", line 668, in func_call
    return func(args[0], **pargs)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\spec\namespace.py", line 541, in load_namespaces
    ret[ns['name']] = self.__load_namespace(ns, reader, resolve=resolve)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\spec\namespace.py", line 447, in __load_namespace
    self.__load_spec_file(reader, s['source'], catalog, types_to_load=types_to_load, resolve=resolve)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\spec\namespace.py", line 396, in __load_spec_file
    temp_dict = {k: None for k in __reg_spec(self.__dataset_spec_cls, spec_dict)}
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\spec\namespace.py", line 387, in __reg_spec
    spec_obj = spec_cls.build_spec(spec_dict)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\spec\spec.py", line 102, in build_spec
    return cls(**kwargs)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\utils.py", line 667, in func_call
    pargs = _check_args(args, kwargs)
  File "C:\Users\Stefan\.conda\envs\NWBinspector\lib\site-packages\hdmf\utils.py", line 660, in _check_args
    raise ExceptionType(msg)
TypeError: NWBDatasetSpec.__init__: missing argument 'doc'

Operating System

Windows

Python Executable

Conda

Python Version

3.11

Package Versions

No response

rly commented 1 month ago

@SPMMER Does your file use any NWB extensions? It looks like there is an issue with one of the cached specs. Would you be able to share the file with us to investigate?

Also, what version of nwbinspector are you using? You can run pip show nwbinspector to get that.

SPMMER commented 1 month ago

No extensions that I know of. The files were created with the NWB 2.4.0 schema, using the generateCore('2.4.0') command.

The NWBInspector version is 0.5.2.

I shared the file with @mavaylon1 so maybe he can upload that.

stephprince commented 1 month ago

@SPMMER I have the USB flash drive with the data so I can take a closer look after SFN.

stephprince commented 1 month ago

@SPMMER Could you provide some additional details about how these files were generated?

I took a closer look at the file you shared, and it looks like there is an additional cached spec in the specifications group (/specifications/mss) that is causing the error you described. The mss.multishapes schema is used in unit tests in matnwb, but it is not clear to me why it is being included in this file.
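For anyone who wants to check a cached spec for the same problem, here is a minimal sketch. It assumes the cached spec has already been read from the file and parsed from JSON into a Python dict, and that specs are nested under "groups", "datasets" and "attributes" keys. The function name and the example spec are hypothetical and are not the actual contents of the mss schema.

```python
def find_specs_missing_doc(spec, path="root"):
    """Return the paths of nested specs that have no 'doc' key.

    Assumption: `spec` is a dict parsed from a cached spec JSON string,
    with child specs listed under 'groups', 'datasets' and 'attributes'.
    """
    missing = []
    for kind in ("groups", "datasets", "attributes"):
        for i, child in enumerate(spec.get(kind, [])):
            # Use the type name or plain name as a label, else the list index.
            label = child.get("neurodata_type_def") or child.get("name") or str(i)
            child_path = f"{path}/{kind}[{label}]"
            if "doc" not in child:
                missing.append(child_path)
            # Child specs can contain further nested specs.
            missing.extend(find_specs_missing_doc(child, child_path))
    return missing


# Hypothetical spec used only for illustration.
example_spec = {
    "datasets": [
        {"neurodata_type_def": "A", "doc": "ok", "attributes": [{"name": "unit"}]},
        {"neurodata_type_def": "B"},
    ]
}

print(find_specs_missing_doc(example_spec))
```

Any path reported by this check points at a spec that would likely trigger the same "missing argument 'doc'" TypeError when hdmf builds the spec object.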

After removing this cached schema, there are still a couple of separate errors when reading with pynwb for validation (e.g. the length of the row ids in the general/intracellular_ephys/intracellular_recordings/ and general/intracellular_ephys/intracellular_recordings/protocol_type groups does not match the length of their corresponding values).
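The row id mismatch can be illustrated with a small sketch. It assumes a table is represented as a mapping with an "id" entry and one sequence per column, and that ragged columns have a companion "<name>_index" column. The function name and the example values are hypothetical and are not taken from the shared file.

```python
def column_length_mismatches(table):
    """Return {column_name: length} for columns whose length differs from 'id'.

    Assumption: `table` maps column names to sequences and contains an 'id'
    entry. A column with a companion '<name>_index' is treated as ragged data
    and skipped, because only its index is expected to have one entry per row.
    """
    n_rows = len(table["id"])
    mismatches = {}
    for name, values in table.items():
        if name == "id":
            continue
        if f"{name}_index" in table:
            continue  # ragged data column; the index column is checked instead
        if len(values) != n_rows:
            mismatches[name] = len(values)
    return mismatches


# Hypothetical table: three row ids, but 'protocol_type' only has two values.
example_table = {
    "id": [0, 1, 2],
    "protocol_type": ["a", "b"],
    "spikes": [1, 2, 3, 4, 5],
    "spikes_index": [2, 4, 5],
}

print(column_length_mismatches(example_table))
```

An empty result means every regular column has exactly one value per row id.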

It would be helpful to know how these files were created so we can figure out where to address these issues.

SPMMER commented 1 month ago

@stephprince Thank you for looking into that further.

It is not clear to me why this schema would be included.

The files were generated with a custom Matlab script that filters our experiment data for a certain subset of recordings and then builds the nwb file based on the documentation provided in the tutorial here (https://neurodatawithoutborders.github.io/matnwb/tutorials/html/intro.html). We are converting the data from cfs files (CED Signal) and DAT files (HEKA Patchmaster).

ehennestad commented 1 month ago

There is an issue with matnwb where all cached namespaces are included in the file, independent of whether types from that namespace are used or not.

That would explain why the mss is included.
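A rough way to spot such namespaces is to compare the names cached under /specifications with the "namespace" attributes found on objects in the file. The sketch below assumes that typed objects carry a "namespace" attribute and that h5py is installed for the file-reading part. The function names are hypothetical, and the example values only mirror the namespace names mentioned in this thread.

```python
def normalize(value):
    """Return an attribute value as str (attributes may be stored as bytes)."""
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return str(value)


def collect_used_namespaces(attribute_sets):
    """Collect the 'namespace' attribute from a list of attribute mappings."""
    used = set()
    for attrs in attribute_sets:
        value = attrs.get("namespace")
        if value is not None:
            used.add(normalize(value))
    return used


def unused_cached_namespaces(cached, used):
    """Return cached namespace names that no object refers to directly."""
    # Caveat: a namespace can still be needed as a dependency of another one,
    # so "unused" here does not by itself mean it is safe to remove.
    return sorted(set(cached) - set(used))


def inspect_file(path):
    """Read attributes with h5py (third-party) and report unused namespaces."""
    import h5py

    attribute_sets = []
    with h5py.File(path, "r") as f:
        attribute_sets.append(dict(f.attrs))  # the root group is not visited below
        f.visititems(lambda name, obj: attribute_sets.append(dict(obj.attrs)))
        cached = list(f["specifications"].keys()) if "specifications" in f else []
    return unused_cached_namespaces(cached, collect_used_namespaces(attribute_sets))


# Illustrative values only.
example_cached = ["core", "hdmf-common", "hdmf-experimental", "mss"]
example_attrs = [
    {"namespace": "core", "neurodata_type": "NWBFile"},
    {"namespace": b"hdmf-common"},
    {},
]

print(unused_cached_namespaces(example_cached, collect_used_namespaces(example_attrs)))
```

The comparison is kept separate from the file reading so it can be tried on plain dicts first. Calling inspect_file on a real file is left to the reader.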

stephprince commented 1 month ago

Thanks @ehennestad for the info!

If I understand correctly, it looks like most of the errors I ran into when trying to validate the file in pynwb were fixed in later versions of MatNWB:

@SPMMER is it possible for you to regenerate the files with a newer version of MatNWB and the newer tutorials? I hacked together a way to modify the original file you shared, but it would probably be better to use an updated version of the MatNWB package and tutorial.

As a summary, after I modified the file to address the issues listed below, I was able to successfully run the NWBInspector on the file:

Since I believe any further issues will be addressed on the MatNWB side, I propose we transfer this issue to that repo.