NeurodataWithoutBorders / nwbinspector

Tool to help inspect NWB files for compliance with NWB Best Practices
https://nwbinspector.readthedocs.io/
Other
17 stars 10 forks source link

[Bug]: check_resolution is None #198

Closed CodyCBakerPhD closed 2 years ago

CodyCBakerPhD commented 2 years ago

What happened?

2nd error occurring during ros3 inspection of 000065.

000065.txt

Steps to Reproduce

No response

Traceback

0.1.1  sub-Kibbles/sub-Kibbles_behavior+ecephys.nwb and 2 other files: 'None' object with name 'None'
         Message: Traceback (most recent call last):
  File "/home/jovyan/GitHub/nwbinspector/nwbinspector/nwbinspector.py", line 503, in run_checks
    output = check_function(nwbfile_object)
  File "/home/jovyan/GitHub/nwbinspector/nwbinspector/register_checks.py", line 127, in auto_parse_some_output
    output = check_function(*args, **kwargs)
  File "/home/jovyan/GitHub/nwbinspector/nwbinspector/checks/time_series.py", line 87, in check_resolution
    if time_series.resolution != -1.0 and time_series.resolution <= 0:
TypeError: '<=' not supported between instances of 'NoneType' and 'int'

Operating System

Linux

Python Executable

Python

Python Version

3.9

Package Versions

No response

Code of Conduct

CodyCBakerPhD commented 2 years ago

This is a weird one. While it would be possible (and perhaps our responsibility) to prevent this issue on the NWBInspector side, these are some weird TimeSeries objects to begin with. Note another issue here

0.0.0  sub-Kibbles/sub-Kibbles_behavior+ecephys.nwb and 2 other files: 'None' object with name 'None'
         Message: Traceback (most recent call last):
  File "/home/jovyan/GitHub/nwbinspector/nwbinspector/nwbinspector.py", line 503, in run_checks
    output = check_function(nwbfile_object)
  File "/home/jovyan/GitHub/nwbinspector/nwbinspector/register_checks.py", line 127, in auto_parse_some_output
    output = check_function(*args, **kwargs)
  File "/home/jovyan/GitHub/nwbinspector/nwbinspector/checks/time_series.py", line 37, in check_data_orientation
    if time_series.data is not None and any(np.array(time_series.data.shape[1:]) > time_series.data.shape[0]):
  File "/home/jovyan/my-conda-envs/nwbinspector/lib/python3.10/site-packages/pynwb/base.py", line 233, in data
    if isinstance(self.fields['data'], TimeSeries):
KeyError: 'data'

So these TimeSeries have no data (or at least it's not in the fields) and no resolution (or at least it's not accessible via my_time_series.resolution.

CodyCBakerPhD commented 2 years ago

Given how specific this issue is to however this particular file was made, I think we can safely set this as a low-priority thing to resolve on our end and just assume that a well-written NWBFile has those attributes present.

Do you agree @bendichter?

bendichter commented 2 years ago

Does this file fit the schema? Is TimeSeries.resolution optional?

https://nwb-schema.readthedocs.io/en/latest/format.html#timeseries

It looks like indeed it is. "..resolution" has "required=False". Makes me wonder why we even have default values for resolution. But this file is correct according to the schema, so we should be able to handle it.

CodyCBakerPhD commented 2 years ago

It looks like indeed it is. "..resolution" has "required=False". Makes me wonder why we even have default values for resolution. But this file is correct according to the schema, so we should be able to handle it.

Right, but if you don't specify resolution it defaults to -1.0:

from pynwb import TimeSeries

test = TimeSeries(name="test", data=[], unit="test", rate=1.0)
test.resolution == -1.0  # True

and likewise does not like it if you explicitly specify setting it to None

from pynwb import TimeSeries

test = TimeSeries(name="test", data=[], unit="test", rate=1.0, resolution=None)  # complains

Interestingly I can do

from pynwb import TimeSeries

test = TimeSeries(name="test", data=[], unit="test", rate=1.0, resolution="None")

but that wouldn't reproduce the above error.

bendichter commented 2 years ago

nwbinspector should be able to handle files that were not written using PyNWB. If it complies with the NWB schema, we should be able to inspect it without error

CodyCBakerPhD commented 2 years ago

Ohhh dang, you're saying if you don't specify resolution in a TimeSeries via MatNWB, it won't default to -1.0? I see

bendichter commented 2 years ago

I don't really understand why test = TimeSeries(name="test", data=[], unit="test", rate=1.0, resolution=None) doesn't work

CodyCBakerPhD commented 2 years ago

I don't really understand why test = TimeSeries(name="test", data=[], unit="test", rate=1.0, resolution=None) doesn't work

Does it let you do it on your end?

When I run it on my side I get

TypeError: TimeSeries.__init__: incorrect type for 'resolution' (got 'NoneType', expected 'str or float')

since a NoneType doesn't match allowed docval cases.

bendichter commented 2 years ago

yeah it looks like there is a mismatch here between PyNWB and the schema

CodyCBakerPhD commented 2 years ago

LOL ok, I figured it out.

From

with pynwb.NWBHDF5IO(
    path="https://dandiarchive.s3.amazonaws.com/blobs/da5/107/da510761-653e-4b81-a330-9cdae4838180",
    mode="r",
    load_namespaces=True,
    driver="ros3",
) as io:
    nwbfile = io.read()

go down to

nwbfile.processing["video_files"]["video"].time_series["20170203_KIB_01_s1.1.h264"]

and observe

isinstance(a["video_files"]["video"].time_series["20170203_KIB_01_s1.1.h264"], TimeSeries)
Out[1]: True

type(a["video_files"]["video"].time_series["20170203_KIB_01_s1.1.h264"])
Out[2]: abc.NwbImageSeries

a["video_files"]["video"].time_series["20170203_KIB_01_s1.1.h264"].resolution is None
Out[3]: True

Working on a fix in our code for this, but the only way I know to write a test for this is specifically on this s3 path.