NeurodataWithoutBorders / pynwb

A Python API for working with Neurodata stored in the NWB Format
https://pynwb.readthedocs.io
Other
173 stars 83 forks source link

[Bug] `NWBHDF5IO.can_read()` fails for missing version #1919

Open h-mayorquin opened 3 weeks ago

h-mayorquin commented 3 weeks ago

As in the title, the problem is the following, NWBHDF5IO.can_read() relies on get_nwbfile_version and critically it indexes the output:

https://github.com/NeurodataWithoutBorders/pynwb/blob/90e60484415d38d3d566dc9fea63b6d99ffdbe43/src/pynwb/__init__.py#L272

However, get_nwbfile_version can also return None and then the indexing fails:

https://github.com/NeurodataWithoutBorders/pynwb/blob/90e60484415d38d3d566dc9fea63b6d99ffdbe43/src/pynwb/__init__.py#L183

It seems to me that the solution should be that if get_nwbfile_version returns None then NWBHDF5IO.can_read() should return None instead of trying to index.

From the discussion in neuroconv:

https://github.com/catalystneuro/neuroconv/issues/910

CodyCBakerPhD commented 3 weeks ago

then NWBHDF5IO.can_read() should return None instead of trying to index.

I'd say a function called can_read ought to just return -> bool, and so False in the case where no version was able to be found by conventional means

h-mayorquin commented 3 weeks ago

Yeah, I think that makes sense. The only confusing semantics is that get_nwbfile_version returns None when the version can't be found. So maybe the file could be read but there is no version....

But I guess better be strict about it?

stephprince commented 3 weeks ago

From what I understand, if the version can't be found the file is corrupt, not a valid NWBFile, or nothing has been written.

I think returning can_read as False makes sense - I can make a PR to update it so that it does not error if get_nwbfile_version returns None but returns False instead.