NeurodataWithoutBorders / pynwb

A Python API for working with Neurodata stored in the NWB Format
https://pynwb.readthedocs.io

validation error checking shape #1084

Open bendichter opened 4 years ago

bendichter commented 4 years ago

Description

For this file, validation throws an error, and the message does not make clear what is wrong with the file.

Steps to Reproduce

from pynwb import NWBHDF5IO, validate
validate(NWBHDF5IO('anm369962_2017-03-09_0.nwb', 'r'))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-12cadade6c79> in <module>
----> 1 validate(NWBHDF5IO('anm369962_2017-03-09_0.nwb','r'))

~/dev/hdmf/src/hdmf/utils.py in func_call(*args, **kwargs)
    451                         raise_from(ExceptionType(msg), None)
    452 
--> 453                 return func(**parsed['args'])
    454         _rtype = rtype
    455         if isinstance(rtype, type):

~/dev/pynwb/src/pynwb/__init__.py in validate(**kwargs)
    177     builder = io.read_builder()
    178     validator = ValidatorMap(io.manager.namespace_catalog.get_namespace(name=namespace))
--> 179     return validator.validate(builder)
    180 
    181 

~/dev/hdmf/src/hdmf/utils.py in func_call(*args, **kwargs)
    436                         raise_from(ExceptionType(msg), None)
    437 
--> 438                 return func(self, **parsed['args'])
    439         else:
    440             def func_call(*args, **kwargs):

~/dev/hdmf/src/hdmf/validate/validator.py in validate(self, **kwargs)
    236             raise ValueError(msg)
    237         validator = self.get_validator(dt)
--> 238         return validator.validate(builder)
    239 
    240 

~/dev/hdmf/src/hdmf/utils.py in func_call(*args, **kwargs)
    436                         raise_from(ExceptionType(msg), None)
    437 
--> 438                 return func(self, **parsed['args'])
    439         else:
    440             def func_call(*args, **kwargs):

~/dev/hdmf/src/hdmf/validate/validator.py in validate(self, **kwargs)
    474                         ret.append(MissingError(self.get_spec_loc(spec), location=self.get_builder_loc(builder)))
    475                 else:
--> 476                     ret.extend(validator.validate(sub_builder))
    477 
    478         return ret

~/dev/hdmf/src/hdmf/utils.py in func_call(*args, **kwargs)
    436                         raise_from(ExceptionType(msg), None)
    437 
--> 438                 return func(self, **parsed['args'])
    439         else:
    440             def func_call(*args, **kwargs):

~/dev/hdmf/src/hdmf/validate/validator.py in validate(self, **kwargs)
    442                                 ret.append(IllegalLinkError(self.get_spec_loc(inc_spec),
    443                                                             location=self.get_builder_loc(tmp)))
--> 444                         ret.extend(sub_val.validate(tmp))
    445                         found = True
    446             if not found and self.__include_dts[dt].required:

~/dev/hdmf/src/hdmf/utils.py in func_call(*args, **kwargs)
    436                         raise_from(ExceptionType(msg), None)
    437 
--> 438                 return func(self, **parsed['args'])
    439         else:
    440             def func_call(*args, **kwargs):

~/dev/hdmf/src/hdmf/validate/validator.py in validate(self, **kwargs)
    442                                 ret.append(IllegalLinkError(self.get_spec_loc(inc_spec),
    443                                                             location=self.get_builder_loc(tmp)))
--> 444                         ret.extend(sub_val.validate(tmp))
    445                         found = True
    446             if not found and self.__include_dts[dt].required:

~/dev/hdmf/src/hdmf/utils.py in func_call(*args, **kwargs)
    436                         raise_from(ExceptionType(msg), None)
    437 
--> 438                 return func(self, **parsed['args'])
    439         else:
    440             def func_call(*args, **kwargs):

~/dev/hdmf/src/hdmf/validate/validator.py in validate(self, **kwargs)
    442                                 ret.append(IllegalLinkError(self.get_spec_loc(inc_spec),
    443                                                             location=self.get_builder_loc(tmp)))
--> 444                         ret.extend(sub_val.validate(tmp))
    445                         found = True
    446             if not found and self.__include_dts[dt].required:

~/dev/hdmf/src/hdmf/utils.py in func_call(*args, **kwargs)
    436                         raise_from(ExceptionType(msg), None)
    437 
--> 438                 return func(self, **parsed['args'])
    439         else:
    440             def func_call(*args, **kwargs):

~/dev/hdmf/src/hdmf/validate/validator.py in validate(self, **kwargs)
    375                 ret.append(DtypeError(self.get_spec_loc(self.spec), self.spec.dtype, dtype,
    376                                       location=self.get_builder_loc(builder)))
--> 377         shape = get_shape(data)
    378         if not check_shape(self.spec.shape, shape):
    379             if shape is None:

~/dev/hdmf/src/hdmf/data_utils.py in get_shape(data)
     33         return None
     34     elif hasattr(data, '__len__') and not isinstance(data, (text_type, binary_type)):
---> 35         return __get_shape_helper(data)
     36     else:
     37         return None

~/dev/hdmf/src/hdmf/data_utils.py in __get_shape_helper(data)
     18     if hasattr(data, '__len__'):
     19         shape.append(len(data))
---> 20         if len(data) and not isinstance(data[0], (text_type, binary_type)):
     21             shape.extend(__get_shape_helper(data[0]))
     22     return tuple(shape)

~/dev/hdmf/src/hdmf/backends/hdf5/h5_utils.py in __getitem__(self, arg)
    108             return [self.io.get_container(self.dataset.file[x]) for x in ref]
    109         else:
--> 110             return self.io.get_container(self.dataset.file[ref])
    111 
    112     @property

~/dev/hdmf/src/hdmf/utils.py in func_call(*args, **kwargs)
    436                         raise_from(ExceptionType(msg), None)
    437 
--> 438                 return func(self, **parsed['args'])
    439         else:
    440             def func_call(*args, **kwargs):

~/dev/hdmf/src/hdmf/backends/hdf5/h5tools.py in get_container(self, **kwargs)
    329         if builder is None:
    330             msg = '%s:%s has not been built' % (fpath, path)
--> 331             raise ValueError(msg)
    332         container = self.manager.construct(builder)
    333         return container

ValueError: /Volumes/GoogleDrive/My Drive/Exported NWB 2.0/Economo 2018/exported_nwb2.0/anm369962_2017-03-09_0.nwb:/ has not been built

Environment

pynwb    1.1.0.post0.dev2    /Users/bendichter/dev/pynwb/src
hdmf     1.3.2.post0.dev8    /Users/bendichter/dev/hdmf/src


rly commented 4 years ago

@bendichter I got as far as figuring out that in /processing/ecephys/ the PSTH table has a column trial_id that consists of object references to the root of the file. When validating the shape of the data, the object reference pointer cannot resolve because the root of the file has not yet been built. Why are the object references pointing to the root of the file?
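The failure mode described above can be reproduced in miniature without any NWB file. The sketch below is an illustration only: `LazyRefDataset`, `naive_shape`, and `shape_without_deref` are hypothetical names, not hdmf APIs. It assumes a reference dataset whose `__getitem__` resolves the reference, and a shape helper that indexes `data[0]` as the `__get_shape_helper` frame in the traceback does.

```python
class LazyRefDataset:
    """Hypothetical stand-in for a dataset of object references.

    Indexing resolves the reference, which fails if the target is unbuilt.
    """

    def __init__(self, targets, built):
        self.targets = targets          # paths the references point to
        self.built = built              # set of paths already built
        self.shape = (len(targets),)    # shape known without dereferencing

    def __len__(self):
        return len(self.targets)

    def __getitem__(self, i):
        path = self.targets[i]
        if path not in self.built:
            raise ValueError('%s has not been built' % path)
        return path


def naive_shape(data):
    """Compute a shape recursively; note that it indexes data[0]."""
    shape = []
    if hasattr(data, '__len__'):
        shape.append(len(data))
        if len(data) and not isinstance(data[0], (str, bytes)):
            shape.extend(naive_shape(data[0]))
    return tuple(shape)


def shape_without_deref(data):
    """Assumed alternative: prefer a stored shape attribute over indexing."""
    shape = getattr(data, 'shape', None)
    if shape is not None:
        return tuple(shape)
    return naive_shape(data)


# References to the file root ('/') while nothing has been built yet.
refs = LazyRefDataset(['/', '/'], built=set())
print(shape_without_deref(refs))
```

Calling `naive_shape(refs)` here raises `ValueError`, because reading the first element triggers reference resolution. Whether hdmf should take the shape from the dataset instead is a design question this sketch does not settle.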

bendichter commented 4 years ago

Thanks for hunting that down, @rly. That's probably a mistake; I'll bring that up with DJ. But let's also make our validation program robust enough to say which dataset is causing the problem.
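One way to report the offending dataset is to catch unexpected exceptions per dataset and record them with the dataset's location instead of letting them propagate. The following is a minimal sketch under stated assumptions: `validate_all` and `check_shape_of` are hypothetical helpers rather than the hdmf validator, and the dataset paths are illustrative, loosely based on the table and column named earlier in this thread.

```python
def validate_all(datasets, check):
    """Run `check` on each (path, data) pair and collect errors by path.

    `datasets` maps a dataset path to its data; `check` is any callable
    that raises on failure. Both are assumptions made for this sketch.
    """
    errors = []
    for path, data in datasets.items():
        try:
            check(data)
        except Exception as exc:
            # Keep the location so the report names the failing dataset.
            errors.append('%s: %s: %s' % (path, type(exc).__name__, exc))
    return errors


def check_shape_of(data):
    """Hypothetical check that fails the way the traceback does."""
    if data == 'unresolvable':
        raise ValueError('/ has not been built')


datasets = {
    '/processing/ecephys/PSTH/id': [0, 1, 2],
    '/processing/ecephys/PSTH/trial_id': 'unresolvable',
}
for message in validate_all(datasets, check_shape_of):
    print(message)
```

With this approach, validation completes and the output names the failing dataset path alongside the original error message.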

dsleiter commented 3 years ago

@rly @bendichter I started looking into this issue a bit, but the linked file is either no longer available for download or not available for viewers to download.

"But let's also make our validation program robust enough to say which dataset is causing the problem."

  1. Is this still a relevant improvement to make?
  2. If so, does either of you have access to this file, so that I can reproduce the issue on the most recent pynwb and hdmf versions and verify that any implementation resolves it?