NeurodataWithoutBorders / pynwb

A Python API for working with Neurodata stored in the NWB Format
https://pynwb.readthedocs.io

[Bug]: How to use `set_dataio`? #1640

Open CodyCBakerPhD opened 1 year ago

CodyCBakerPhD commented 1 year ago

What happened?

I am trying to use the `set_dataio` method on a `VectorData` object to compress data that has been added by row.

Using the trials table as a simple example, the code under Steps to Reproduce shows my failed attempts, which I based on https://github.com/hdmf-dev/hdmf/blob/2a42e4194853b856fe8b13a121d326b3d3d91de6/tests/unit/utils_test/test_core_DataIO.py#L38

How should this be done?

Steps to Reproduce

import numpy as np
from pynwb.testing.mock.file import mock_NWBFile
from pynwb import H5DataIO

nwbfile = mock_NWBFile()

for start_time, stop_time in zip([1.1, 2.2, 3.3], [1.7, 2.4, 5.6]):
    nwbfile.add_trial(start_time=start_time, stop_time=stop_time)

# Gives "InvalidDataIOError: Cannot get attribute 'dtype' of data. Data is not valid."
# nwbfile.trials.start_time.set_dataio(H5DataIO())

# Gives "ValueError: Must specify 'dtype' and 'shape' if not specifying 'data'"
# nwbfile.trials.start_time.set_dataio(H5DataIO(dtype=np.float64))

# Gives "ValueError: Setting data when dtype and shape are not None is not supported"
# nwbfile.trials.start_time.set_dataio(H5DataIO(dtype=np.float64, shape=(3,)))

Operating System

Windows

Python Executable

Python

Python Version

3.9

oruebel commented 1 year ago

Looking at the code for set_dataio and DataIO, I think the two have diverged a bit: set_dataio expects a DataIO object without data and then sets the data, whereas DataIO now checks that either data or dtype/shape is set. For this to work, I think we'll need to update DataIO to have an option to allow deferred setting of data. Currently, I think you would need to do:

nwbfile.trials.start_time.set_dataio(H5DataIO(data=nwbfile.trials.start_time.data))

which kind of defeats the purpose of set_dataio.

https://github.com/hdmf-dev/hdmf/blob/615538a0c32f7a871ef4abc91e0e8f732a1cf488/src/hdmf/container.py#L531-L538

https://github.com/hdmf-dev/hdmf/blob/615538a0c32f7a871ef4abc91e0e8f732a1cf488/src/hdmf/data_utils.py#L966-L978
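To make the mismatch described above concrete, here is a minimal toy model in plain Python. It is not the hdmf source: `ToyDataIO`, `ToyColumn`, and `try_set_dataio` are hypothetical stand-ins written only for illustration, and they model just the two checks whose messages appear in the report (the `InvalidDataIOError` case from `H5DataIO()` is not modeled). Assuming the real classes behave roughly this way, the sketch shows why an empty wrapper cannot get through: either the constructor rejects it, or the data setter does.

```python
class ToyDataIO:
    """Hypothetical, simplified stand-in for a DataIO-style wrapper (not the hdmf class)."""

    def __init__(self, data=None, dtype=None, shape=None):
        # Check 1: a wrapper created without data must declare both dtype and shape.
        if data is None and (dtype is None or shape is None):
            raise ValueError("Must specify 'dtype' and 'shape' if not specifying 'data'")
        self._data = data
        self.dtype = dtype
        self.shape = shape

    @property
    def data(self):
        return self._data

    @data.setter
    def data(self, value):
        # Check 2: data cannot be attached once dtype or shape has been declared.
        if self.dtype is not None or self.shape is not None:
            raise ValueError("Setting data when dtype and shape are not None is not supported")
        self._data = value


class ToyColumn:
    """Hypothetical stand-in for a column that already holds row-wise data."""

    def __init__(self, data):
        self.data = data

    def set_dataio(self, dataio):
        # Expects an empty wrapper and fills it with the column's existing data.
        dataio.data = self.data
        self.data = dataio


def try_set_dataio(**wrapper_kwargs):
    """Build a wrapper with the given kwargs, apply it, and report what happened."""
    column = ToyColumn([1.1, 2.2, 3.3])
    try:
        column.set_dataio(ToyDataIO(**wrapper_kwargs))
    except ValueError as err:
        return str(err)
    return "ok"


if __name__ == "__main__":
    # dtype alone fails in the constructor (check 1).
    print(try_set_dataio(dtype="float64"))
    # dtype and shape pass the constructor but fail in the setter (check 2).
    print(try_set_dataio(dtype="float64", shape=(3,)))
```

In this toy model, the two calls fail with the same messages as the second and third attempts in the report, which is consistent with the idea that DataIO would need an option for deferred setting of data before `set_dataio` can accept an empty wrapper.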