NeurodataWithoutBorders / pynwb

A Python API for working with Neurodata stored in the NWB Format
https://pynwb.readthedocs.io

cannot change shape of dataset once written #918

Open luiztauffer opened 5 years ago

luiztauffer commented 5 years ago

Feature Request

According to the documentation, the pynwb.epoch.TimeIntervals class accepts several data formats (ndarray, list, tuple, Dataset, AbstractDataChunkIterator, or HDMFDataset), but the methods for adding new intervals only work when the underlying data is a list.

Problem/Use Case

When an instance's invalid intervals are loaded from an HDF5 file, the methods for adding new intervals don't work. I tried three different methods: add_invalid_time_interval() on the NWBFile, add_interval() on invalid_times, and add_row() on one of the table's columns.

All three raise AttributeError: 'Dataset' object has no attribute 'append', from different locations inside the core.py module.

Here's a reproducible example:

import pynwb
from pynwb import NWBFile, NWBHDF5IO
from datetime import datetime

# Make data source file
nwbfile = NWBFile('aa','aa', datetime.now().astimezone())

# Add invalid intervals (THIS WORKS)
nwbfile.add_invalid_time_interval(start_time=1.0, stop_time=2.0)
nwbfile.add_invalid_time_interval(start_time=5.0, stop_time=7.0)
print('Invalid times, start: ', nwbfile.invalid_times['start_time'].data)
print('Data format: ', type(nwbfile.invalid_times['start_time'].data))
print('-------------------------------------------------')

# Save data source file
with NWBHDF5IO('test_out.nwb', 'w') as io:
  io.write(nwbfile)

# New instance
nwb2 = NWBHDF5IO('test_out.nwb', 'r').read()
print('Invalid times, start: ', nwb2.invalid_times['start_time'].data[:])
print('Data format: ', type(nwb2.invalid_times['start_time'].data))

# Try to use methods to add new invalid time intervals
# All 3 methods will give errors
nwb2.add_invalid_time_interval(start_time=8.0, stop_time=12.7)
#nwb2.invalid_times.add_interval(start_time=8.0, stop_time=12.7)
#nwb2.invalid_times.columns[0].add_row(8.0)

This raises the following error:

~/anaconda3/envs/ecog_gui/lib/python3.7/site-packages/pynwb/core.py in add_row(self, **kwargs)
   1127         if row_id is None:
   1128             row_id = len(self)
-> 1129         self.id.data.append(row_id)
   1130 
   1131         for colname, colnum in self.__colids.items():

AttributeError: 'Dataset' object has no attribute 'append'

It seems to me we should check the data format before calling append(), in each of the places in core.py where it is used.
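As a rough sketch of that kind of check, the helper below dispatches on what the container supports. Note that append_value is a hypothetical name used only for illustration, not an existing pynwb or hdmf function. It also assumes the HDF5-backed case exposes h5py-style shape and resize(), which only succeeds when the dataset was created with maxshape=(None,) and the file is open in a writable mode.

```python
def append_value(data, value):
    """Append `value` to a 1-D container, in memory or HDF5-backed.

    Hypothetical helper for illustration; not part of pynwb or hdmf.
    """
    if hasattr(data, 'append'):
        # In-memory case: plain lists support append directly.
        data.append(value)
    elif hasattr(data, 'resize') and hasattr(data, 'shape'):
        # h5py.Dataset-like case: grow by one element, then fill the new slot.
        # Assumes the dataset is resizable (maxshape=(None,)) and writable.
        n = data.shape[0]
        data.resize((n + 1,))
        data[n] = value
    else:
        raise TypeError(f"cannot append to {type(data).__name__}")
    return data


# In-memory usage; an h5py.Dataset opened in append mode would take the second branch.
start_times = [1.0, 5.0]
append_value(start_times, 8.0)
print(start_times)  # [1.0, 5.0, 8.0]
```

Checking for the capability rather than the concrete type keeps the sketch independent of h5py, but a real fix would probably want explicit isinstance checks so that other types with a resize() method are not handled by accident.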


bendichter commented 5 years ago

Here is a demonstration of reshaping a dataset that has already been written. Note that maxshape must be set to (None,) for this to work.

from datetime import datetime
from dateutil.tz import tzlocal
from pynwb import NWBFile, NWBHDF5IO
from pynwb import TimeSeries
from hdmf.backends.hdf5.h5tools import H5DataIO

start_time = datetime(2017, 4, 3, 11, tzinfo=tzlocal())

nwbfile = NWBFile(session_description='demonstrate NWBFile basics',
                  identifier='NWB123',
                  session_start_time=start_time)

data = list(range(100, 200, 10))
timestamps = list(range(10))
test_ts = TimeSeries(name='test_timeseries', data=H5DataIO(data, maxshape=(None,)), unit='m', timestamps=timestamps)
nwbfile.add_acquisition(test_ts)

with NWBHDF5IO('example_file_path.nwb', 'w') as io:
    io.write(nwbfile)

with NWBHDF5IO('example_file_path.nwb', 'a') as io:
    nwb = io.read()
    nwb.acquisition['test_timeseries'].data.resize((12,))
    nwb.acquisition['test_timeseries'].data[-2:] = [-1, -2]
    print(nwb.acquisition['test_timeseries'].data[:])
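The role of maxshape can also be seen with h5py alone, outside of pynwb. The sketch below uses an in-memory HDF5 file (the file and dataset names are made up for illustration) and compares a dataset created with default settings against one created with maxshape=(None,). Only the second is chunked and resizable.

```python
import h5py

# In-memory HDF5 file: nothing is written to disk (backing_store=False).
with h5py.File('maxshape_demo.h5', 'w', driver='core', backing_store=False) as f:
    # Default creation: maxshape equals the initial shape, so the size is fixed.
    fixed = f.create_dataset('fixed', data=[1, 2, 3])
    # An unlimited first axis makes the dataset chunked and resizable.
    growable = f.create_dataset('growable', data=[1, 2, 3], maxshape=(None,))

    fixed_maxshape = fixed.maxshape
    growable_maxshape = growable.maxshape

    # Grow by two elements and fill the new slots, as in the example above.
    growable.resize((5,))
    growable[-2:] = [-1, -2]
    result = growable[:].tolist()

print(fixed_maxshape)     # (3,)
print(growable_maxshape)  # (None,)
print(result)             # [1, 2, 3, -1, -2]
```

This is why the H5DataIO wrapper has to be applied when the data is first written. A dataset written without maxshape cannot be grown later, so it would need to be rewritten to become resizable.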