NeurodataWithoutBorders / nwb-schema

Data format specification schema for the NWB neurophysiology data format
http://nwb-schema.readthedocs.io
Other
52 stars 16 forks source link

Update `TimeIntervals` to use the new `TimeSeriesReferenceVectorData` #484

Closed rly closed 2 years ago

rly commented 3 years ago

NWB 2.4.0 introduces the TimeSeriesReferenceVectorData neurodata type which is identical to the VectorData-typed column named "timeseries" in the TimeIntervals table. For consistency in the data format and API, the TimeIntervals neurodata type should be updated to use the TimeSeriesReferenceVectorData neurodata type for the column named "timeseries".

We will need to decide for the PyNWB API how to handle reading of old files written using the VectorData form of the "timeseries" column. Options: 1) Leave the column as an object of type VectorData. 2) Read the column data into an object of type TimeSeriesReferenceVectorData with the same object ID. This will likely require a change in the ObjectMapper base code to allow overriding the type-to-class mapping or modifying the builder before constructing the new object associated with the "timeseries" column.

ref: https://github.com/NeurodataWithoutBorders/nwb-schema/pull/483

oruebel commented 3 years ago

Just adding another option:

  1. Read as VectorData for NWB 2 - 2.3 files and read as TimeSeriesReferenceVectorData for NWB >=2.4 files. Optionally add a function TimeSeriesReferenceVectorData.from_vectordata to convert a VectorData column to a TimeSeriesReferenceVectorData if a user needs the new functionality. Similarly, we could have a TimeSeriesReference.from_tupes to convert the results returned from VectorData to a list of reference tuples. Optionally, there could then also be some corresponding function on TimeIntervals to update the type of the TimeIntervals['timeseries'] column if necessary. An advantage of this behavior is that it is explicit for the user and in most cases is probably not necessary anyways. The downside is that it's up to the user if they want to use new features for older files.
oruebel commented 3 years ago

For the upcoming release my current thinking is that we should probably go with "Option 1" and leave TimeIntervals as is and then decide on an option to update the type after the release.

rly commented 3 years ago

Yes, I agree with Option 1 for the upcoming 2.4.0 release. We can still implement options 2 and 3 later on.

oruebel commented 3 years ago

https://github.com/NeurodataWithoutBorders/pynwb/pull/1390 defines another alternative approach (option 4). Instead attempting to fix the issue in the ObjectMapper this approach updates the Container class for TimeIntervals.timeseries after the fact to change it inplace from a VectorData to a TimeSeriesReferenceVectorData