NeurodataWithoutBorders / pynwb

A Python API for working with Neurodata stored in the NWB Format
https://pynwb.readthedocs.io

Make convenience functions for trialized time series #973

Open rly opened 5 years ago

rly commented 5 years ago

We want to encourage users with trialized time series (e.g. segments of a continuous time series) to store their data as a single, continuous (untrialized) time series, albeit with segments of missing data. They would use timestamps to mark times with data.

To facilitate that, I suggest adding a convenience function to create a TimeSeries that takes a matrix of electrodes x trials x samples and a vector of start times for each trial and does the un-trialization under the hood.

See also the reverse: https://github.com/NeurodataWithoutBorders/pynwb/issues/832

bendichter commented 5 years ago

Can you give an example of what the syntax for this might be?

rly commented 5 years ago

# Proposed syntax only: createFromTrials does not exist in PyNWB.
import numpy as np
from pynwb import TimeSeries

# 2 electrodes x 3 trials x 5 samples
imported_trialized_data = np.array((((1,2,3,4,5), (2,3,4,5,6), (1,5,10,12,14)),
                                    ((10,20,30,40,50), (20,30,40,50,60), (10,50,100,120,140))))

# for the case of a fixed sampling rate:
rate = 1000.0  # Hz, example value
trial_start_times = np.array((2.1, 10.4, 19.5))

# electrode_table_region is assumed to have been created beforehand;
# the keyword name `electrodes` is illustrative.
ephys_ts = TimeSeries.createFromTrials(name='test_ephys_data',
                            trialized_data=imported_trialized_data,
                            trial_start_times=trial_start_times,
                            rate=rate,
                            electrodes=electrode_table_region)

# alternatively, if they have a matrix of timestamps for each trial
# (trial_timestamps: trials x samples, assumed to exist)
ephys_ts = TimeSeries.createFromTrials(name='test_ephys_data',
                            trialized_data=imported_trialized_data,
                            timestamps=trial_timestamps,
                            electrodes=electrode_table_region)

This mostly saves the user the hassle of reshaping their data. If they have a fixed sampling rate, it also saves them from generating a vector of timestamps per trial to pass to the TimeSeries constructor.
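For illustration, here is a minimal sketch of the reshape and timestamp generation such a helper might do under the hood. The function name `untrialize` and its signature are hypothetical and not part of PyNWB; it assumes a fixed sampling rate in Hz and trials that are sorted by start time and do not overlap.

```python
import numpy as np


def untrialize(trialized_data, trial_start_times, rate):
    """Flatten (electrodes x trials x samples) data into (time x electrodes).

    Hypothetical helper, not part of PyNWB. Assumes a fixed sampling rate
    in Hz and trials sorted by start time without overlap.
    """
    trialized_data = np.asarray(trialized_data)
    trial_start_times = np.asarray(trial_start_times, dtype=float)
    n_electrodes, n_trials, n_samples = trialized_data.shape
    if trial_start_times.shape != (n_trials,):
        raise ValueError("expected one start time per trial")

    # Move electrodes to the last axis, then merge the trial and sample axes
    # so that rows run through trial 0, then trial 1, and so on.
    data = trialized_data.transpose(1, 2, 0).reshape(n_trials * n_samples, n_electrodes)

    # Per-trial timestamps: start time plus sample offsets at the fixed rate.
    offsets = np.arange(n_samples) / rate
    timestamps = (trial_start_times[:, None] + offsets[None, :]).ravel()
    return data, timestamps


trialized = np.array((((1, 2, 3, 4, 5), (2, 3, 4, 5, 6), (1, 5, 10, 12, 14)),
                      ((10, 20, 30, 40, 50), (20, 30, 40, 50, 60), (10, 50, 100, 120, 140))))
data, timestamps = untrialize(trialized, trial_start_times=(2.1, 10.4, 19.5), rate=10.0)
print(data.shape)  # (15, 2)
```

The resulting `data` and `timestamps` arrays have time as the first dimension, so they could then be passed to the existing TimeSeries constructor via its `data` and `timestamps` arguments.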

rly commented 5 years ago

A couple of the hackathon participants were lamenting the fact that they had to untrialize their data, which is an error-prone process, only for NWB to re-trialize their data for their analyses. This tries to make the process easier.
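As a rough sketch of the reverse step mentioned here, the following slices a continuous series back into trials by looking up each trial window in the timestamps. The helper `retrialize` and its parameters are hypothetical, not a PyNWB API; it assumes sorted timestamps and a fixed trial duration.

```python
import numpy as np


def retrialize(data, timestamps, trial_start_times, trial_duration):
    """Slice a (time x electrodes) array back into a list of per-trial arrays.

    Hypothetical helper, not part of PyNWB. Assumes `timestamps` is sorted.
    Each trial covers the half-open window [start, start + trial_duration).
    """
    data = np.asarray(data)
    timestamps = np.asarray(timestamps, dtype=float)
    trials = []
    for start in trial_start_times:
        # Binary search for the first sample at or after each window edge.
        first = np.searchsorted(timestamps, start, side="left")
        last = np.searchsorted(timestamps, start + trial_duration, side="left")
        trials.append(data[first:last])
    return trials


timestamps = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
data = np.arange(12).reshape(6, 2)
trials = retrialize(data, timestamps, trial_start_times=[0.0, 10.0], trial_duration=3.0)
print([t.shape for t in trials])  # [(3, 2), (3, 2)]
```

Returning a list rather than a 3-D array keeps the sketch valid when trials contain different numbers of samples.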

bendichter commented 5 years ago

I see. I don't think going in that direction will be as common but if you think it will be useful then sure why not.

rly commented 5 years ago

Yeah, it's not a priority, and certainly we want to encourage people to input all of their data into NWB, including outside trial boundaries. This just came to mind when you wrote the best practices text about what users should do when they have only the trialized data.