Adding time vector to `generate_ground_truth_recording()`

SpikeInterface / spikeinterface

A Python-based module for creating flexible and robust spike sorting pipelines.

https://spikeinterface.readthedocs.io

MIT License

527 stars 187 forks source link

Adding time vector to `generate_ground_truth_recording()` #3071

Closed JoeZiminski closed 4 months ago

JoeZiminski commented 4 months ago

In some places it would be cool to test that certain processing steps are respecting the user-set times on a recording object (e.g. with set_times()). Ideally it would be nice to do this through generating test recordings with generate_ground_truth_recording() but as far as I can tell this is not possible. set_times() can be called on the generated recording object, but the spike times in the sorting object would then be wrong, and there is no set_times() on the sorting object (which makes sense).

Would it be okay to add this functionality to generate_ground_truth_recording()? (e.g. a set_times=[(t_start or time_array), (one for each segment)]). This would add times to the generated recording and then make sure the sorting is generated with respect to these timings.

Happy to open a PR but first wanted to check this is okay / there are any reasons not to do this?

h-mayorquin commented 4 months ago

In what situations do you need it for generate_ground_truth that are not convered by generate_recording that does not have an associated sorting?

samuelgarcia commented 4 months ago

At the moment there is no time handling in sorting object unless there is a link to the recording. I would like to change this and hanle time in base class but this is a quite big project and we should discuss the design a lot before any coding.

JoeZiminski commented 4 months ago

Thanks both, @h-mayorquin the use-case is to test the times in a Sorting object (e.g. in some postprocessor) when there is a t_start or time_vector applied to the recording. At the moment if you use generate_recording (which, I believe does not actually include spikes only noise, but for now assuming it has spikes) then to get a sorting object you would need to actually sort the recording first. It would be nice to just use the synthetic sorting object, but have it aligned in time to the associated recording objects t_start or time_vector.

@samuelgarcia that's okay for my use case for the sorting object not to expose these directly. But, maybe my suggestion is overkill. Is it possible to have a recording-sorting pair, then update the times on the recording, and propagate it to the sorting? Otherwise, for my use-case it would be necessary when generate_ground_truth_recording to first apply some time information to the recording and then make the synthetic sorting incorporate these. Hope all this makes sense!

h-mayorquin commented 4 months ago

Thanks both, @h-mayorquin the use-case is to test the times in a Sorting object (e.g. in some postprocessor) when there is a t_start or time_vector applied to the recording. At the moment if you use generate_recording (which, I believe does not actually include spikes only noise, but for now assuming it has spikes) then to get a sorting object you would need to actually sort the recording first. It would be nice to just use the synthetic sorting object, but have it aligned in time to the associated recording objects t_start or time_vector.

Thanks for explaining. So you actually want to test how the times in the sorter are accessed as well and of course they should somehow match the sorting, right?

JoeZiminski commented 4 months ago

Cheers the use-case I originally had in mind was for the correlogram. Here I think only the spike index and fs is used to calculate time difference between spikes. I was wondering what would happen if there was a time_vector attached, if correlogram calculation would incorporate the times from this to calculate the spike time differences or instead it would just rely on the fs. TBH I don't think it would make much difference, I guess the use of time_vector is to track very small drifts in time over long periods and should not really affect the correlogram. Nonetheless it would be cool to feed it a sorting based on a recording with a time vector just check what happens! I wonder if there are other postprocessing cases where things like the t_start across segments or the time_vector may be important.

JoeZiminski commented 4 months ago

This will also be useful for the docs examples, I'll take a stab at this next week and see how it looks, hopefully it will not add much complexity to the generators.

JoeZiminski commented 4 months ago

In the end it is already possible! awesome, simply


import spikeinterface.full as si

recording, sorting = si.generate_ground_truth_recording(
    durations=[30.0],
)

recording.set_times(times=1000)
sorting.register_recording(recording)

sorting.get_unit_spike_train(0, return_times=True)  # check

h-mayorquin commented 4 months ago

Great