SpikeInterface / spikeinterface

A Python-based module for creating flexible and robust spike sorting pipelines.
https://spikeinterface.readthedocs.io
MIT License
493 stars 188 forks source link

splitting units #2487

Open aghatpande opened 6 months ago

aghatpande commented 6 months ago

units_2_split Please advise on how I can split these units in Spikeinterface and then replace the existing data for each unit in my current waveform extractor? Also how can I add new units to my waveform extractor without losing all the various calculations already done on them (e.g. amplitudes, quality metrics etc).

I did try fitting mix of gaussians but am having trouble with that. Would it be reasonable to split these units by eye since at least some of them are quite obvious? Thanks for your help in advance.

aghatpande commented 6 months ago

Let me explain the process prior to these plots. I am running SpikeInterface v 0.99.1 in python 3.10.13 in a Jupyter notebook.

  1. I sorted the raw data using SI running kilosort3 in a Docker container without any pre-processing (KS3 does it anyway).
  2. After the sorting, I used SI's quality metrics and filtered out clusters that were below an SNR of 5.0 an ISI violations threshold of 0.3 and a firing rate above 0.05 Hz
  3. I identified these units after plotting their amplitudes in SI GUI and noting the double peaks
  4. Following which I have done this:

    retrieving amplitude distribution using the extension

    units_need_splitting=[56,121,99,140,100,71] # manually selected with spikeinterface-gui after automated quality metrics based filtering we_split_needed = we_curated.select_units(unit_ids=units_need_splitting) amplitudes = si.compute_spike_amplitudes(we_split_needed, outputs="by_unit", load_if_exists=True)

plot amplitude distribution for each unit

fig, ax = plt.subplots(3, 2, figsize=(10, 10)) ax = ax.ravel() for i, unit_id in enumerate(units_need_splitting): amplitudes_i = amplitudes[0][unit_id] ax[i].hist(amplitudes_i, bins=50) ax[i].set_title(f"Unit {unit_id}") plt.tight_layout() plt.show()

The extensions available for the waveform extractor are: ['template_metrics', 'similarity', 'principal_components', 'spike_amplitudes', 'correlograms', 'spike_locations', 'unit_locations', 'noise_levels', 'quality_metrics'] `

alejoe91 commented 6 months ago

Hi!

We have some experimental functions for auto-splitting, but they are not fully intergrated into the main code-base. Maybe @samuelgarcia can comment more.

You could try to write your own function to get a set of spike indices belonging to different clusters in a unit.

Something along these lines:

indices_list = split_unit(waveform_extractor, unit_id_to_split)

If indices_list is a list of indices belonging to different sub-clusters, you can then use the curation.SplitUnitSorting as follows:

from spikeinterface.curation import SplitUnitSorting

sorting_split = SplitUnitSorting(sorting, unit_id=unit_id_to_split, indices_list=indices_list)

This would return a new sorting object. Once you are done with all the splits, you can re-extract waveforms.

samuelgarcia commented 6 months ago

Yes exactly. In curation we have some function for auto_merge but no function for auto split. The plan is to add this at some point!!

We have very low level for splitting but at sorting component so it do not work withwaveforms extractor at the momeent.

This could be added in the next release. Let keep this issue open, but it will be open for a while.