SpikeInterface / spikeinterface

A Python-based module for creating flexible and robust spike sorting pipelines.
https://spikeinterface.readthedocs.io
MIT License

is is_filtered a manual param? #3244

Open zm711 opened 1 month ago

zm711 commented 1 month ago

Discussed in https://github.com/SpikeInterface/spikeinterface/discussions/3228

Originally posted by **carlacodes**, July 18, 2024:

Maybe I am missing something here. I am preprocessing and filtering input Neuropixels data, yet in my resulting `spikeinterface_recording.json` file, all the recordings stored within the recording list have `"annotations": { "is_filtered": false,`. Is `is_filtered` a manual parameter that I need to change? I ask because when I then use the waveform extractor in post-processing I run into an error saying `is_filtered` needs to be set to `True`. Example functions below:

```python
import spikeinterface.full as si
import spikeinterface.extractors as se
import spikeinterface.preprocessing as spre
import spikeinterface.sorters as ss
import spikeinterface.core as sc


def spikeglx_preprocessing(recording):
    recording = si.bandpass_filter(recording, freq_min=300, freq_max=6000)
    bad_channel_ids, channel_labels = si.detect_bad_channels(recording)
    recording = recording.remove_channels(bad_channel_ids)
    recording = si.phase_shift(recording)
    recording = si.common_reference(recording, reference='global', operator='median')
    return recording


def spikesorting_pipeline(recording, working_directory, sorter='kilosort4'):
    # working_directory = Path(output_folder) / 'tempDir' / recording.name
    if (working_directory / 'binary.json').exists():
        recording = si.load_extractor(working_directory)
    else:
        recording = spikeglx_preprocessing(recording)
        job_kwargs = dict(n_jobs=-1, chunk_duration='1s', progress_bar=True)
        recording = recording.save(folder=working_directory, format='binary', **job_kwargs)

    sorting = si.run_sorter(
        sorter_name=sorter,
        recording=recording,
        output_folder=working_directory / f'{sorter}_output',
        verbose=True,
    )

    return sorting
```

And below is where I extract the waveforms:

```python
def spikesorting_postprocessing(sorting, output_folder, datadir):
    output_folder.mkdir(exist_ok=True, parents=True)
    rec = sorting._recording
    outDir = output_folder / sorting.name

    jobs_kwargs = dict(n_jobs=-1, chunk_duration='1s', progress_bar=True)
    sorting = si.remove_duplicated_spikes(sorting, censored_period_ms=2)

    if (outDir / 'waveforms_folder').exists():
        we = si.load_waveforms(
            outDir / 'waveforms_folder',
            sorting=sorting,
            with_recording=True,
        )
    else:
        we = si.extract_waveforms(
            rec, sorting, outDir / 'waveforms_folder',
            overwrite=False,
            ms_before=2,
            ms_after=3.,
            max_spikes_per_unit=500,
            sparse=True,
            num_spikes_for_sparsity=100,
            method="radius",
            radius_um=40,
            **jobs_kwargs,
        )
```
zm711 commented 1 month ago

Hey @carlacodes,

I figured out the problem. We weren't propagating the `is_filtered` annotation to the save function when writing the binary. I've patched this code, but we have since had an API change from the waveform extractor to the sorting analyzer. We think the change is better (and faster). Would you be willing to update to version 0.101.0? Otherwise, the one extra step you would need is to add the `is_filtered` annotation to your recording yourself if you want to stay on the waveform extractor. So for your recording you would do

```python
recording.annotate(is_filtered=True)
```

before trying to do the waveform extraction.
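
As a rough sketch of where that would go in your pipeline, staying on the 0.100.x waveform-extractor API and reusing the `working_directory`, `sorting`, and `outDir` variables from your snippets above (so this is an illustration, not a drop-in replacement):

```python
import spikeinterface.full as si

# Sketch only: re-load the saved binary recording and mark it as already
# filtered before extracting waveforms (names taken from the snippets above).
recording = si.load_extractor(working_directory)
recording.annotate(is_filtered=True)

we = si.extract_waveforms(
    recording, sorting, outDir / "waveforms_folder",
    ms_before=2.0, ms_after=3.0, max_spikes_per_unit=500,
    sparse=True, num_spikes_for_sparsity=100,
    method="radius", radius_um=40.0,
)
```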

carlacodes commented 1 month ago

Hi @zm711, thanks very much for your comment; honestly, I will update to version 0.101.0 after this batch of spike sorting. Right now I am using 0.100.8 and Kilosort 4.12, which seems to be working, and I don't want to switch, as I know the Kilosort developers (frustratingly) don't want to consider SpikeInterface in their development of the package.

zm711 commented 1 month ago

Totally understand! And yep, we are trying to keep up with KS4 changes, so it can be a bit tricky to stay in sync! Good luck, Carla!
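
For whenever you do move to 0.101.0, here is a minimal sketch of what the waveform-extraction step could look like with the sorting analyzer, with parameter values copied from your current pipeline and `recording`/`sorting`/`outDir` assumed from your snippets above (treat it as an illustration rather than the exact upgrade path):

```python
import spikeinterface.full as si

# Sketch: build a folder-backed sorting analyzer, then compute the extensions
# that replace the old waveform-extractor outputs.
analyzer = si.create_sorting_analyzer(
    sorting, recording,
    format="binary_folder",
    folder=outDir / "analyzer_folder",
    sparse=True,
    num_spikes_for_sparsity=100,
    method="radius",
    radius_um=40.0,
)
analyzer.compute("random_spikes", max_spikes_per_unit=500)
analyzer.compute("waveforms", ms_before=2.0, ms_after=3.0)
analyzer.compute("templates")
```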