Closed MarinManuel closed 9 hours ago
What file are you pointing to in the params.py
for the export_to_phy
. in your script I don't see any filtering or referencing. KS2.5 saves a .dat
file that has been filtered and referenced. Could you try to filter and reference your recording before making your sorting analyzer?
Hi Zach, You're right, I forgot to include part of the code
rec = si.concatenate_recordings(recs)
rec_NPX = rec
rec_NPX = si.phase_shift(rec_NPX)
rec_NPX = si.bandpass_filter(rec_NPX, freq_min=300, freq_max=6000)
rec_NPX = si.common_reference(rec_NPX, operator="median", reference="global")
bad_channel_ids, channel_labels = si.detect_bad_channels(rec_NPX)
rec_NPX = rec_NPX.remove_channels(bad_channel_ids)
target_file = scratch_folder / 'all_sessions_probeA_pre_processed.zarr'
if target_file.exists():
rec_NPX = si.read_zarr(target_file)
else:
rec_NPX = rec_NPX.save(format='zarr', folder=target_file, verbose=True, **job_kwargs)
The pre-processed file is saved as zarr
and then passed to the script above to run the sorter.
I open phy using:
phy template-gui all_sessions_probeA_pre_processed_kilosort2_5_sorting/sorter_output/params.py
for the version created by kilosort2_5
and
for the version exported from the SortingAnalyzer
phy template-gui all_sessions_probeA_pre_processed_kilosort2_5_phy/params.py
this is the content of params.py from kilosort2.5:
dat_path = '/scratch2/workspace/mmanuel_uri_edu-2024-11-13/all_sessions_probeA_pre_processed_kilosort2_5_sorting/sorter_output/temp_wh.dat'
n_channels_dat = 384
dtype = 'int16'
offset = 0
sample_rate = 30000.
hp_filtered = True
from export_to_phy
:
dat_path = r'/scratch2/workspace/mmanuel_uri_edu-2024-11-13/all_sessions_probeA_pre_processed_kilosort2_5_phy/recording.dat'
n_channels_dat = 384
dtype = 'int16'
offset = 0
sample_rate = 30000.0
hp_filtered = True
Hi @MarinManuel
Can you check which Kilosort version you're using?
You should make sure you checkout v2.5.2 and v3.0.2. In the *.0.1 versions, there was a misalignment issue that got fixed.
Hi Alessio, I confirm that I am using v2.5.2 from https://github.com/MouseLand/Kilosort/releases/tag/v2.5.2
I'm pretty sure I've got the same problem using Kilosort4. The output from the SortingAnalyzer's export_to_phy
is full of noisy units
I'm sure I should get much better than that, although I cannot open the params.py
from the sorter_output
folder because of this error https://github.com/cortex-lab/phy/issues/1273 . What is the easiest way to convert my pre-processed zarr
file into a format that can be read by phy?
@MarinManuel do the templates look correct when you use the sw.plot_unit_templates
function?
I guess maybe I misunderstand how the SortingAnalyzer is supposed to be used. I had assumed the unit ids would stay the same, but that the analyzer would compute a number of other metrics, but maybe that's incorrect?
Here is unit 2 straight from Kilosort 2.5 (not a great one)
Here is unit 2 from the SortingAnalyzer's export_to_phy
and here are the templates using sw.plot_unit_templates(analyzer, unit_ids=[2])
which look like the ones shown in phy
I also looked at the units with the largest number of spikes. This is unit 159, with 475745 spikes (the second largest, the first one was noise)
But when I open the SortingAnalyzer version, the unit with the second to largest number of units has 475733 spikes and is number 138
but the template that seem to match using sw.plot_unit_templates
seem to be number 159
Ah the problem is that Phy needs linearly increasing integers as ids..you can see there is a "si unit id" column in Phy that corresponds to the unit id in spikeinterface :)
Ok, I see that unit 139 in phy has number 159 in the column si_unit_id
so that makes sense. I still don't understand why the number of spikes, the traceview and the templates are different between the two versions
Did you do any curation step? Like removing duplicated spikes??
hmmm yes, my script does
sorting = sorting.remove_empty_units()
sorting = si.remove_duplicated_spikes(sorting, censored_period_ms=0.3, method="keep_first_iterative")
sorting = si.remove_excess_spikes(sorting=sorting, recording=rec)
sorting = si.remove_redundant_units(sorting, align=False, remove_strategy="max_spikes")
before creating the SortingAnalyzer, so that makes sense, thanks.
What about the TraceView? Here are the first 250ms of the file, which looks nothing alike after the SortingAnalyzer
The sorting analyzer will use the preprocessed recording, while kilosort uses the original dat file (without any preprocessing). That should explain the difference!
I think I understand what you mean, but where do all those units come from after the SortingAnalyzer? if they were not there in the output of Kilosort? Or is it just that the templates are present over a larger number of channels after SortingAnalyzer, so there seem to be more of them?
you don't have more units :) Just the IDs might be different.
You can check as follows:
print(len(sorting_analyzer.unit_ids))
# read the kilosort folder
sorting_ks = se.read_kilosort("path-to/sorter_output")
print(len(sorting_ks.unit_ids))
Thank you Alessio, I think I am starting to realize where my assumptions were wrong. You have to admit, though, that the TimeViews look very different. I mean in terms of the colored spikes which appear much denser after the SortingAnalyzer
It's very hard to say from the plot, but yeah the trace views should look different ;) Can we close the issue?
Hi, I couldn't track the issue that I am having, hopefully someone else could have a better idea of what I am doing wrong.
I run kilosort2.5 on some NPX data (recorded with Open ePhys). My workflow is:
I noticed that when I open the phy folder created from the sortingAnalyzer, the units appear mostly (though not exclusively) like noise. However, opening phy from the
sorter_output
folder created when running kilosort, the units are overwhelming good quality.See screenshot below: The left window is from opening the
sorter_output
directly, with the right window is opening the version exported by sortingAnalyzer. Notice how different the traces look, and the waveforms, and the PCAs. The only things that are not affect seem to be autocorrelogram, ISIview and FRView.I'm attaching the script I am using. run_sorter_slurm.txt