Closed by plodocus 4 years ago
I believe this discrepancy results from the fact that KS1 doesn't do any merging of units, so the cluster IDs are always consecutive. However, after manual merging/splitting in phy, you'll get new cluster IDs that are larger than the first dimension of the pc_feature_ind matrix, hence the indexing error.
pc_feature_ind.npy and templates.npy are actually indexed by spike_templates.npy. After manual refinement there may not be a 1:1 relationship between the cluster ID and the template ID, which is what this code assumes.
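To make the mismatch concrete, here is a minimal sketch with invented toy arrays standing in for the real files. The helper name `templates_for_cluster` is hypothetical and not part of the toolbox; the only assumption is that `pc_feature_ind` has one row per template and that `spike_templates` holds one template ID per spike.

```python
import numpy as np

# Toy stand-ins for the Kilosort/phy outputs; all values are invented.
pc_feature_ind = np.array([[0, 1, 2],
                           [1, 2, 3],
                           [4, 5, 6],
                           [5, 6, 7]])                  # one row per template
spike_templates = np.array([0, 0, 1, 2, 2, 3, 3, 3])   # template ID per spike
spike_clusters = np.array([0, 0, 1, 5, 5, 5, 5, 5])    # templates 2 and 3 merged into new cluster 5


def templates_for_cluster(cluster_id, spike_clusters, spike_templates):
    """Hypothetical helper: template IDs that contributed spikes to a cluster."""
    return np.unique(spike_templates[spike_clusters == cluster_id])


# Using the cluster ID as a row index fails once IDs exceed the template count.
try:
    pc_feature_ind[5]
    direct_lookup_failed = False
except IndexError:
    direct_lookup_failed = True

# Going through spike_templates yields valid row indices instead.
template_ids = templates_for_cluster(5, spike_clusters, spike_templates)
channels = pc_feature_ind[template_ids]   # rows 2 and 3
```

Here cluster 5 only exists after curation, so it has no row of its own in `pc_feature_ind`, but its spikes still point at templates 2 and 3.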
Is this something you'd like to work on fixing? Either way, I will take a look at it when I'm back in lab next week.
Yes, I can try to fix it.
Well, I got it to run without an error, but all I get are NaNs. I think I don't entirely understand what exactly happens in the code and what some of KS's output represents, so it's best if someone else works on this.
One potential difficulty if the indexing is changed to use the spike_templates is that there might be multiple templates for merged clusters and shared templates for split clusters.
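A quick way to see how many templates feed each cluster is to tabulate them. This is only a sketch on made-up data; `template_counts_per_cluster` is a hypothetical name, not something in the repository.

```python
import numpy as np


def template_counts_per_cluster(spike_clusters, spike_templates):
    """Hypothetical helper: map cluster ID -> {template ID: spike count}."""
    counts = {}
    for cluster_id in np.unique(spike_clusters):
        template_ids, n_spikes = np.unique(
            spike_templates[spike_clusters == cluster_id], return_counts=True)
        counts[int(cluster_id)] = dict(zip(template_ids.tolist(), n_spikes.tolist()))
    return counts


# Invented example: cluster 3 is a merge of templates 0 and 1,
# while clusters 4 and 5 are a split of template 2.
spike_templates = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
spike_clusters = np.array([3, 3, 3, 3, 3, 4, 4, 5, 5])

counts = template_counts_per_cluster(spike_clusters, spike_templates)
```

In this toy case the merged cluster maps to two templates and the two split clusters map to the same one, so neither direction of the lookup is unique.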
@jsiegle Did you have a chance to take a look at this?
I made a little bit of progress...I think the code in the ks1-compatibility branch is almost working, except for the PC feature extraction step. The functions that need to be updated are on lines 280-285 of metrics.py:
channel_mask = make_channel_mask(cluster_id2, pc_feature_ind, channels_to_use)
subsample = int(relative_counts[idx2])
index_mask = make_index_mask(spike_clusters, cluster_id2, min_num = 0, max_num = subsample)
pcs = get_unit_pcs(pc_features, index_mask, channel_mask)
The tricky part is that the relevant rows in the PC matrix depend on the spike_template ID for each spike, and so you can't simply use the intersection of the channel_mask and index_mask. I think there needs to be a for loop that checks the template IDs before pulling out the PCs. I'll try to write this next week, unless you have a chance to look at it before then.
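One way such a loop could look is sketched below. This is not the code from the branch: the function name `get_unit_pcs_by_template` is hypothetical, and it assumes `pc_features` has shape (n_spikes, n_pcs, n_local_channels) with `pc_feature_ind[t]` listing the channels behind the last axis for template `t`. Templates that do not cover every requested channel are simply skipped, which is a simplification.

```python
import numpy as np


def get_unit_pcs_by_template(cluster_id, spike_clusters, spike_templates,
                             pc_features, pc_feature_ind, channels_to_use):
    """Hypothetical sketch: gather PCs for one cluster on a fixed channel set.

    Returns an array of shape (n_kept_spikes, n_pcs, len(channels_to_use)).
    Rows are grouped by template, not kept in original spike order.
    """
    spike_idx = np.flatnonzero(spike_clusters == cluster_id)
    blocks = []
    for template_id in np.unique(spike_templates[spike_idx]):
        template_channels = pc_feature_ind[template_id]
        cols = []
        for channel in channels_to_use:
            hits = np.flatnonzero(template_channels == channel)
            if hits.size == 0:
                break                      # template lacks this channel
            cols.append(hits[0])
        else:
            # All requested channels found: pull this template's spikes.
            these = spike_idx[spike_templates[spike_idx] == template_id]
            blocks.append(pc_features[these][:, :, cols])
    if not blocks:
        return np.empty((0, pc_features.shape[1], len(channels_to_use)))
    return np.concatenate(blocks, axis=0)


# Invented data: one cluster (ID 9) whose spikes come from three templates.
pc_feature_ind = np.array([[0, 1, 2], [1, 2, 3], [5, 6, 7]])
spike_templates = np.array([0, 0, 1, 2])
spike_clusters = np.array([9, 9, 9, 9])
pc_features = np.arange(24).reshape(4, 2, 3)

pcs = get_unit_pcs_by_template(9, spike_clusters, spike_templates,
                               pc_features, pc_feature_ind, [1, 2])
```

The key point is that the column position of a given channel differs between templates (channel 1 is column 1 for template 0 but column 0 for template 1), so the lookup has to happen per template before the PCs are stacked.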
Hi @DanBenHa, I just pushed some code to the ks1-compatibility branch that should fix the problem in principle, but which isn't working with the particular KS1-sorted data that I have. For some reason, the values in the pc_feature_ind matrix jump all over the place, which makes it almost impossible to find consecutive matching channels across templates.
Can you test what I've written on your datasets? If it doesn't work, you can send me the post-curation Kilosort outputs for one session, and I can test it here.
Hi @jsiegle,
Apologies for the delayed response. I tried the code in the ks1-compatibility branch, but encountered this error:
ecephys spike sorting: quality metrics module
Loading data...
Calculating PC-based metrics
Traceback (most recent call last):
  File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/Cellar/python/3.7.6_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/Users/agjacob06/Documents/Git/ecephys_spike_sorting/ecephys_spike_sorting/modules/quality_metrics/__main__.py", line 77, in <module>
    main()
  File "/Users/agjacob06/Documents/Git/ecephys_spike_sorting/ecephys_spike_sorting/modules/quality_metrics/__main__.py", line 67, in main
    output = calculate_quality_metrics(mod.args)
  File "/Users/agjacob06/Documents/Git/ecephys_spike_sorting/ecephys_spike_sorting/modules/quality_metrics/__main__.py", line 30, in calculate_quality_metrics
    metrics = calculate_metrics(spike_times, spike_clusters, spike_templates, amplitudes, channel_map, pc_features, pc_feature_ind, args['quality_metrics_params'])
  File "/Users/agjacob06/Documents/Git/ecephys_spike_sorting/ecephys_spike_sorting/modules/quality_metrics/metrics.py", line 75, in calculate_metrics
    params['n_neighbors'])
  File "/Users/agjacob06/Documents/Git/ecephys_spike_sorting/ecephys_spike_sorting/modules/quality_metrics/metrics.py", line 243, in calculate_pc_metrics
    cluster_peak_channels[idx] = np.median(template_peak_channels[templates_for_unit])
IndexError: index 488 is out of bounds for axis 0 with size 487
Could you provide me with an email address to which I could send a link to the Kilosort output?
P.S.: I realised that this toolbox was written for Neuropixels probes. My data was acquired with Neuronexus probes. Is that a problem for the quality metrics code?
It should work for Neuronexus, but the code may contain some Neuropixels-specific assumptions that are causing problems. If you can send me the data, it will be very helpful for debugging. My email address is joshs [at] alleninstitute [dot] org
The latest changes I merged into the master branch should fix this issue. Please let me know if you're still having problems!
Hello,
I'm trying to run the quality metrics module on data that was sorted with Kilosort 1 (about 2 years ago if that matters). Unfortunately, computation stops at PC-based metrics:
cluster_ids has non-consecutive numbers, which seems to be the cause of the problem. I'm a bit confused that enumerate is used as the generator but the variable idx is not used as the index. Could cluster_id be switched with idx?
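For reference, here is a small sketch of how the two loop variables differ when the IDs are not consecutive. The arrays are invented and this is not the toolbox's code: `idx` is a position into arrays that hold one entry per cluster, while `cluster_id` is the label to match against `spike_clusters`. Whether one can replace the other depends on which array is being indexed.

```python
import numpy as np

# Invented data: three clusters with non-consecutive IDs, as after curation.
cluster_ids = np.array([0, 3, 7])
spike_clusters = np.array([0, 0, 3, 7, 7, 7])

# One slot per cluster, so this array must be indexed by position (idx).
spike_counts = np.zeros(len(cluster_ids), dtype=int)

for idx, cluster_id in enumerate(cluster_ids):
    # cluster_id selects the spikes; idx selects the output slot.
    spike_counts[idx] = np.sum(spike_clusters == cluster_id)

# Writing spike_counts[cluster_id] instead would raise IndexError for
# IDs 3 and 7, because the array only has len(cluster_ids) == 3 entries.
```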