int-brain-lab / ibllib

IBL core shared libraries
MIT License

label vs labels #848

Closed RobertoDF closed 2 days ago

RobertoDF commented 2 days ago

Hi, if I run spikes, clusters, channels = sl.load_spike_sorting(), the clusters dict contains two fields called label and labels:

[Screenshot: clusters dict showing both the label and labels fields]

"good" clusters are supposed to be filtered with label via good_clusterIDs = clusters['cluster_id'][clusters['label'] == 1] (example 4 here).

Is this correct? What is the meaning of label and labels? And why is label populated with 0.33333, 0.66666 and 1?

In this example pid = '799d899d-c398-4e81-abaf-1ef4b02d5475'.

Thanks!

mayofaulkner commented 2 days ago

Hello,

Yes, that is correct: to get the good clusters, filter for those that have clusters['label'] == 1.

The label value is the fraction of the 3 single unit metrics we use that the cluster passes. A value of 0.33 means only 1/3 of the metrics pass, 0.66 means 2/3 pass, and 1 means 3/3 metrics pass, in which case we consider the cluster to be good.
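As a rough illustration of how such a label relates to the pass/fail metrics, here is a minimal sketch on made-up data. The metric names (metric_a, metric_b, metric_c) and the pass/fail values are hypothetical placeholders, not the actual ibllib metrics or its implementation; only the clusters['cluster_id'] and clusters['label'] keys and the == 1 filter come from this thread.

```python
import numpy as np

# Hypothetical pass/fail results of three single unit metrics for five clusters.
# Names and values are invented for illustration only.
passes = {
    "metric_a": np.array([True, True, False, True, False]),
    "metric_b": np.array([True, False, False, True, False]),
    "metric_c": np.array([True, True, False, True, True]),
}

clusters = {
    "cluster_id": np.arange(5),
    # Fraction of metrics passed per cluster: 0, 1/3, 2/3 or 1.
    "label": np.mean(list(passes.values()), axis=0),
}

# Same filter as in the question: keep clusters that pass all three metrics.
good_cluster_ids = clusters["cluster_id"][clusters["label"] == 1]

print(clusters["label"])
print(good_cluster_ids)
```

In this toy data only clusters 0 and 3 pass all three metrics, so they are the only ones kept by the filter; clusters with a label of 1/3 or 2/3 are excluded.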

The labels value comes from merging the clusters and channels objects together and is an indication of whether the recording channel that the cluster was recorded on was good or bad. See here for more documentation on this dataset.

I agree having both label and labels in the same table is confusing. We will look into how we can improve this.

Let me know if you have any further questions!

RobertoDF commented 2 days ago

Amazing, thanks for the quick reply!