csn-le / wave_clus

A fast and unsupervised algorithm for spike detection and sorting using wavelets and super-paramagnetic clustering

Clustering #204

Closed Jay-Shiralkar closed 2 years ago

Jay-Shiralkar commented 2 years ago

Hi Fernando, When I receive the results from Do_clustering, can I specify a fixed number of clusters into which the spikes should be classified? Also, when processing two different data sets, will cluster 1 from data1 and cluster 1 from data2 have any relation? (Are they the same?) In general, how can I compare the clustering results from two different data sets?

Thanks, Jay

ferchaure commented 2 years ago

Hi

When I receive the results from Do_clustering, can I specify a fixed number of clusters into which the spikes should be classified?

Not really. We had that option before, but it is tricky because you have to choose which of the current clusters to keep. Choosing the ones with the most spikes is not always the best idea.

Also, when processing two different data sets, will cluster 1 from data1 and cluster 1 from data2 have any relation? (Are they the same?) In general, how can I compare the clustering results from two different data sets?

You can't compare them using the class numbers alone. Usually class 1 is multiunit activity; that is the only relationship. You could concatenate both recordings, sort them together, and then separate the results again using the spike times. Another alternative is to compare pairs of mean waveforms and assign each class to the closest one in the other data set, but you will run into a lot of annoying details.
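To make the mean-waveform idea concrete, here is a minimal MATLAB sketch. It is not part of wave_clus. It assumes each data set provides a `spikes` matrix (one waveform per row) and the class labels from the first column of `cluster_class`, and it assumes class 0 holds unassigned spikes and should be skipped. The data below are synthetic stand-ins, and the variable names `spikes1`, `labels1`, `match`, etc. are made up for illustration.

```matlab
% Synthetic stand-ins for the contents of two sorted data sets.
% spikesN: one waveform per row; labelsN: class label of each spike.
rng(0);
t = linspace(0, 1, 32);
shapeA = -exp(-((t - 0.3) / 0.05).^2);        % narrow, large peak
shapeB = -0.5 * exp(-((t - 0.5) / 0.15).^2);  % broad, smaller peak

spikes1 = [repmat(shapeA, 20, 1); repmat(shapeB, 20, 1)] + 0.02 * randn(40, 32);
labels1 = [ones(20, 1); 2 * ones(20, 1)];
% In the second data set the same two units carry swapped class numbers.
spikes2 = [repmat(shapeB, 20, 1); repmat(shapeA, 20, 1)] + 0.02 * randn(40, 32);
labels2 = [ones(20, 1); 2 * ones(20, 1)];

% Assumption: class 0 means "unassigned", so it is excluded from matching.
classes1 = unique(labels1(labels1 > 0));
classes2 = unique(labels2(labels2 > 0));

% match(i, :) = [class in data set 1, closest class in data set 2]
match = zeros(numel(classes1), 2);
for i = 1:numel(classes1)
    mean1 = mean(spikes1(labels1 == classes1(i), :), 1);
    dist = zeros(numel(classes2), 1);
    for j = 1:numel(classes2)
        mean2 = mean(spikes2(labels2 == classes2(j), :), 1);
        dist(j) = norm(mean1 - mean2);  % Euclidean distance between mean waveforms
    end
    [~, best] = min(dist);
    match(i, :) = [classes1(i), classes2(best)];
end
disp(match)
```

This is a plain nearest-neighbour assignment, so two classes in the first data set can end up mapped to the same class in the second, and a unit that is present in only one recording will still get a match. Those are some of the details you would have to handle (for example with a distance threshold) before trusting the result.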

Jay-Shiralkar commented 2 years ago

Hi, Thanks for the clarification. Also, is there any way to see which cluster occurs at what specific time in my data? For example, can I check whether spike bursts at 200 seconds are associated with cluster 1? In the times file, I see that there's a variable named "cluster class" which contains two columns, where the first column is the cluster number of the spike. Is the second column the time (in milliseconds) at which the spike occurs in the data? Thanks.

ferchaure commented 2 years ago

You are right about the "cluster class" variable. To see whether a cluster is highly concentrated at a specific time, you could try something like: boxplot(cluster_class(:,2),cluster_class(:,1),'Orientation','horizontal')

I use that to check for overclustering due to drift, but it could work for you.
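For the specific question about a burst near 200 seconds, a direct lookup may be simpler than a plot. The sketch below assumes, as confirmed above, that the first column of `cluster_class` is the class and the second is the spike time in milliseconds. The matrix here is synthetic, and the window half-width is an arbitrary value chosen for illustration.

```matlab
% Synthetic stand-in for cluster_class: [class, spike time in ms].
cluster_class = [1 199500; 1 199800; 2 200100; 1 200300; ...
                 1 200450; 2 350000; 0 200200];

t_center_ms   = 200 * 1000;  % 200 s expressed in ms
half_width_ms = 500;         % hypothetical window of +/- 0.5 s

% Select the spikes that fall inside the window around 200 s.
in_window = abs(cluster_class(:, 2) - t_center_ms) <= half_width_ms;
classes_in_window = cluster_class(in_window, 1);

% Count spikes per class inside the window.
classes = unique(classes_in_window);
counts  = arrayfun(@(c) sum(classes_in_window == c), classes);
disp([classes, counts])      % each row: [class, number of spikes in window]
```

With these made-up numbers, most of the spikes in the window belong to class 1, which is the kind of answer you would be looking for with real data.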

Jay-Shiralkar commented 2 years ago

Ohh, great!!! That's exactly what I was looking for. Thanks.