MouseLand / Kilosort

Fast spike sorting with drift correction for up to a thousand channels
https://kilosort.readthedocs.io/en/latest/
GNU General Public License v3.0

Asking for better solutions to large-amplitude noise and wrongly merged clusters in Kilosort3, and a better guide to parameters #381

Closed Tywang-720 closed 8 months ago

Tywang-720 commented 3 years ago

Hi there, I'm using Ks3, but I found some of my results look a bit funny.

I'm new to Kilosort and phy, and I would be very grateful if someone could look at my results and offer any advice. Like, where did I mess up, how could my results be fixed, etc.

Example 1, about messy cluster:

As I understand it, when I select a cluster in the cluster view in phy, the waveform view shows all the waveforms in that cluster.

But the waveforms in the picture below look quite different to me, and I'm not sure how to fix this by changing Kilosort parameters (for example, by setting stricter criteria; which parameters are most relevant here?).

(image: 20210423 Sorting problem phy2 2)

Large amplitude noise in the cluster:

Sometimes the waveforms are contaminated by large-amplitude noise. This can be quite severe in some clusters (which are nonetheless marked as good).

Could Kilosort automatically remove this noise from the clusters? We have a lot of data and few hands, so manual curation is impractical.

(image: 20210423 Sorting problem phy2 3)

How to tune the parameters in Ks3:

For the results above, I left most of Kilosort's default parameters unchanged, except for the high-pass filter and the batch size: ks.ops.fshigh = 300; ks.ops.NT = 16*1024 + ks.ops.ntbuff;
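As a rough aid for reasoning about the batch-size setting above, here is a minimal Python sketch of the arithmetic relating ops.NT to batch duration. It assumes a 30 kHz sampling rate and ops.ntbuff = 64 (neither value is stated in this post), and that the non-buffer part of NT should be a multiple of 32. The function names are hypothetical helpers, not part of Kilosort.

```python
# Assumed values (not stated in the post): fs = 30 kHz, ntbuff = 64 samples.
FS_HZ = 30000.0
NTBUFF = 64


def batch_samples(multiple_of_1024, ntbuff=NTBUFF):
    """NT as written in the config line: k*1024 + ntbuff."""
    return multiple_of_1024 * 1024 + ntbuff


def batch_seconds(nt, fs=FS_HZ, ntbuff=NTBUFF):
    """Duration of the non-buffer part of one batch, in seconds."""
    return (nt - ntbuff) / fs


def nt_for_duration(seconds, fs=FS_HZ, ntbuff=NTBUFF, block=32):
    """Nearest NT for a target duration, keeping NT - ntbuff a multiple of `block`."""
    n = int(round(seconds * fs / block)) * block
    n = max(n, block)  # never return an empty batch
    return n + ntbuff


if __name__ == "__main__":
    nt = batch_samples(16)
    print(nt, round(batch_seconds(nt), 3))
    print(nt_for_duration(0.25))
```

Under these assumptions, 16*1024 samples correspond to roughly 0.55 s and 8*1024 samples to roughly 0.27 s, which is consistent with the 0.5 s and 0.25 s batch sizes mentioned in the later comments.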

The other parameters are shown here: (image: 20210423 parameters)

I'm currently wondering how I should adjust ops.sigmaMask and ops.whiteningRange to match the design of my own probe.

Also, what does ops.lam (the amplitude penalty) do?

If ops.AUCsplit and ops.lam are no longer used in Kilosort3, as the GUI indicates, then which parameters should I tune first to fix incorrectly merged clusters?

It would be a great help if there were a detailed document showing visualized results under specific parameter settings, to help people understand how each parameter behaves.

Thanks!

Tywang-720 commented 3 years ago

Update 20210427:

I've been reading through the issues and picking up some knowledge.

For example, the parameters "momentum" and "sigmaMask" are less useful to adjust: https://github.com/MouseLand/Kilosort/issues/156

The first component of "Th" is important for spike detection, and the second component is important for cluster purity: https://github.com/MouseLand/Kilosort/issues/59

But I have still made no progress in improving my results.

I found more examples of unacceptable, weird results:

Example 20210427-01: dirty cluster (Th [10 4], lam 5, AUCsplit 0.8, batch size 0.25 s)

(image: 20210427 dirty cluster)

This example differs from the one above: that one assigned different waveforms on different channels to a single unit, whereas this one fails to find even one clean waveform cluster. The waveforms are heavily contaminated, yet the cluster is still marked as good.

Example 20210427-02: spike waveforms get truncated (wrong spike detection time) (Th [10 4], lam 5, AUCsplit 0.8, batch size 0.25 s)

(image: 20210427 wrong spk detection time)

Tywang-720 commented 3 years ago

Summary by now:

I've run into several cases that need to be fixed. I would like to know what can be done in each of these situations:

  1. Clearly different waveforms on different channels are assigned to the same unit.
  2. Waveform clusters are contaminated by large-amplitude noise. We need to remove it because it is locked to the animal's motion.
  3. Waveform clusters are unclean, with different shapes merged.
  4. Incorrect spike detection time window: spikes are truncated at the wrong time point.

Sometimes the results can be very odd, as in the picture below. Maybe the noise is interfering with the template calculation.

Example 20210427-03: weirdly truncated spikes (Th [10 4], lam 5, AUCsplit 0.8, batch size 0.25 s)

(image: 20210427 silly shapes)

Two more questions about parameters:

  1. Are there any unwanted side effects if we decrease the batch size too far? I'm using batch sizes of 0.5 s and 0.25 s, and I fear I will have to decrease this further for some larger files. What is the ideal range for the batch size? Do I need to adjust parameters like "nskip" and "nskipCov" along with the batch size?
  2. The description of ops.minFR says "minimum spike rate (Hz), if a cluster falls below this for too long it gets removed". How long exactly is "too long"?

Tywang-720 commented 3 years ago

Example 20210427-04: dirty cluster

Compared to Example 20210427-01, I increased lam to 50 (Th [10 4], lam 50, AUCsplit 0.8, batch size 0.25 s).

In general, a larger lam seems to have a positive effect, yielding more genuinely good clusters. For some clusters, however, things get worse with a larger lam. The picture shows the same cluster on channel 192; it is worse than with lam 5. I also noticed that some good spikes on the same channel are marked as mua, even though their waveforms are much cleaner than those of the "good" clusters.

(images: worse 192, 192 mua)

kingsEffy commented 8 months ago

Hey @Tywang-720, I'm having the same issues here. What was your final solution?