MouseLand / Kilosort

Fast spike sorting with drift correction for up to a thousand channels
https://kilosort.readthedocs.io/en/latest/
GNU General Public License v3.0

Bimodal amplitude distribution, nt and under-merging #742

Closed. Tingchen-G closed 3 weeks ago

Tingchen-G commented 1 month ago

Hi,

First of all, thank you so much for all your hard work! Kilosort4 is working really well for us. However, I am encountering some issues and am hoping to seek your advice:

  1. We observe a bimodal amplitude distribution for some units. I would thus expect to see a second, smaller-amplitude waveform in the waveform view, but it does not seem to be there. Any suggestions as to why this might be the case? Or are there other interpretations of the bimodal distribution? phy_screenshot_20240722182757_AmplitudeView

phy_screenshot_20240722183414_WaveformView

phy_screenshot_20240722183114_WaveformView

  2. The default nt=61 works for our data, but since our sampling frequency is 20 kHz, I'm hoping to use nt=41 or even lower values. This, however, leads to a CUDA out-of-memory error. Reducing the number of samples does fix this, but I'm still not sure why a lower nt value would take up more memory.

  3. There are some units that we believe should be merged, like the ones shown below. What parameters could I change to encourage this merging? I tried increasing 'ccg_threshold', but that does not seem to do the trick. Screenshot 2024-07-22 at 19 03 57

Could I ask for some advice on these issues? Thank you!

jacobpennington commented 1 month ago

Hello,

  1. That is the correct interpretation. You may be able to see a smaller peak if you draw a lasso around the points for one of the peaks (Ctrl + left-click to add vertices) and then use "edit > split" to make two clusters. Either way, if you're only seeing something like this rarely, it's not a big concern; errors like that can happen.

  2. Unclear why reducing nt would increase memory usage, but you also don't need to reduce it. Sorting should work fine with nt = 61, there will just be some extra zero (or close to 0) values at either end of the waveform.

  3. If you see this happening a lot, it's possible some parameter changes could be needed. If it's just a few clusters, those same parameter changes could mess up other split/merge decisions, so it's best to leave it as-is unless you notice a consistent trend. It's not clear why that particular pair wasn't merged.
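To make the point about nt concrete, the waveform window in milliseconds is just nt divided by the sampling rate. The snippet below is a minimal arithmetic sketch; the helper name `waveform_window_ms` is hypothetical, and the 30 kHz row assumes the default nt = 61 was tuned for 30 kHz recordings, which this thread does not state.

```python
def waveform_window_ms(nt: int, fs: float) -> float:
    """Duration in milliseconds covered by nt samples at sampling rate fs (Hz)."""
    return 1000.0 * nt / fs


# (nt, fs) pairs: the 30 kHz case is an assumed reference point,
# the 20 kHz cases correspond to the values discussed above.
for nt, fs in [(61, 30_000), (61, 20_000), (41, 20_000)]:
    print(f"nt={nt:3d} at {fs / 1000:.0f} kHz -> {waveform_window_ms(nt, fs):.2f} ms")
```

At 20 kHz, nt = 61 spans about 3.05 ms versus about 2.05 ms for nt = 41, so keeping the default only adds roughly half a millisecond of near-zero samples on each side of the waveform.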

Tingchen-G commented 1 month ago

Hi,

Thank you for your reply! For the bimodal amplitude distribution, we tried splitting the unit into two clusters, but the smaller peak is still not visible in the waveform view, even though the amplitude view shows a large difference. The waveforms of the two clusters even overlap completely in templates high-pass. So we are still not sure what could be causing this.

Screenshot 2024-07-30 at 16 40 01

Screenshot 2024-07-30 at 15 41 17

jacobpennington commented 1 month ago

@Tingchen-G Sorry for the delay. To help debug this, can you please share a screenshot of the autocorrelogram for that unit before splitting, and/or the cross-correlograms after splitting? Also, can you please clarify if you're seeing this happen often or if it's just one or two units that you noticed?
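For readers who want to check such correlograms outside phy, here is a minimal NumPy sketch. It assumes spike times in seconds for two clusters; the function name, bin size, and window are hypothetical choices, and this is not how phy or Kilosort compute their correlograms.

```python
import numpy as np


def cross_correlogram(times_a, times_b, bin_size=0.001, window=0.05):
    """Histogram of lags (times_b - times_a) within +/- window seconds.

    Passing the same array twice gives an autocorrelogram; in that case the
    bin containing lag 0 also counts each spike paired with itself.
    """
    times_a = np.sort(np.asarray(times_a, dtype=float))
    times_b = np.sort(np.asarray(times_b, dtype=float))
    n_bins = int(round(2 * window / bin_size))
    edges = np.linspace(-window, window, n_bins + 1)
    counts = np.zeros(n_bins, dtype=int)
    for t in times_a:
        # Only look at spikes of B that fall inside the lag window around t.
        lo = np.searchsorted(times_b, t - window, side="left")
        hi = np.searchsorted(times_b, t + window, side="right")
        counts += np.histogram(times_b[lo:hi] - t, bins=edges)[0]
    return edges, counts


# Toy example: every spike in B follows a spike in A by 0.5 ms.
a = np.array([0.1, 0.2])
b = a + 0.0005
edges, counts = cross_correlogram(a, b, bin_size=0.001, window=0.005)
```

A sharp peak at a fixed short lag between two split clusters, as in this toy example, suggests the two sets of spikes are tightly time-locked rather than coming from independent neurons.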

Tingchen-G commented 1 month ago

Sure! Here are the ACG and CCG screenshots. This happens for 2-3 units per recording, but many of the recordings we analysed have this issue.

phy_screenshot_20240808145328_CorrelogramView phy_screenshot_20240808145318_CorrelogramView

jacobpennington commented 1 month ago

Thanks. Based on the correlograms, it looks like this is just a case of Kilosort learning the residual of one template as a separate template, since they're so strongly correlated. Some quirk with scaling for computing amplitudes is probably causing the smaller values. Ultimately if this is only happening for a couple of units it's really not a problem. The smaller amplitude peak is only ~2% of the detected spikes for that unit, which is an acceptable error that we expect to see occasionally as a tradeoff for automating the sorting.
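The ~2% figure can be checked for any unit by counting how many of its spike amplitudes fall below a threshold placed between the two peaks. The sketch below uses synthetic amplitudes, not data from this issue; the function name, the threshold, and the distribution parameters are all hypothetical.

```python
import numpy as np


def lower_mode_fraction(amplitudes, threshold):
    """Fraction of spikes whose amplitude lies below the given threshold."""
    amplitudes = np.asarray(amplitudes, dtype=float)
    return float(np.mean(amplitudes < threshold))


# Synthetic bimodal unit: 980 spikes in the main peak, 20 in a smaller one.
rng = np.random.default_rng(0)
amps = np.concatenate([
    rng.normal(100.0, 5.0, size=980),  # main amplitude peak (arbitrary units)
    rng.normal(40.0, 5.0, size=20),    # smaller amplitude peak
])
frac = lower_mode_fraction(amps, threshold=70.0)
print(f"{100 * frac:.1f}% of spikes fall in the lower-amplitude peak")
```

In practice the threshold would be chosen by eye from the amplitude view, and a small fraction for only a few units is consistent with the acceptable error rate described above.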