MouseLand / Kilosort

Fast spike sorting with drift correction
https://kilosort.readthedocs.io/en/latest/
GNU General Public License v3.0

Multiple Issues After Running KS4 #787

Open EmmaCondon opened 2 months ago

EmmaCondon commented 2 months ago

Describe the issue:

Hello, I am encountering a number of issues with KS4 output that I have not seen before. I would like clarification before moving forward with spike sorting. If you prefer I make these separate issues please let me know.

Background

I am new to the KS4 GUI and to Neuropixels. Previously I ran automatic spike sorting with KS2 and KS3 and manually curated with Phy2 on the following probe types: Cambridge Neurotech 16-channel single shank, NeuroNexus 32- and 64-channel single shank, and NeuroNexus 128-channel multi shank probes. As I am new to KS4 and Neuropixels, I am unsure if what I am seeing is normal, but judging from the Neuropixels Slack and other issues here on GitHub, my results do not look right. Additionally, these results are consistent across a number of recordings.

I am running KS4 via the GUI on acute recordings from NP1 probes (lasting approx. 2 hours), without changing any parameters, and using the phase3B1 channel map. Here is a view of the GUI after I load the recording. I have not encountered any errors while running KS4. Following automatic spike sorting with KS4, I am manually curating in Phy2.

Screenshot: KS4 GUI

Issue 1: Overmerging of units

During manual curation in Phy2 I am seeing very clearly in amplitude view and feature view (and sometimes in waveform view) that a unit labelled as good or MUA by KS4 is made up of two separate units. You can see that after splitting, the autocorrelogram and waveforms of the units are different.

Screenshots: clearly 2 units before split; clearly 2 units after split

Here is another example before and after splitting.

Screenshots: 2 units before split 4; 2 units after split 4

Screenshots: 2 units before split 6; 2 units after split 6

However, in other cases when I split, the CCGs and ACGs are the same, indicating that they are the same unit.

Screenshots: unsure 2 units before split; unsure 2 units after split

Question: Could you please confirm that this is indeed overmerging and that splitting here is the best course of action? I have a lot of units like this and splitting them is taking a long time. My PI suggests that these are not actually 2 different units and that splitting should not be done, so clarification would be helpful.

Issue 2: Amplitude and PCs appearing as single line and dot

Another very common issue I am having is the appearance of the amplitude as a single flat line, with a single dot appearing in feature view. Here are some examples:

Screenshots: amplitude as line 2; amplitude as line

Sometimes these lines overlap and, when split, are two different units that still appear as a line in amplitude and feature views.

Screenshots: PC as cross before split; PC as cross after split

Question: Are these units still considered acceptable? My thinking is that this is just a scaling issue with Phy, but I have not seen it mentioned in the issues here.

Issue 3: Waveform appearing on incorrect channel

In a number of different recording sessions, the channel number assigned to a unit does not match the channel shown in waveform view.

In the example below, the unit is labelled as 137, but in waveform view there is no waveform on 137; it appears clearly on 133.

Screenshot: wrong channel labels

Here is another example, with channel 133 labelled in waveform view as 129.

Screenshot: wrong channel labelled MUA

Question: Is this due to an issue with the channel mapping? I hesitate to classify these units as I do not want a mismatch in the channel numbers when analysing the data post spike sorting. I have seen an issue raised here (https://github.com/cortex-lab/phy/issues/866#issue-474267297) but it relates to KS2 and I am unsure if it applies to KS4 run via the GUI.

Issue 4: Units labelled as good by KS4 actually MUA

I know that incorrect labelling is common in automatic spike sorting. However, well over half of the units that KS4 has labelled as 'good' are actually MUA and have a clearly contaminated refractory period. For example, in a recording session in which KS4 assigned 517 units as 'good', only 167 units are 'good' following manual curation in Phy, with the remainder having a refractory period contaminated by more than 20%.

Here are a few examples.

Screenshots: KS4 labelled good but MUA 1; KS4 labelled good but MUA 2; KS4 labelled good but MUA 6; KS4 labelled good but MUA 7

Question: Has this issue been reported elsewhere? It seems KS4 is over-classifying units as 'good' even when they have clearly contaminated refractory periods. I have seen this previously while spike sorting (not using NP, and using KS2) but never to the magnitude I am seeing now.
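One rough way to put a number on refractory-period contamination, such as the 20% figure quoted above, is to count inter-spike intervals (ISIs) that are shorter than an assumed refractory period. The following is a minimal sketch only; it is not how Kilosort or Phy compute their own contamination estimates. The function name, the 2 ms refractory period, and the toy spike times are all assumptions made for illustration.

```python
from typing import Sequence


def isi_violation_fraction(spike_times_s: Sequence[float],
                           refractory_s: float = 0.002) -> float:
    """Fraction of inter-spike intervals shorter than `refractory_s`.

    A crude proxy for refractory-period contamination. The 2 ms default
    is an assumed value, not one taken from Kilosort or Phy.
    """
    times = sorted(spike_times_s)
    if len(times) < 2:
        return 0.0  # no intervals to evaluate
    isis = [later - earlier for earlier, later in zip(times, times[1:])]
    violations = sum(1 for isi in isis if isi < refractory_s)
    return violations / len(isis)


# Toy spike times in seconds (made up for illustration only).
toy_spikes = [0.000, 0.001, 0.050, 0.051, 0.200, 0.400]
toy_fraction = isi_violation_fraction(toy_spikes)
print(f"ISI violation fraction: {toy_fraction:.2f}")
```

In the toy data, two of the five intervals are about 1 ms long, so they count as violations. A real unit would need its spike times in seconds, for example spike sample indices divided by the sampling rate.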

Reproduce the bug:

No response

Error message:

No response

Version information:

Operating System: Linux Ubuntu 18.04.6
Kilosort: 4.0.6
Python: 3.9.19
Phy: 2.0b5

jacobpennington commented 1 month ago

@EmmaCondon From your GUI screenshot, it looks like there's an issue with the data. There is a lot of coordinated noise across channels (notice the horizontal bands that all look very similar). That is most likely why a lot of the units in your phy screenshots have waveforms that don't look like spikes. You would need to address that problem with the data first, otherwise the sorting results are not going to be interpretable.

EmmaCondon commented 1 month ago

Hi @jacobpennington, thank you for the speedy reply, and my sincerest apologies that I am only responding now! Yes, I see now that a number of the images I sent are from a recording with some noise. The noise was present because I ran KS4 across the whole recording, which included 15 mins where the probe was settling prior to behaviour starting.

I have edited the original post, replacing the Phy images with updated images which do not include the settling period (tmin=900). The same issues are still occurring.

jacobpennington commented 1 month ago

Issue 1: I'm going to wait for some input from @marius10p before trying to answer whether that's a case of overmerging or not. There are several things that might be going on there.

Issue 2: This looks like some kind of recording artifact with a fixed amplitude, maybe electrical noise from equipment, that got merged with some MUA spikes.

Issue 3: My understanding is this will happen if your channel map is something other than [0, 1, 2, ... N] for the reason mentioned here:

The numbers displayed under "channel" in cluster view seems to be the index that points onto a physical recording site that is stored in the "channel_map.npy". However the numbers displayed in waveform view are the converted recording sites (i.e. "channel_map(channel)"). It would maybe make sense to convert the "channel" numbers in cluster view similarly for better comparison.

In other words, in your first example channel 133 would be the channel at index 137 in the channel map. You could confirm that by looking at your probe file.
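The index-versus-site distinction described above can be sketched in a few lines. This is an illustrative example under stated assumptions: the helper names and the toy map are hypothetical, and the commented-out loading step assumes the channel_map.npy file mentioned in the quote sits in the current directory.

```python
from typing import List, Sequence, Tuple


def site_at_index(channel_map: Sequence[int], index: int) -> int:
    """Recording-site number stored at position `index` of the channel map."""
    return int(channel_map[index])


def mismatched_entries(channel_map: Sequence[int]) -> List[Tuple[int, int]]:
    """(index, site) pairs where the map differs from [0, 1, 2, ... N]."""
    return [(i, int(site)) for i, site in enumerate(channel_map)
            if int(site) != i]


# Toy map (made up): sites 3 and 4 are swapped relative to their indices.
toy_map = [0, 1, 2, 4, 3, 5]
toy_mismatches = mismatched_entries(toy_map)
print(toy_mismatches)

# With real sorting output, the map could be loaded first, for example:
#   import numpy as np
#   channel_map = np.load("channel_map.npy").ravel()
```

If the list of mismatches is empty, the channel map is the identity and the two views should show the same numbers. Otherwise, site_at_index(channel_map, 137) would give the site that the waveform view displays for the unit listed under channel 137.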

Issue 4: At least a couple of those examples look like they contain the same recording artifact I mentioned in response to 2. I would recommend talking to your PI to figure out what might be causing that and see if there's a way to remove it from your data prior to sorting, which would probably clear up a lot of the mislabeling. The first example also looks like it's contaminated with noise of some kind (notice the periodicity of those clear peaks in the correlogram), and the fourth dropped out less than halfway into the recording. Based on the other spikes in the plot, it looks like something pretty drastic happened during the experiment around t=4000 to cause that. Regardless, drop-out will make classification more difficult.

jacobpennington commented 1 month ago

Sorry, a couple updates after looking at this again:

The periodic peaks in the correlograms do actually look reasonable, they're probably from real oscillatory activity. As for the strange amplitude distributions, I just noticed that it's centered at 0 and actually includes negative amplitude values, so there's something pretty weird going on there. It might be from some other issue rather than a recording artifact.

Can you please include a screenshot of what the data looks like in the GUI after you excluded the settling period, and upload kilosort4.log from the results directory?
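The observation that an amplitude distribution is centered at 0 and includes negative values can be checked numerically once a unit's spike amplitudes have been extracted from the sorting output. The sketch below assumes the amplitudes are already available as a plain sequence of numbers; the function name and the toy values are hypothetical, and how the amplitudes are obtained is left out.

```python
from statistics import mean
from typing import Dict, Sequence


def amplitude_summary(amplitudes: Sequence[float]) -> Dict[str, float]:
    """Basic statistics for one unit's spike amplitudes."""
    values = [float(a) for a in amplitudes]
    if not values:
        raise ValueError("no amplitudes given")
    negative = sum(1 for v in values if v < 0)
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "fraction_negative": negative / len(values),
    }


# Toy amplitudes (made up), symmetric around zero like the case described.
toy_amplitudes = [-1.5, -0.5, 0.0, 0.5, 1.5]
toy_summary = amplitude_summary(toy_amplitudes)
print(toy_summary)
```

A mean near zero together with a substantial negative fraction would match the unusual pattern described above.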

EmmaCondon commented 1 month ago

Hi @jacobpennington, thank you so much for the replies, I really appreciate your insight and feedback. Of course, please find screenshots of the data in the GUI after the settling period was excluded (whitened and raw versions).

Screenshots: Main GUI Whitened; Main GUI Raw

I actually don't have a kilosort4.log in my results directory?! I've checked the kilosort4 folder of all other recordings I have run through the KS4 GUI and I also do not have kilosort4.log there. Screenshot below showing what I have in my output folder after running KS4 GUI (manually saved images).

Screenshot: KS4 ouput folder

jacobpennington commented 1 month ago

Which version of Kilosort4 are you using? You can check with conda list kilosort in a terminal.

Also, how are you running kilosort when you sort the data? Are you using the GUI or the API? Are you using SpikeInterface?
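As an alternative to conda list, the installed version can also be read from Python's package metadata. This is a small sketch using only the standard library; the helper name is hypothetical, and it assumes the package is installed under the distribution name kilosort (the name used with pip elsewhere in this thread) in the same environment that runs the script.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


kilosort_version = installed_version("kilosort")
print(f"kilosort: {kilosort_version or 'not installed in this environment'}")
```

Running this in the environment used to launch the GUI avoids confusion when several conda environments exist.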


EmmaCondon commented 1 month ago

  • Kilosort Version 4.0.4
  • Running KS4 via the GUI. I have never used Python and am not familiar with it, so I use the GUI.
  • Not using SpikeInterface.

jacobpennington commented 1 month ago

Ah... that could explain a lot. Please update to the latest version of Kilosort (v4.0.18). You can do that with pip install kilosort --upgrade. Then try sorting again, and let me know which problems still show up if any (and upload the log file, which should be there after sorting with the new version).

EmmaCondon commented 1 month ago

Thanks Jacob! I updated to the latest version (nice feature for being able to turn CAR on and off btw).

Before sending you anything from Phy/output, when running Kilosort (v4.0.18) GUI, a few errors occurred:

These errors did not stop KS from running, but I'm wondering if perhaps they can affect the output? I have attached the kilosort4.log. If you could have a look at the errors, I would really appreciate it.

Attachment: kilosort4.log

jacobpennington commented 1 month ago

I'm not sure what would cause the "not responding" error, I'll try to reproduce it. That wouldn't interfere with the sorting, though. As for the other errors, do you have screenshots of them? There are no errors reported in the log.

EmmaCondon commented 1 month ago

I think the not responding error is due to our PC. Yes of course, here are the screenshots of the errors:

Screenshots: Screenshot from 2024-10-10 16-45-35; Screenshot from 2024-10-10 16-44-26

jacobpennington commented 1 month ago

Thanks. Unfortunately those error messages are not very helpful, hah. Two questions:

  1. Can you please tell me what kind of graphics card you're using?
  2. If possible, could you try sorting with device='cpu' and see if you get a more informative warning at those spots? It will take a lot longer to run, but often pytorch will give more useful information when running on CPU for those obscure CUDA errors.

EmmaCondon commented 3 weeks ago

Hi @jacobpennington, once again apologies for the delay in responding!

  1. Intel® HD Graphics 630 (KBL GT2)
  2. Sure, I am running KS4 GUI now overnight with device='cpu'. I'll update you once the sorting is done!

EmmaCondon commented 3 weeks ago

Hi @jacobpennington, unfortunately, running KS4 GUI with device='cpu' was not successful so I do not have any more information about the errors. It was running for 36 hours and crashed the computer.

jacobpennington commented 3 weeks ago

Hmm okay, sorry for the trouble. Are you able to share the data so that I can try sorting it myself? I'm not sure how else to proceed with debugging this.

EmmaCondon commented 3 weeks ago

Yes of course @jacobpennington, thank you! What is the best way to send you the data?

jacobpennington commented 3 weeks ago

Either a google drive or dropbox link. You can paste the link here, or e-mail it to me at jacob.p.neuro@gmail.com