MouseLand / Kilosort

Fast spike sorting with drift correction for up to a thousand channels
https://kilosort.readthedocs.io/en/latest/
GNU General Public License v3.0

Kilosort4 takes a lot of time to do the sorting #630

Closed Lorena-J closed 2 months ago

Lorena-J commented 2 months ago

Describe the issue:

I am trying to analyse a 32 GB recording with Kilosort4. In Kilosort 3 this takes me about 2 hours, but Kilosort4 has been running for 5 hours and still has not finished. I don't know what could be causing the slowdown. I have 32 GB of RAM and a 512 GB hard disk.

jacobpennington commented 2 months ago

What kind of GPU are you using, if any?

Lorena-J commented 2 months ago

I'm using the Nvidia GTX3060

marius10p commented 2 months ago

@jacobpennington This might be related to issue #635.

jacobpennington commented 2 months ago

@Lorena-J We'll need more information to figure this out. Can you give us some details about the data and probe you're using?

MarcoUniPr commented 2 months ago

Hi, I'm encountering a similar issue. In Kilosort 3, sorting the same 32-channel dataset with a linear probe (32x1) takes about 4 minutes. Here are some parameters I used:

I'm using an NVIDIA GeForce RTX 3050 Ti Laptop GPU. In Kilosort 3, I can find very nice single units, and I can follow them clearly in Phy.

However, when I try the same dataset in Kilosort 4 with the same parameters, it takes hours.

How is this possible?

Thanks for any solutions you can share.

MarcoUniPr commented 2 months ago

Furthermore, if I use a different threshold, for example [9 8] or [9 6], I detect only a small subset of units. In other words, I may lose clear single units (SUA) in some channels.

marius10p commented 2 months ago

@MarcoUniPr This is a different issue, related to #638, so please see what we wrote there. A second threshold of 1 should not work well under any circumstances, because it means events are detected at pretty much every timepoint. If there is something unusual about your data, please open a separate issue and include pictures comparing thresholds of [9, 1] and [9, 6] that illustrate the single units you lost that way.
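As a rough illustration of why a threshold of 1 fires almost everywhere, here is a toy model only. It assumes the detection score on pure noise behaves like a unit-variance Gaussian, which is a simplification and not Kilosort's actual detection statistic; `fraction_above` is a hypothetical helper, and the pair [9 8] is assumed to correspond to the `Th_universal` and `Th_learned` settings.

```python
import math

def fraction_above(threshold):
    """One-sided tail probability of a standard normal: P(score > threshold).

    Toy assumption: noise-only scores are unit-variance Gaussian.
    """
    return 0.5 * math.erfc(threshold / math.sqrt(2))

# Compare a very low threshold with the values discussed in the thread.
for th in (1, 6, 8, 9):
    print(f"threshold {th}: fraction of noise samples above = {fraction_above(th):.3g}")
```

Under this assumption, roughly one noise sample in six crosses a threshold of 1, while crossings at 6 or above are vanishingly rare, so units found only with the lower threshold deserve a careful look.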

Lorena-J commented 2 months ago

Hi @jacobpennington, thanks for responding! I am trying to process recordings made with a Cambridge Neurotech H9 probe with 64 electrodes on one shank. The data takes up about 32 GB after I convert the .continuous files from Open Ephys to the .bin format that Kilosort needs. I switched hard disks so I could use the new one as swap, and I now have 2 TB, but the sorting is still running without finishing.

jacobpennington commented 2 months ago

@Lorena-J I see the problem. The probe you're using has contacts placed in a checkerboard pattern, correct? We found a bug with the automatic dminx calculation for that type of layout. The most recent version of the code gets rid of that behavior, but you should also be able to fix it by setting dminx yourself in the GUI under "extra settings." Try setting it to the lateral spacing between the contacts on your probe.
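For a script-based run, the same idea can be sketched as follows. This is a minimal sketch under stated assumptions: `lateral_spacing` is a hypothetical helper (not part of Kilosort), the x-coordinates are illustrative values for a two-column layout, and the commented-out call assumes the `run_kilosort` API accepts a `dminx` entry in `settings`. Check the parameter name against your installed version.

```python
def lateral_spacing(xc):
    """Smallest gap between distinct contact x-positions (same units as xc)."""
    columns = sorted(set(float(x) for x in xc))
    if len(columns) < 2:
        raise ValueError("need at least two distinct x-positions")
    return min(b - a for a, b in zip(columns, columns[1:]))

# Illustrative x-coordinates for two staggered columns of contacts.
xc = [1.0, 22.5, 1.0, 22.5, 1.0, 22.5]
dminx = lateral_spacing(xc)
print(f"dminx candidate: {dminx} um")

# Assumed usage with Kilosort4 (not executed here; 'probe' and the other
# settings would come from your own setup):
# from kilosort import run_kilosort
# settings = {'n_chan_bin': 64, 'dminx': dminx}
# run_kilosort(settings=settings, probe=probe)
```

Computing the value from the probe coordinates, rather than typing it by hand, keeps the setting consistent with whichever probe file is actually loaded.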

Lorena-J commented 2 months ago

Thank you very much for your help. I ran the analysis with the code modification and it worked. Now I have another problem: what I see in Phy after running the file through Kilosort does not make sense. The waveforms are not coherent, even though the recording was made in prefrontal cortex and we saw a lot of neuronal activity during the experiment itself.

Since the earlier error was caused by the probe layout, I wonder whether our setup could be causing problems too. In our setup the Cambridge Neurotech probe is connected to an adapter, a connector and two Intan units, which are connected to the Open Ephys acquisition board.

To make the channel map, I used the channel correspondence between the probe and the Open Ephys board (after passing through the adapter, connector and Intan units). Because Intan 1 (back) is the first one connected to the Open Ephys acquisition board, its channels come first in the channel map.

I attach both the channel correspondence for the setup and the .prb file that I used for the sorting. Is there an error in doing it this way?

[Attachments: setup, prb file]

jacobpennington commented 2 months ago

@Lorena-J It looks like there might be an issue with the way the .prb file is formatted, if I'm reading your spreadsheet correctly. This is what I get when I load the probe into Kilosort4:

{'chanMap': array([16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 48,
        49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63,  0,  1,
         2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 32, 33, 34,
        35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47]),
 'xc': array([ 1. ,  1. ,  1. , 22.5, 22.5,  1. ,  1. , 22.5, 22.5,  1. ,  1. ,
        22.5, 22.5, 22.5,  1. , 22.5,  1. , 22.5,  1. , 22.5, 22.5, 22.5,
        22.5,  1. , 22.5,  1. ,  1. , 22.5,  1. ,  1. , 22.5, 22.5,  1. ,
         1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,
         1. ,  1. , 22.5,  1. ,  1. , 22.5, 22.5, 22.5, 22.5, 22.5, 22.5,
        22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5, 22.5,  1. ],
       dtype=float32),
 'yc': array([2.2500e+02, 5.4000e+02, 4.9500e+02, 5.6250e+02, 4.7250e+02,
        5.8500e+02, 4.5000e+02, 6.0750e+02, 4.2750e+02, 6.3000e+02,
        4.0500e+02, 6.5250e+02, 3.8250e+02, 6.9750e+02, 3.6000e+02,
        7.4250e+02, 4.5000e+01, 5.1750e+02, 2.7000e+02, 2.4750e+02,
        2.2500e+01, 1.5750e+02, 6.7500e+01, 9.0000e+01, 2.9250e+02,
        3.1500e+02, 1.0000e+00, 3.3750e+02, 1.8000e+02, 1.3500e+02,
        2.0250e+02, 1.1250e+02, 9.4500e+02, 1.1250e+03, 9.9000e+02,
        1.0800e+03, 1.3950e+03, 7.2000e+02, 1.3500e+03, 1.3050e+03,
        9.0000e+02, 8.5500e+02, 1.0350e+03, 7.6500e+02, 1.2150e+03,
        1.2600e+03, 1.1025e+03, 8.1000e+02, 6.7500e+02, 7.8750e+02,
        1.4175e+03, 8.3250e+02, 1.3725e+03, 8.7750e+02, 1.3275e+03,
        9.2250e+02, 1.2825e+03, 9.6750e+02, 1.2375e+03, 1.0125e+03,
        1.1925e+03, 1.0575e+03, 1.1475e+03, 1.1700e+03], dtype=float32),
 'kcoords': array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
        1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], dtype=float32),
 'n_chan': 64}

This says, for example, that the first row of the data should correspond to the 17th probe channel and has x-position 1 um and y-position 225 um (using the first value in chanMap, xc, and yc). But in your spreadsheet, the 17th channel should have x-position 1 um and y-position 945 um. I'm guessing it has something to do with how the values are entered into the .prb file, but I'm not sure. Have you tried creating a .json probe file instead, following "Creating a Kilosort4 probe dictionary"? That would probably be an easier way to make sure that Kilosort is interpreting your probe correctly.
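A probe dictionary with the same keys as the printout above can be sketched in plain Python. This is a minimal sketch, not the documented procedure: `build_probe` and `describe` are hypothetical helpers, the three rows are placeholder values (a real probe would list all 64 contacts from the datasheet and wiring table), and writing the file with the standard `json` module is an assumption about what Kilosort4 can load. The Kilosort documentation describes its own helper for saving probes.

```python
import json
import os
import tempfile

def build_probe(rows):
    """Build a probe dictionary from (channel_index, x_um, y_um) tuples.

    Entry i of chanMap, xc and yc all describe the same contact.
    """
    return {
        'chanMap': [int(ch) for ch, _, _ in rows],
        'xc': [float(x) for _, x, _ in rows],
        'yc': [float(y) for _, _, y in rows],
        'kcoords': [1.0] * len(rows),  # single shank: one group for all contacts
        'n_chan': len(rows),
    }

def describe(probe, i):
    """One line per entry, for checking against the wiring spreadsheet."""
    return (f"entry {i}: chanMap={probe['chanMap'][i]}, "
            f"x={probe['xc'][i]} um, y={probe['yc'][i]} um")

# Placeholder rows; replace with the values from your own spreadsheet.
rows = [(16, 1.0, 945.0), (17, 1.0, 1125.0), (18, 1.0, 990.0)]
probe = build_probe(rows)
for i in range(probe['n_chan']):
    print(describe(probe, i))

# Write the dictionary as JSON (file name is arbitrary).
path = os.path.join(tempfile.gettempdir(), 'example_probe.json')
with open(path, 'w') as f:
    json.dump(probe, f)
```

Printing one line per entry makes it easy to compare the channel order and coordinates against the spreadsheet before running the sorter.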

Lorena-J commented 2 months ago

Thank you so much for all the help and for the speed of response!!! it worked perfectly!!! By changing the channel map the way you told me I was able to visualise the neurons and their activity, again, thank you very much!!!