Hi, how many filters are you running it with? Have you tried clearing the GPU just before the final step? You can do so with gpuDevice(1). You can also lower the batch size to half its default.
The RAM parameter is for system RAM, not GPU RAM.
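For example, something like this (a minimal sketch, assuming the standard config fields ops.NT and ops.ntbuff):

gpuDevice(1);                    % reset the selected GPU, freeing its memory
ops.NT = 32*1024 + ops.ntbuff;   % half the default batch size (64*1024 + ops.ntbuff)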
I changed batch size and added gpuDevice(1) before final pass and the run completed fine. Thanks!
Hi Marius,
I am encountering a similar issue. I am trying to run Kilosort2 on a 384-channel, 2.5-hour-long recording (~200 GB of data), and my GPU struggles to handle it.
The GPU memory error occurs during the splitting step:
...
Found 919 splits, checked 501/1479 clusters, nccg 272
Error using gpuArray/filter
Out of memory on device. To view more detail about available memory on the GPU, use
'gpuDevice()'. If the problem persists, reset the GPU by calling 'gpuDevice(1)'.
Error in my_conv2 (line 47)
S1 = filter(gaus, 1, cat(1, S1, zeros([tmax, dsnew2(2)])));
Error in splitAllClusters (line 56)
clp = clp - my_conv2(clp, 250, 1);
Error in master_kilosort (line 43)
rez = splitAllClusters(rez, 1);
Error in metamaster_MB (line 39)
master_kilosort(datasets{1}{1}, datasets{1}{2}, datasets{1}{3});
I initially used the tip above and reset the GPU before every stage of the master script (i.e. before clusterSingleBatches, learnAndSolve8b, find_merges, splitAllClusters(1), and splitAllClusters(0)).
Predictably, this led to another error, this time at the merging step:
...
merged 136 into 137
Error using gpuArray/subsref
The data no longer exists on the device.
Error in splitAllClusters (line 20)
[~, iW] = max(abs(rez.dWU(ops.nt0min, :, :)), [], 2);
Error in master_kilosort (line 51)
rez = splitAllClusters(rez, 1);
Error in metamaster_MB (line 45)
master_kilosort(datasets{i}{1}, datasets{i}{2}, datasets{i}{3});
So my question is: at which stages of the master script can we safely reset the GPU with gpuDevice(1), so as to clear memory without throwing away data needed by the subsequent step?
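In case it clarifies the question, this is the kind of workaround I have in mind, as a rough, untested sketch (it assumes the gpuArray data lives in top-level fields of rez; nested structs such as rez.ops would need their own pass):

fn = fieldnames(rez);
for k = 1:numel(fn)
    if isa(rez.(fn{k}), 'gpuArray')
        rez.(fn{k}) = gather(rez.(fn{k}));  % copy back to system RAM
    end
end
gpuDevice(1);  % reset should now be safe: nothing live remains on the device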
Thanks a lot for all the hard work!
PS: I also reduced the batch size to the minimum workable value, i.e. half the default setting, 32*1024 + ops.ntbuff. If I go any lower, I end up without enough spikes per batch, which is incompatible with the drift correction (see this issue and the error below):
Error using gpuArray/eig
Input matrix contains NaN or Inf.
Error in svdecon (line 23)
[U,D] = eig(C);
Error in sortBatches2 (line 7)
[u, s, v] = svdecon(ccb0);
Error in clusterSingleBatches (line 150)
[ccb1, iorig] = sortBatches2(ccb0);
Error in MasterKiloSortLJB (line 59)
rez = clusterSingleBatches(rez);
PPS: I also needed to install extra RAM in my machine to get this far; Kilosort cannot process very long recordings with only 32 GB of system RAM. I am not sure where the hardware recommendations live, but this could be stated there.
FYI, here is the output of gpuDevice() right after the crash during the splitting step (there seems to be plenty of AvailableMemory left, but I am not sure I am interpreting that correctly):
gpuDevice()
ans =
CUDADevice with properties:
Name: 'GeForce GTX 1080 Ti'
Index: 1
ComputeCapability: '6.1'
SupportsDouble: 1
DriverVersion: 10.1000
ToolkitVersion: 10.1000
MaxThreadsPerBlock: 1024
MaxShmemPerBlock: 49152
MaxThreadBlockSize: [1024 1024 64]
MaxGridSize: [2.1475e+09 65535 65535]
SIMDWidth: 32
TotalMemory: 1.1811e+10
AvailableMemory: 9.2967e+09
MultiprocessorCount: 28
ClockRateKHz: 1582000
ComputeMode: 'Default'
GPUOverlapsTransfers: 1
KernelExecutionTimeout: 0
CanMapHostMemory: 1
DeviceSupported: 1
DeviceSelected: 1
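For the record, a compact way to log just the memory numbers between stages (my own snippet, not Kilosort code):

g = gpuDevice;
fprintf('GPU %s: %.2f / %.2f GB free\n', g.Name, g.AvailableMemory/1e9, g.TotalMemory/1e9);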
I have a question about the final matching pass. Why do I only get 'batch 1/87, NTOT 114', and why did Kilosort not detect any spikes when I run it on my own data?
Hi
I'm having a bit of trouble running Kilosort to completion on a ~25 GB, 512-channel dataset. I am on Linux (Ubuntu 16.04, with NVIDIA 410.78 drivers, MATLAB R2018b, Titan XP 12 GB). Kilosort used to run fine on R2017b and Ubuntu 16.04, and it also runs fine on smaller datasets.
As far as I can tell, GPU memory is not being freed before the last batch run (there appears to be ~4 GB of GPU RAM still allocated after the crash; not sure if that's a clue). I tried lowering the GPU memory parameter (ops.ForceMaxRAMforDat = 10e7), but I seem to get the same errors. The CPU-only version works fine.
The crash log is below, any advice is appreciated.
Thanks! Cat