SpikeInterface / spikeinterface

A Python-based module for creating flexible and robust spike sorting pipelines.
https://spikeinterface.readthedocs.io
MIT License

Optimizing spikeinterface speed #2818

Open m-beau opened 6 months ago

m-beau commented 6 months ago

Quick question: Kilosort 4 in a spikeinterface Docker container took about 4 hours to run on a 20-minute-long Neuropixels dataset (Kilosort 2 on MATLAB on the same machine takes about 30 minutes), so I believe that my installation is sub-optimal.

Is running sorters in containers known to slow down the process, or am I doing something wrong? Do I need to check whether all my GPU capabilities are being used somehow (not sure if it could fall back on running on CPU, but that would presumably take much longer than 4 hours, so I don't think that is what is happening)? In general, any tips to accelerate spikeinterface's sorters?

And also, any way to print out a progress bar of some kind? That would be very useful.

Thanks!

zm711 commented 6 months ago

I would suggest benchmarking the Docker speed on Kilosort2. It sounds like locally with MATLAB you are at roughly 1.5x recording time (30 minutes for 20 minutes of data), so see what Docker takes; that should give you a proxy for the Docker slowdown. The SI wrapper in general will also add a slight overhead compared to native KS4 (this is expected, since we are calling KS from spikeinterface).

This lengthening of time gets worse if you are doing things like run_sorter_by_property, because we split your recording so that each "shank" is analyzed separately (so for a four-shank, 20-minute recording you are really sorting 80 minutes of data). This should be a good thing since it provides more isolation, but it means that running this way will be quite a bit slower than running KS4 natively (with a likely accuracy boost). Does that make sense? Basically the question we have to ask is: what specifically are you sorting, and how are you sorting it with spikeinterface?
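For concreteness, here is a rough sketch of the two calls (the data path is a placeholder, and parameter names such as `folder` may differ slightly between spikeinterface versions, so treat this as illustrative only):

```python
import spikeinterface.extractors as se
from spikeinterface.sorters import run_sorter, run_sorter_by_property

rec = se.read_spikeglx("/path/to/neuropixels_run")  # hypothetical data path

# Whole recording in one go (closest to running KS4 natively):
sorting_all = run_sorter(
    sorter_name="kilosort4",
    recording=rec,
    folder="ks4_whole_probe",
    docker_image=True,   # use the default KS4 Docker image
)

# Per-shank sorting: the recording is split on its "group" property, so a
# four-shank, 20-minute recording means ~80 minutes of data going through
# the sorter, one shank at a time.
sorting_by_shank = run_sorter_by_property(
    sorter_name="kilosort4",
    recording=rec,
    grouping_property="group",
    folder="ks4_by_shank",
    docker_image=True,
)
```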

Both run_sorter and run_sorter_by_property accept an optional verbose=True argument to provide more info, but your mileage will vary. For example, if you're running KS2, 2.5, or 3 in Docker, the Python process has to hand the data off to the MATLAB process, and we can't control the progress of the MATLAB side, so no progress bar there. The same is true if you use a Python spike sorter (like MS4 or MS5): it would have to implement its own progress bar for you to see anything during sorting. With verbose=True, a progress bar will be displayed for the spikeinterface steps when possible.
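As a quick illustration of what that looks like (again assuming a recent spikeinterface API and reusing `rec` from the sketch above):

```python
# Sketch: verbose=True makes spikeinterface print what it is doing and show
# its own progress bars where possible; it cannot expose the internal
# progress of a MATLAB-based sorter running inside the container.
sorting = run_sorter(
    sorter_name="kilosort4",
    recording=rec,
    folder="ks4_verbose",
    docker_image=True,
    verbose=True,
)
```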

Did I miss anything @alejoe91?

zm711 commented 6 months ago

I will say that maybe (and we discussed this previously) we should have an option to write a binary and load that into KS, instead of using their RecordingAsArray (or whatever it is called), because that path doesn't use multiprocessing and so might slow the process down a bit. So in the case of someone sorting over the network, it might make sense to just write the binary file wherever they want and then sort from that for KS4, so that we can leverage our n_jobs, no?
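Roughly what I mean, sketched under the assumption that the user writes the local binary copy themselves before calling the sorter (the paths are placeholders, and this is not something run_sorter does for you automatically today):

```python
from spikeinterface.sorters import run_sorter

# Write the (preprocessed) recording to fast local disk with parallel,
# chunked writing, so slow/networked access to the original data only
# happens once; then sort from that local copy.
job_kwargs = dict(n_jobs=8, chunk_duration="1s", progress_bar=True)
rec_local = rec.save(folder="/fast/local/preprocessed", format="binary", **job_kwargs)

sorting = run_sorter(
    sorter_name="kilosort4",
    recording=rec_local,
    folder="ks4_from_binary",
    docker_image=True,
)
```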

m-beau commented 6 months ago

Thanks Zach, I will simply do some careful benchmarking to bring more useful information to the table!

It is a shame about the progress bars; it would be cool to hook into the sorters' stdout (or whatever the MATLAB equivalent would be) to parse their own verbose output and estimate progress. Or, when the sorter already implements one, have it write to a .log file and read the progress back from that file, something like the sketch below.
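Purely as a hypothetical illustration of that second idea (the log file name is made up; nothing like this exists in spikeinterface today):

```python
import time
from pathlib import Path

def tail_progress(log_file, poll_s=5.0, max_idle=60.0):
    """Print the last line of `log_file` whenever it grows; stop after
    `max_idle` seconds without any change (i.e. the sorter is likely done)."""
    log_file = Path(log_file)
    last_size, idle = -1, 0.0
    while idle < max_idle:
        if log_file.exists():
            size = log_file.stat().st_size
            if size != last_size:
                lines = log_file.read_text().splitlines()
                if lines:
                    print(f"[sorter] {lines[-1]}")
                last_size, idle = size, 0.0
                time.sleep(poll_s)
                continue
        idle += poll_s
        time.sleep(poll_s)

# tail_progress("ks4_output/sorter.log")  # hypothetical log file written by the sorter
```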

zm711 commented 6 months ago

That sounds like a fun PR if you want to give it a go! I'm sure we would have users that would be interested in that :)