btel / SpikeSort

Spike sorting library implemented in Python/NumPy/PyTables
http://spike-sort.readthedocs.org

Speed up align_spikes and extract_spikes via Cython #49

Closed by cpcloud 12 years ago

cpcloud commented 12 years ago

extract_spikes and align_spikes are brutally slow right now. I think it would be useful to try speeding up the loops in these functions with Cython. Of course, if I'm missing some way to make this faster using, e.g., PyTables' (somewhat) fast I/O, please let me know.

btel commented 12 years ago

Could you describe the problem in more detail? What format is your data in? How long are the recordings? Do you use disk caching (such as memmapped arrays)?

cpcloud commented 12 years ago

Sure. Data are in PyTables format (HDF5); an entire recording is about 14,000,000 sample points at 24414.1 Hz. I do not use disk caching.
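For a sense of scale, the figures above can be turned into a duration and a rough in-memory size. This is only back-of-the-envelope arithmetic: the thread does not state the sample dtype or the number of channels, so the 8-byte float64 and the single channel below are assumptions.

```python
# Figures quoted in the comment above.
n_samples = 14_000_000
sampling_rate_hz = 24414.1

# Assumptions (not stated in the thread): float64 samples, one channel.
bytes_per_sample = 8
n_channels = 1

duration_s = n_samples / sampling_rate_hz
size_mb = n_samples * bytes_per_sample * n_channels / 1e6

print(f"duration: {duration_s:.1f} s ({duration_s / 60:.1f} min)")
print(f"uncompressed size: {size_mb:.0f} MB per {n_channels} channel(s)")
```

So the recording is a bit under ten minutes long, and under these assumptions a single channel occupies roughly 112 MB uncompressed.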

btel commented 12 years ago

Do you use compressed arrays?

cpcloud commented 12 years ago

Yep.

btel commented 12 years ago

I guess this might be the bottleneck. The array-based operations used in extract_spikes and align_spikes are close to C performance, so using Cython might not change much.

Make sure that your arrays are in C order (the fastest-changing index is in the last dimension). You can also try different compression algorithms and levels (I think that blosc is quite fast). Digital filtering also slows things down considerably.
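A quick way to check the memory order of an in-memory NumPy array is sketched below. The (channels, samples) layout and the array sizes are assumptions chosen for illustration, not the actual SpikeSort data layout.

```python
import numpy as np

# Hypothetical layout: 4 channels x 1000 samples, deliberately stored
# in Fortran (column-major) order to show the problem case.
data = np.asfortranarray(np.arange(4000, dtype=np.float64).reshape(4, 1000))
print(data.flags["C_CONTIGUOUS"])    # False: the first index changes fastest

# Copy into C (row-major) order: the last index now changes fastest,
# so a window of consecutive samples from one channel is contiguous.
data_c = np.ascontiguousarray(data)
print(data_c.flags["C_CONTIGUOUS"])  # True

# Byte step between consecutive samples of channel 0 in each layout.
print(data[0, 100:150].strides)      # (32,): strided access
print(data_c[0, 100:150].strides)    # (8,): contiguous access
```

With the C-ordered copy, slicing a spike window out of one channel touches one contiguous run of memory instead of every fourth element.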

Edit: I should be more precise regarding the performance. I think that the bottleneck in spike extraction is actually the data access and not the CPU operations. Therefore, I guess that Cython optimization won't help much. PyTables can make things a bit faster thanks to the compression, but in my experience the gain is not huge (if any).
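One way to test whether data access or CPU work dominates is to time the two separately. The sketch below is not SpikeSort code: `time_read_vs_compute` is a hypothetical helper, it reads a raw float64 file through `np.memmap` rather than a PyTables array, and a simple upward threshold-crossing count stands in for the real extraction step.

```python
import os
import tempfile
import time

import numpy as np


def time_read_vs_compute(path, n_samples, chunk=1_000_000, threshold=3.0):
    """Return (read seconds, compute seconds, crossings) for a raw float64 file."""
    raw = np.memmap(path, dtype=np.float64, mode="r", shape=(n_samples,))
    t_read = t_cpu = 0.0
    n_crossings = 0
    for start in range(0, n_samples, chunk):
        t0 = time.perf_counter()
        block = np.array(raw[start:start + chunk])  # copy forces the actual read
        t1 = time.perf_counter()
        # Stand-in for CPU work: count upward threshold crossings in the
        # block (crossings that straddle a chunk boundary are ignored).
        above = block > threshold
        n_crossings += int(np.count_nonzero(above[1:] & ~above[:-1]))
        t2 = time.perf_counter()
        t_read += t1 - t0
        t_cpu += t2 - t1
    del raw  # release the memory map before the file is removed
    return t_read, t_cpu, n_crossings


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.dat")
    np.random.default_rng(0).standard_normal(2_000_000).tofile(demo_path)
    t_read, t_cpu, n = time_read_vs_compute(demo_path, 2_000_000)
    print(f"read: {t_read:.3f} s, compute: {t_cpu:.3f} s, crossings: {n}")
```

Note that a freshly written file is usually still in the operating system's page cache, so the read time in this demo will look optimistic. On a real recording, run it against data that has not been read recently; if the read time dominates, speeding up the loops is unlikely to help much.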

cpcloud commented 12 years ago

Yep, you're right. Unless there's some efficient way to access PyTables Arrays from Cython, I think I will just have to upgrade my 4-year-old MacBook!