Closed cpcloud closed 12 years ago
Could you describe the problem in more detail? What format is your data in? How long are the recordings? Do you use disk caching (such as memmapped arrays)?
Sure. The data are in PyTables format (HDF5); an entire recording is about 14,000,000 sample points at 24414.1 Hz. I do not use disk caching.
Do you use compressed arrays?
Yep.
I guess this might be the bottleneck. The array-based operations used in extract_spikes and align_spikes are close to C performance, so using Cython might not change much.
Make sure that your arrays are in C order (fastest-changing indexes are in the last dimension). You can also try different compression algorithms and levels (I think that blosc is quite fast). Digital filtering also slows things down considerably.
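As a minimal sketch of the C-order point above (array shapes and sizes here are made up for illustration): NumPy can tell you whether an array is C-contiguous, and `np.ascontiguousarray` will convert it, copying only when necessary.

```python
import numpy as np

# Sketch: arrays should be C-ordered (fastest-changing index in the last
# dimension) so scanning along the time axis touches contiguous memory.
a = np.asfortranarray(np.zeros((1000, 64)))   # Fortran order: slow row scans
assert not a.flags['C_CONTIGUOUS']

# np.ascontiguousarray copies only if needed and returns a C-ordered array.
b = np.ascontiguousarray(a)
assert b.flags['C_CONTIGUOUS']
```

Checking `arr.flags['C_CONTIGUOUS']` once, up front, is cheap insurance before running an extraction loop over a large recording.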
Edit: I should be more precise about the performance. I think the bottleneck in spike extraction is actually data access, not the CPU operations, so I guess that Cython optimization won't help much. PyTables can make things a bit faster thanks to compression, but in my experience the gain is not huge (if any).
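If data access is the bottleneck, one mitigation is to read the recording in large contiguous chunks rather than sample by sample, so each disk access is amortized over many samples. A rough sketch (using a `numpy.memmap` file as a stand-in for the HDF5 array; the file name, chunk size, and threshold-crossing count are all hypothetical):

```python
import os
import tempfile

import numpy as np

# Stand-in for the on-disk recording: a flat float32 file.
path = os.path.join(tempfile.mkdtemp(), "signal.dat")
np.random.default_rng(0).standard_normal(1_000_000).astype(np.float32).tofile(path)

sig = np.memmap(path, dtype=np.float32, mode="r")

chunk = 100_000       # read size; tune to the HDF5 chunk shape if applicable
threshold = 2.0       # hypothetical spike-detection threshold
n_crossings = 0
for start in range(0, sig.size, chunk):
    block = np.asarray(sig[start:start + chunk])   # one bulk read per chunk
    n_crossings += int((block > threshold).sum())  # vectorized work per chunk
print(n_crossings)
```

The same pattern applies to a PyTables array: slicing `node[start:stop]` pulls one contiguous block into memory, after which the per-chunk work is ordinary vectorized NumPy.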
Yep, you're right. Unless there's some efficient way to access PyTables Arrays from Cython, I think I will just have to upgrade my 4-year-old MacBook!
extract_spikes and align_spikes are brutally slow right now. I think it would be useful to attempt to speed up the loops in these functions using Cython. Of course, if there's something I'm missing about how to make this faster using, e.g., PyTables' (somewhat) fast I/O, then please let me know.