AllenInstitute / ecephys_spike_sorting

Modules for processing extracellular electrophysiology data from Neuropixels probes

Memory Mapping Implementation #92

zmingmin closed this issue 1 month ago

zmingmin commented 1 month ago

When running the analysis on a 400 GB dataset, the program slows down, likely due to memory allocation issues. Is there a straightforward way to pre-allocate memory during parallel processing to improve performance? We use the default pipeline provided in create_input_json.py.

Thanks for any instructions!

jsiegle commented 1 month ago

Is there a particular step in the pipeline that's affected, or are they all running slow?

If you're looking to speed up processing times, I'd recommend migrating your spike sorting to use SpikeInterface, which is being actively developed and has many more features than ecephys_spike_sorting.

zmingmin commented 1 month ago

Thank you! The slowdown occurs during the final clustering step in Kilosort. I’ve come across some Kilosort discussions about this. Thanks a lot for the suggestion to try SpikeInterface; I will definitely try it.

jsiegle commented 1 month ago

Both ecephys_spike_sorting and SpikeInterface just provide wrappers around Kilosort, so there's not much that can be done within these packages to speed up that step.

In any case, I think you'll find SpikeInterface to be a better option for building sorting pipelines, both in terms of flexibility and performance.
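
For readers who land here because of the issue title, the following is a minimal sketch of what memory mapping a large recording can look like in Python with NumPy. It does not change how Kilosort's clustering step behaves, since that step runs inside Kilosort itself as noted above. The function name `chunked_channel_means`, the flat int16 file layout, and the chunk size are all assumptions made for illustration and are not part of ecephys_spike_sorting or SpikeInterface.

```python
import numpy as np


def chunked_channel_means(path, n_channels, chunk_samples=30000, dtype=np.int16):
    """Per-channel mean of a flat binary recording, read chunk by chunk.

    Hypothetical helper: assumes samples are stored row by row as
    (n_samples, n_channels) with no header.
    """
    # Map the file into virtual memory; nothing is read until it is sliced.
    flat = np.memmap(path, dtype=dtype, mode="r")
    n_samples = flat.size // n_channels
    data = flat[: n_samples * n_channels].reshape(n_samples, n_channels)

    totals = np.zeros(n_channels, dtype=np.float64)
    for start in range(0, n_samples, chunk_samples):
        # Only this slice is copied into RAM, so peak memory stays bounded
        # by the chunk size rather than the file size.
        chunk = np.asarray(data[start:start + chunk_samples], dtype=np.float64)
        totals += chunk.sum(axis=0)
    return totals / n_samples
```

With this pattern the operating system pages data in on demand, so a file much larger than available RAM can still be scanned; throughput is then usually limited by disk speed rather than by memory allocation.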