SpikeInterface / spikeinterface

A Python-based module for creating flexible and robust spike sorting pipelines.
https://spikeinterface.readthedocs.io
MIT License

memory usage problems in latest dev version using NWB #850

Closed lfrank closed 2 years ago

lfrank commented 2 years ago

In the latest dev version, creating a recording is much slower and uses far more memory than before.

The raw data are in NWB, and extracting ~90 minutes from a single tetrode ends up requiring > 80GB of RAM, even though the total memory was specified as 1G. It also seems to take much longer than it did a month or two ago, as far as we can tell.

Any thoughts would be appreciated.
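
For scale, here is a back-of-the-envelope estimate of how large the raw traces should be. It is only a sketch, and it rests on assumptions that are not stated in the report: a 30 kHz sampling rate, 16-bit integer samples, and "1G" read as 10^9 bytes. The way the memory budget is turned into a chunk length is also an assumed formula, not taken from the library source.

```python
# Rough size estimate for ~90 minutes of single-tetrode data.
# ASSUMPTIONS (not from the report): 30 kHz sampling, int16 samples,
# and a "1G" budget interpreted as 1e9 bytes.
SAMPLING_RATE_HZ = 30_000
BYTES_PER_SAMPLE = 2
NUM_CHANNELS = 4            # one tetrode
DURATION_S = 90 * 60
MEMORY_BUDGET_BYTES = 1_000_000_000
N_JOBS = 1


def raw_size_bytes(duration_s, sampling_rate_hz, num_channels, bytes_per_sample):
    """Total size of the uncompressed traces in bytes."""
    return duration_s * sampling_rate_hz * num_channels * bytes_per_sample


def frames_per_chunk(budget_bytes, num_channels, bytes_per_sample, n_jobs):
    """Frames that fit in the budget when it is split evenly across jobs."""
    return budget_bytes // (num_channels * bytes_per_sample * n_jobs)


total_bytes = raw_size_bytes(DURATION_S, SAMPLING_RATE_HZ, NUM_CHANNELS, BYTES_PER_SAMPLE)
chunk_frames = frames_per_chunk(MEMORY_BUDGET_BYTES, NUM_CHANNELS, BYTES_PER_SAMPLE, N_JOBS)

print(f"raw traces: {total_bytes / 1e9:.2f} GB")
print(f"frames per chunk under the budget: {chunk_frames}")
```

Under these assumptions the whole extract is about 1.3 GB, so a peak above 80 GB is roughly 60 times the size of the data being copied.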

lfrank commented 2 years ago

@bendichter @alejoe91 Is this something either of you would have time to look at? I tried looking through the recording.save function and it wasn't obvious to me where one might, for example, delete old chunks to save RAM and/or make things more efficient.

I should add that we're using only one job here because in the past it didn't seem to help to increase the number of jobs.

alejoe91 commented 2 years ago

Hi @lfrank

Could you share one of the datasets?

lfrank commented 2 years ago

@alejoe91 Thanks for checking in on this. We just discovered that it has to do with the h5py version: 2.10 is fast; 3.7 is slow and uses much more memory. I’ll pass this on to the pynwb people.
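
A comparison like this can be reproduced with a small timing and peak-memory helper, run once per environment. The sketch below uses only the standard library; `profile_call` and `example_workload` are hypothetical names, and the workload is a stand-in for the real save step. It relies on the Unix-only `resource` module, so it will not run on Windows.

```python
import resource
import sys
import time


def profile_call(fn, *args, **kwargs):
    """Run fn once and return (result, elapsed_seconds, peak_rss_bytes)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    if sys.platform != "darwin":
        peak *= 1024
    return result, elapsed, peak


def example_workload():
    # Placeholder for the real call being compared across environments.
    return sum(bytearray(10_000_000))


result, elapsed, peak = profile_call(example_workload)
print(f"{elapsed:.2f} s, peak RSS {peak / 1e6:.0f} MB")
```

Peak RSS is a high-water mark for the whole process lifetime, so each measurement should come from a fresh process in the environment under test.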

alejoe91 commented 2 years ago

Closing then!

bendichter commented 2 years ago

@lfrank Interesting! We'll have to look into that

lfrank commented 2 years ago

As a follow-up, we tracked this down to the h5py version: 2.10 works well, whereas 3.5, 3.6, and 3.7 all use 5-6x the RAM and take 2-3x as long to run recording.save.

For the moment, our plan is to pin our environment to h5py 2.10, but obviously that’s not a good long-term solution.
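
Until there is a proper fix, a pipeline could at least flag the versions reported here. The following is a minimal sketch; `classify_h5py_version` and the two version sets are hypothetical names, and the sets encode only what this thread reports. Versions not mentioned in the thread are labelled untested rather than guessed at.

```python
import re

# Versions as reported in this thread; anything else was not tested here.
KNOWN_GOOD = {(2, 10)}
KNOWN_SLOW = {(3, 5), (3, 6), (3, 7)}


def classify_h5py_version(version):
    """Return 'ok', 'slow', or 'untested' for a version string like '3.7.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        return "untested"
    major_minor = (int(match.group(1)), int(match.group(2)))
    if major_minor in KNOWN_GOOD:
        return "ok"
    if major_minor in KNOWN_SLOW:
        return "slow"
    return "untested"


for candidate in ("2.10.0", "3.7.0", "3.1.0"):
    print(candidate, classify_h5py_version(candidate))
```

In a real environment the version string could come from `importlib.metadata.version("h5py")`, assuming h5py is installed there.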
