Closed: florian6973 closed this issue 1 year ago
Which caching library?
Yes, Numpy extractors are not meant to be pickled because they hold all of their data in memory. I think we could modify them fairly easily to make this possible, but I am not sure it is a good idea: since all the contents are in memory, pickling will be slow and will probably encourage sub-optimal use of the library. Let's see what @alejoe91 and @samuelgarcia think about it.
Meanwhile, as far as I understand, the workaround is something like this:
```python
import spikeinterface as si

# save() writes the sorting to a cache folder; the returned copy is serializable
sorting = si.NumpySorting.from_times_labels([350, 1500, 2500], [0, 1, 0], 20000)
serializable_sorting = sorting.save()
```
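Once saved, the folder-backed copy no longer carries the spike data in RAM, so serializing it should be cheap. A quick sanity check (a sketch, assuming the caching layer ultimately uses pickle) would be:

```python
import pickle

# The saved copy only references data on disk, so pickling it is lightweight
cached = pickle.loads(pickle.dumps(serializable_sorting))
print(cached.get_unit_ids())
```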
Thanks for the reply and these explanations! The workaround works perfectly, thank you very much!
The caching library is flask-caching; for small recordings I was thinking it may be easier to perform operations in memory.
Are you doing something with recordings in a web app? :O Feel free to share; it is useful for us to know about the different use cases that people have for the library.
Yes indeed, in the same spirit as sortingview :) Unfortunately I cannot go into the details, but if I can contribute somehow to spikeinterface in the future I would be glad ;)
@alejoe91 @samuelgarcia
To close this issue we need to make a decision: do we want NumpySorting to be picklable? It is trivial to do (we just need to add the _kwargs), and after the improvements in https://github.com/SpikeInterface/spikeinterface/pull/1674 I feel more confident that it will not be horrible performance-wise.
If we decide against it, I will add a more informative error.
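For reference, the kwargs-based idea can be illustrated with a generic, hypothetical sketch (this is the pattern, not SpikeInterface's actual implementation): an object that keeps its constructor kwargs can be rebuilt from them when unpickled.

```python
import pickle
import numpy as np

class KwargsPicklableSorting:
    """Hypothetical sketch: keep the constructor kwargs so the object
    can be rebuilt from them when unpickled."""

    def __init__(self, spike_frames, labels, sampling_frequency):
        self._kwargs = dict(spike_frames=spike_frames, labels=labels,
                            sampling_frequency=sampling_frequency)
        self.spike_frames = np.asarray(spike_frames)
        self.labels = np.asarray(labels)
        self.sampling_frequency = sampling_frequency

    def __reduce__(self):
        # Rebuild from the stored kwargs on unpickling; for an in-memory sorting
        # the kwargs contain the spike arrays, so the full payload travels with them.
        return (self.__class__, tuple(self._kwargs.values()))

roundtrip = pickle.loads(pickle.dumps(
    KwargsPicklableSorting([350, 1500, 2500], [0, 1, 0], 20000.0)))
print(roundtrip.sampling_frequency)
```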
@h-mayorquin I think that NumpySorting can be dumpable, but not NumpyRecording (because traces are always going to be large). What do you think?
Yeah, part of me wants the symmetry of both Numpy extractors behaving the same, but if we can get this hurdle out of our users' way it seems like a good thing. Plus, other SortingExtractors are non-lazy as well (Phy and MEArec, for example).
So, OK, I will do two things:
1. NumpySorting should be picklable.
2. NumpyRecording should throw an informative error if someone calls pickle.dump on it (roughly as sketched below).
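A minimal sketch of what that informative error could look like, using a hypothetical stand-in class rather than SpikeInterface's actual NumpyRecording:

```python
import pickle

class InMemoryRecording:
    """Hypothetical sketch of point 2: refuse pickling with a clear message."""

    def __init__(self, traces):
        self.traces = traces

    def __reduce__(self):
        # pickle.dumps() calls __reduce__, so raising here produces a clear failure
        raise TypeError(
            "This recording holds its traces in memory and cannot be pickled; "
            "save it to a folder first and pickle the saved copy instead."
        )

try:
    pickle.dumps(InMemoryRecording(traces=[[0.0, 0.1], [0.2, 0.3]]))
except TypeError as err:
    print(err)
```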
Hi Ramon, about NumpySorting: you can make a small patch to change the flag, but do not make a deep refactor for this, because I will soon make deeper changes for a memory-shareable NumpySorting.
And maybe we could also make NumpyRecording dumpable, with a warning when the size is too big.
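A rough sketch of that idea (hypothetical helper and threshold, not an actual SpikeInterface API):

```python
import pickle
import warnings
import numpy as np

# Hypothetical size guard: still allow pickling, but warn above a threshold
SIZE_WARNING_BYTES = 100 * 1024 * 1024  # arbitrary example threshold (100 MiB)

def pickle_traces_with_warning(traces: np.ndarray) -> bytes:
    if traces.nbytes > SIZE_WARNING_BYTES:
        warnings.warn(f"Pickling {traces.nbytes / 1e6:.0f} MB of in-memory traces; "
                      "consider saving to disk instead.")
    return pickle.dumps(traces)

payload = pickle_traces_with_warning(np.zeros((30000, 4), dtype="float32"))
```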
This should be solved now that we have separated the flags for json serialization and pickling:
https://github.com/SpikeInterface/spikeinterface/pull/1775
@florian6973 let us know if a similar problem arises.
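For anyone landing here later: if your installed version includes that PR, directly pickling a NumpySorting should now work. A quick check (a sketch, reusing the same example values as above) would be:

```python
import pickle
import spikeinterface as si  # assumes a release that includes PR #1775

sorting = si.NumpySorting.from_times_labels([350, 1500, 2500], [0, 1, 0], 20000)
restored = pickle.loads(pickle.dumps(sorting))
print(restored.get_unit_ids())
```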
Hi!
I am using a caching library which saves the data by serializing it using pickle. Unfortunately, it crashes when I use a NumpySorting instance, with the error message:
Here is a minimal working example which reproduces the error:
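(Reconstructed sketch; the exact snippet is not shown here, but creating a NumpySorting and pickling it directly reproduces the failure.)

```python
import pickle
import spikeinterface as si

sorting = si.NumpySorting.from_times_labels([350, 1500, 2500], [0, 1, 0], 20000)
pickle.dumps(sorting)  # fails because the in-memory sorting is flagged as not dumpable
```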
I am using Python 3.9 and spikeinterface 0.97.1.
Edit: The issue might be related to this line: https://github.com/SpikeInterface/spikeinterface/blob/main/src/spikeinterface/core/base.py#L568
Is there a workaround in the meantime to be able to cache a NumpySorting object? In fact, there is a field `is_dumpable=False`, so I assume I would not have this issue with an NpzSortingExtractor, since its data are not in memory? Thanks!