kiyo-masui / bitshuffle

Filter for improving compression of typed binary data.

Expose the `out` parameter to avoid malloc of output array #147

Closed kif closed 1 month ago

kif commented 1 year ago

This patch lets the user provide an output buffer to every function exposed at the Python level. This avoids allocating a new array on each call and lets temporary buffers be recycled explicitly.
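The pattern can be sketched in plain Python as follows. This is only an illustration of the optional `out` convention, not bitshuffle's code: `byte_shuffle` is a hypothetical stand-in (a simple byte transpose), and its signature is an assumption, not the API added by this patch.

```python
def byte_shuffle(data, itemsize, out=None):
    """Hypothetical stand-in transform (a byte transpose, NOT bitshuffle).

    If `out` is given, the result is written into it and no new buffer
    is allocated; otherwise a fresh bytearray is created on every call.
    """
    n = len(data)
    if n % itemsize:
        raise ValueError("len(data) must be a multiple of itemsize")
    if out is None:
        out = bytearray(n)  # the allocation the caller may want to avoid
    elif len(out) != n:
        raise ValueError("out must have the same length as data")
    count = n // itemsize
    for k in range(itemsize):
        for i in range(count):
            # k-th byte of element i goes into the k-th contiguous run
            out[k * count + i] = data[i * itemsize + k]
    return out


# One buffer allocated up front and recycled for every frame.
frames = [bytes(range(8)), bytes(range(8, 16))]
buf = bytearray(8)
for frame in frames:
    result = byte_shuffle(frame, itemsize=4, out=buf)
    assert result is buf  # same object: nothing new was allocated
```

Returning `out` keeps both calling styles usable: code that ignores the parameter still gets a result back, which is what makes the change backward compatible.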

Although the Python memory allocator is supposed to recycle memory behind the programmer's back, it sometimes fails to, for example under `timeit`, where the garbage collector is disabled. In such situations the code can become terribly slow and leak huge amounts of memory.
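For reference, the `timeit` behaviour mentioned here can be checked with the standard library alone. The sketch below only shows the state of the cyclic garbage collector during timing; it does not measure memory use, and `probe` and `seen` are names made up for the illustration.

```python
import gc
import timeit

seen = []


def probe():
    # Record whether the cyclic garbage collector is active right now.
    seen.append(gc.isenabled())


# By default, timeit switches the collector off while timing.
timeit.timeit(probe, number=1)

# The setup statement can switch it back on for the timed code.
timeit.timeit(probe, setup="import gc; gc.enable()", number=1)

print(seen)
```

The first recorded value should be `False` and the second `True`, so benchmarks that want to include collector behaviour need the explicit `gc.enable()` in `setup`.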

kiyo-masui commented 1 month ago

Hello - CI was broken when you opened this pull request. It's fixed now, can you merge in the latest from master to trigger the CI again?

kiyo-masui commented 1 month ago

I don't think you should have removed the `cdef np.ndarray out` statements, since those should be required for the threaded sections. Have you done a detailed performance comparison in a multithreaded context?

Tests pass. The API does change (in a backward-compatible way), so we would want to bump the minor version.

kif commented 1 month ago

The `cdef np.ndarray out` is no longer needed and would clash with the declaration in the function signature, i.e. `out` would be declared twice, which prevents the code from compiling.

kif commented 1 month ago

Feel free to bump the version number, you are the maintainer.

kiyo-masui commented 1 month ago

> The `cdef np.ndarray out` is no longer needed and would clash with the declaration in the function signature, i.e. `out` would be declared twice, which prevents the code from compiling.

Oh I see. That makes sense.

Can you add a unit test for this functionality?

kif commented 1 month ago

Regarding performance, it allows us to provide a pinned-memory region for fast transfer to the GPU. I also tested the patch on this notebook: http://www.silx.org/doc/pyFAI/dev/usage/tutorial/Parallelization/Direct_chunk_read.html