Closed kif closed 1 month ago
Hello - CI was broken when you opened this pull request. It's fixed now, can you merge in the latest from master to trigger the CI again?
I don't think you should have removed the cdef np.ndarray out statements, since those should be required for the threaded sections. Have you done a detailed performance comparison in a multithreaded context?
Tests pass. API does change (in a backward compatible way) so we would want to bump the minor version.
The cdef np.ndarray out statement is no longer needed and would conflict with the declaration in the function signature, i.e. out would be declared twice, which prevents the code from compiling.
Feel free to bump the version number, you are the maintainer.
Oh I see. That makes sense.
Can you add a unit test for this functionality?
Regarding performance, it allows us to provide a pinned-memory region for fast transfer to the GPU. I also tested the patch on this notebook: http://www.silx.org/doc/pyFAI/dev/usage/tutorial/Parallelization/Direct_chunk_read.html
This patch lets the user provide an output buffer to all functions exposed at the Python level. This avoids allocating a new array on every call and allows temporary buffers to be recycled explicitly.
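To illustrate the calling pattern, here is a minimal pure-Python sketch of an optional output buffer. The function invert_bytes and its byte-flipping body are hypothetical stand-ins, not the functions touched by this patch; only the handling of the out argument is the point.

```python
def invert_bytes(data, out=None):
    """Hypothetical stand-in for a function that accepts an optional output buffer."""
    n = len(data)
    if out is None:
        # Default path: allocate a fresh result, as before the patch.
        out = bytearray(n)
    elif len(out) != n:
        raise ValueError("out must have the same length as data")
    for i, value in enumerate(data):
        out[i] = value ^ 0xFF  # placeholder computation
    return out


chunk_a = bytes([0, 1, 2, 3])
chunk_b = bytes([255, 254, 253, 252])

# Without out: every call allocates a new buffer.
fresh = invert_bytes(chunk_a)

# With out: the caller owns one scratch buffer and reuses it across calls.
scratch = bytearray(4)
res_a = invert_bytes(chunk_a, out=scratch)
first = bytes(res_a)  # copy the result before the buffer is overwritten
res_b = invert_bytes(chunk_b, out=scratch)
```

Because the returned object is the caller's buffer, the second call overwrites the first result, so anything that must survive has to be copied out beforehand.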
Although the Python memory allocator is supposed to recycle memory behind the programmer's back, it sometimes fails to do so, for example in timeit mode, where the garbage collector is disabled. In such situations the code can become terribly slow and cause huge memory leaks.
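For reference, the timeit behaviour mentioned above can be checked with the standard library alone. This small sketch only records whether the garbage collector is enabled while the timed statement runs; it does not reproduce the slowdown itself. Passing gc.enable() as the setup statement is the approach given in the timeit documentation for turning collection back on.

```python
import gc
import timeit

gc_states = []

# Default: timeit disables the garbage collector while timing.
timeit.timeit(lambda: gc_states.append(gc.isenabled()), number=1)

# Re-enable it for the timed run through the setup statement.
timeit.timeit(lambda: gc_states.append(gc.isenabled()),
              setup="gc.enable()", number=1)

print(gc_states)
```

The first recorded state corresponds to the default timing mode and the second to the run with collection switched back on.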