DeepLearnPhysics / larcv2

MIT License
13 stars 16 forks source link

Multi-hreading larcv.larcv_threadio.next? #13

Closed drinkingkazu closed 6 years ago

drinkingkazu commented 6 years ago

Would be nice to multi-thread C++=>numpy data copy. Even better, would be nice to be able to attach a custom function for further data massage (maybe different re-shaping of data per application). It can be done either via inheritance or have a capability to dynamically attach function pointer? Maybe the latter is flexible but more dangerous.

drinkingkazu commented 6 years ago

... and I meant for this to be done within larcv_threadio, so @ python level

coreyjadams commented 6 years ago

Ok, I looked into this a bit. PyArray_SimpleNewFromData doesn't copy data:

Sometimes, you want to wrap memory allocated elsewhere into an ndarray object for downstream use. This routine makes it straightforward to do that. The first three arguments are the same as in PyArray_SimpleNew, the final argument is a pointer to a block of contiguous memory that the ndarray should use as it’s data-buffer which will be interpreted in C-style contiguous fashion. A new reference to an ndarray is returned, but the ndarray will not own its data. When this ndarray is deallocated, the pointer will not be freed.

So, PyUtils.cxx line 32 is fine and should not be multithreaded.

However, PyArray_Transpose probably does copy the data. So we need to address: return PyArray_Transpose(((PyArrayObject*)(PyArray_SimpleNewFromData(2, dim_data, NPY_FLOAT, (char *)&(vec[0])))),NULL);

Instead of multithreading this though, it makes much more sense performance wise to allow the BatchFiller for Image 2D to copy the transposed image into the batch memory. Then, the transpose by numpy is unnecessary and there is no need to multi thread this.

Doing it at batch filler has 2 advantages: It's already multithreaded, and we can include code to directly transpose at the time of filling. BatchFillerImage2D already supports mirror images, so it should be not hard to support transposed images as well.

As a fun bonus, this will also remove the need for the transpose in the viewer I think, which will speed that up too. Probably unnoticeable though ...

drinkingkazu commented 6 years ago

This is somewhat done by data_loader3.py where I implemented threaded function. I'll close this ticket in "asking forgiveness" style: if anyone wants more development on this, feel absolutely free to re-open.

drinkingkazu commented 6 years ago

9383d35714f942a3b73b75d3d92ef8f5b06d0feb