Open jni opened 5 years ago
Thanks for the pointers! I remember that I already implemented something along these lines some time ago (i.e. trivially porting the NVIDIA samples to opencl). Reading the issue you linked, the tricky part might be to incorporate Kahan summation into the parallel scans. Without that, it should be rather straight-forward...
Ok, integral images are now part of the develop branch :) Only works for 2D/3D images though (which might not change).
import numpy as np
from gputools.transforms import integral_image
integral_image(np.array([[4, 8, 0], [9, 8, 8]], np.int32))
array([[ 4, 12, 12],
[13, 29, 37]])
For large images/volumes, I see roughly a 7-10x speedup with device-host memory transfer included, and 30-40x without. So it was well worth it! :)
Wow, speed! :rocket:
I had a look at the commits. Definitely gonna need a crash course on OpenCL kernels from you when you're here. =D I noticed a lot of isinstance
to dispatch appropriately. Possibly these things are fixable using the __array_function__
protocol, but I'm not sure at the moment.
Anyway, this is awesome! Now we have three things to dispatch in skimage. Stretch goal: somehow thread the GPU arrays through skimage functions so that we only pay the transfer cost at the start and end of a pipeline.
@maweigert how often do you merge develop back into master? We just have a master branch in scikit-image, which works pretty well for us.
I just did! :)
Cool! And :wave:! You owe me an email, ;) but in the meantime, just to let you know, we are going to have an undergrad work on the skimage-gputools interface, as well as maybe bringing over some kernels from CLIJ. Should we work on PRs here or should we work on my fork for a while? I get the sense that you don't have a lot of bandwidth for this lib these days...
We have some issues with our naive (non-parallel) integral image implementation in scikit-image. At the same time, we have many functions (template matching, thresholding) that depend on it, so it would be awesome to have a GPUtools implementation. (Ideally before March. ;) Here are some references for the implementation:
https://dspace.mit.edu/openaccess-disseminate/1721.1/71883
(note though that they use exclusive integral images while we use inclusive ones in skimage):
It might also turn out to be way more complicated in nD...
Other relevant paper:
https://ieeexplore.ieee.org/document/8249556