orzzzjq / Parallel-Banding-Algorithm-plus

Compute the exact Euclidean Distance Transform and Voronoi Diagram for 2D and 3D binary images using the GPU.
https://www.comp.nus.edu.sg/~tants/pba.html
MIT License

Dimensions not multiple of 32 #3

Closed: andmax closed this issue 4 years ago

andmax commented 4 years ago

Is there a way to run your PBA+ algorithm with dimension sizes that are not multiples of 32? In addition, for pba3D, it seems that the X and Y dimensions need to be equal in size. Do you have any hints on how to change your code to support arbitrary dimension sizes?

Thanks, Andre.

orzzzjq commented 4 years ago

My code only runs on square images whose side length is a multiple of 32. The process includes transpose operations, which are there for efficient memory access, and 32 is the size of a CUDA warp. In theory it is possible to modify the code to support 2D images of size $32n \times 32m$, but this requires complicated modifications. pba3D can actually be viewed as many layers of 2D images, so you can try modifying it to run on volumes of size $32n \times 32n \times m$. The functions related to $m$ can be found here: https://github.com/orzzzjq/Parallel-Banding-Algorithm-plus/blob/2f58720302c3478d4367b21268dfbb263b46c4d2/pba-plus-3D/pba/pba3DHost.cu#L94-L101

orzzzjq commented 4 years ago

For non-square images, you can embed the image in a square image of size $32n \times 32n$. I believe the performance won't be worse than running directly on the original size, as long as the shape of the object is not too narrow. Related question: https://github.com/orzzzjq/Parallel-Banding-Algorithm-plus/issues/2#issuecomment-635105467
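The embedding suggested above can be sketched as follows. This is a minimal illustration in plain Python, not code from the repository: `padded_side`, `embed_in_square`, `crop`, and the `EMPTY` marker are hypothetical names, and the real PBA+ input encoding for "no site" pixels may differ. The sketch assumes the padding cells contain no sites, so the nearest site of every original pixel is unchanged and the result can simply be cropped back.

```python
WARP = 32     # CUDA warp size, the alignment mentioned in this thread
EMPTY = None  # hypothetical "no site here" marker; the real encoding may differ


def padded_side(width, height, align=WARP):
    """Smallest multiple of `align` that is >= max(width, height)."""
    longest = max(width, height)
    return ((longest + align - 1) // align) * align


def embed_in_square(image, align=WARP):
    """Copy a height x width grid into the top-left corner of a padded square."""
    height, width = len(image), len(image[0])
    side = padded_side(width, height, align)
    square = [[EMPTY] * side for _ in range(side)]
    for y in range(height):
        for x in range(width):
            square[y][x] = image[y][x]
    return square


def crop(result, width, height):
    """Drop the padding again after the transform has run on the square grid."""
    return [row[:width] for row in result[:height]]
```

For example, a 100 x 60 image would be placed in a 128 x 128 grid, processed at that size, and the top-left 100 x 60 region of the output kept.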

andmax commented 4 years ago

Hi, thanks for your answer and comments. Yes, I was able to run on any square image size by adding several dimension checks in the kernels, but the square restriction remains because of the transpositions. That is OK, thank you.
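The thread does not show the actual changes, so the following is only a sketch of the general pattern that "dimension checks in kernels" usually refers to: launch enough 32-wide blocks to cover the data (rounding up), and have each thread return early if its index falls outside the array. It is a plain Python emulation of a 1D launch, not CUDA, and `grid_dim`, `launch_1d`, and `guarded_increment` are hypothetical names.

```python
BLOCK = 32  # threads per block along one axis, matching the warp size


def grid_dim(size, block=BLOCK):
    """Number of blocks needed to cover `size` elements (ceiling division)."""
    return (size + block - 1) // block


def launch_1d(size, kernel, block=BLOCK):
    """Emulate a 1D launch: every thread of every block calls `kernel`."""
    for block_idx in range(grid_dim(size, block)):
        for thread_idx in range(block):
            kernel(block_idx * block + thread_idx, size)


def guarded_increment(data):
    """Build a kernel that touches data[i] only when i is inside the array."""
    def kernel(i, size):
        if i >= size:  # the dimension check: surplus threads do nothing
            return
        data[i] += 1
    return kernel
```

With 50 elements, two blocks of 32 threads are launched and the last 14 threads exit at the guard, so no out-of-range element is accessed.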