ChipsSpectre closed this issue 4 years ago
Hey, thanks for trying RustaCUDA.
I don't think there is such a synchronization issue. RustaCUDA isn't really doing anything inside the launch macro that would require extra synchronization, just launching the kernel with cuLaunchKernel. You can check this for yourself if you'd like; the code for the macro is in function.rs, which calls the launch function in stream.rs. I suppose it's possible there's a bug in CUDA, but I feel like a bug in such basic functionality is unlikely to have gone unnoticed.
It's more likely that there's a bug in your kernel code which happens to be hidden by changing the grid and block dimensions.
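One way to test that theory without a GPU is to replay the launch geometry on the CPU and count how often each pixel gets written. The sketch below is only an illustration: the kernel and launch parameters from the original post are not shown here, so the 16x16 block size, the bounds check, and the helper names (blocks_for, simulate_copy, uncovered_pixels) are all assumptions, not RustaCUDA API.

```rust
/// Number of blocks needed to cover `size` elements with `block` threads per
/// block (ceiling division). Hypothetical helper, not part of RustaCUDA.
fn blocks_for(size: u32, block: u32) -> u32 {
    (size + block - 1) / block
}

/// CPU replay of a 2D copy-kernel launch. Instead of copying pixels, it counts
/// how many times each pixel of a `width` x `height` image would be written.
/// `block` and `grid` are (x, y) dimensions, as in a CUDA launch.
fn simulate_copy(width: u32, height: u32, block: (u32, u32), grid: (u32, u32)) -> Vec<u32> {
    let mut writes = vec![0u32; (width * height) as usize];
    for block_y in 0..grid.1 {
        for block_x in 0..grid.0 {
            for thread_y in 0..block.1 {
                for thread_x in 0..block.0 {
                    // Same index arithmetic a kernel would do with
                    // blockIdx * blockDim + threadIdx in each dimension.
                    let x = block_x * block.0 + thread_x;
                    let y = block_y * block.1 + thread_y;
                    // Bounds check: threads outside the image must do nothing.
                    if x >= width || y >= height {
                        continue;
                    }
                    writes[(y * width + x) as usize] += 1;
                }
            }
        }
    }
    writes
}

/// Pixels that no simulated thread wrote to.
fn uncovered_pixels(writes: &[u32]) -> usize {
    writes.iter().filter(|&&n| n == 0).count()
}
```

For a 640x480 image with the assumed 16x16 blocks, a grid of (blocks_for(640, 16), blocks_for(480, 16)) should leave uncovered_pixels at zero with every pixel written exactly once. A grid that is too small in y leaves the bottom rows unwritten, which shows up as a non-zero count.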
Hey,
I implemented a simple kernel (it just copies each pixel of an image) and I see issues in the lower part of the image:
The black stripes at the bottom of the image are different on each call, but always get larger from top to bottom. Therefore I assume that
stream.synchronize()?;
has an issue for multi-dimensional kernel launches like this (note: the image size is 640x480 pixels):
How can the synchronisation issue be solved? Should I restrict my kernels to one-dimensional block and thread dimensions?
eliminates the issue.