Open FloopCZ opened 1 year ago
Where is state coming from? Was that meant to be input? Just making sure.
// Evaluate input matrix.
input.eval();
double* pinput = state.device<double>();
If state is correct, then state should be unlocked instead of input.
Sorry, that is a typo from copying the example: state is the input (it is called state in the original program). Yes, the function unlocks the input, and it does not use any external resources. It is a standalone function that uses only its parameters and returns the output.
Hi @syurkevi, I believe I found the source of the issue. I noticed that the issue only arises when there are multiple GPUs in the system. On single-GPU systems it works as expected, which suggested a problem with the device id or the stream id. Peeking at the getStream(id) function here gave me the impression that it expects the ArrayFire id of the GPU, not the native CUDA id. Indeed, using the ArrayFire id instead of the native id fixes the issue. If you confirm my suspicion, we can fix the documentation of CUDA interoperability.
Yes, confirming that the id is expected to be the internal ArrayFire id. Our example is wrong. The documentation for getStream does detail the expected id type. The interop.md looks like it was incorrectly changed at some point and should be reverted. Ideally this code should be added to our tests and forwarded to the docs through a snippet.
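For anyone landing here later, a minimal sketch of the corrected pattern as I understand it from the discussion above: the stream is looked up with the ArrayFire device id returned by af::getDevice(), not with the native CUDA id. The kernel scale_kernel, the array size, and the final check are hypothetical and only there to make the snippet self-contained; this is not the original program and not the snippet from interop.md.

```cuda
#include <arrayfire.h>
#include <af/cuda.h>
#include <cuda_runtime.h>
#include <cstdio>

// Hypothetical kernel for illustration: doubles every element in place.
__global__ void scale_kernel(double* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0;
}

int main() {
    // Illustrative input; the size and values are arbitrary.
    af::array input = af::constant(1.0, 1024, f64);

    // Evaluate input matrix so that its device memory is up to date.
    input.eval();

    // Use the ArrayFire device id here. Passing the native CUDA id
    // (e.g. the result of afcu::getNativeId) is the mistake discussed above.
    int af_id = af::getDevice();
    cudaStream_t af_cuda_stream = afcu::getStream(af_id);

    // Take the raw device pointer; ArrayFire keeps the buffer locked
    // until unlock() is called.
    double* pinput = input.device<double>();

    int n = static_cast<int>(input.elements());
    int threads = 256;
    int blocks = (n + threads - 1) / threads;

    // Launch on ArrayFire's own stream so the kernel is ordered with
    // ArrayFire's work on this device.
    scale_kernel<<<blocks, threads, 0, af_cuda_stream>>>(pinput, n);

    // Hand memory management of the buffer back to ArrayFire.
    input.unlock();

    // Check the result with ArrayFire operations on the same stream.
    bool ok = af::allTrue<bool>(input == 2.0);
    std::printf("kernel result %s\n", ok ? "matches" : "does not match");
    return ok ? 0 : 1;
}
```

With the stream taken from the ArrayFire id, the later ArrayFire operations are queued on the same stream as the kernel, so the extra cudaStreamSynchronize call should no longer be needed on multi-GPU systems.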
Hi, I have a question regarding custom CUDA kernels and synchronization. I tried to proceed as described in Interoperability with CUDA which states:
My code is more or less as follows:
I am using the same stream as ArrayFire; however, the program produces invalid results unless I manually run cudaStreamSynchronize(af_cuda_stream) after launching the kernel. Am I doing something wrong? Thank you.