bheisler / RustaCUDA

Rusty wrapper for the CUDA Driver API
Apache License 2.0

launching without launch! macro #24

Closed zeroexcuses closed 5 years ago

zeroexcuses commented 5 years ago

I have a weird bug (no minimal failure case yet) and I'm trying to de-magicfy each step.

Is there a full example somewhere of launching a kernel WITHOUT using the launch! macro?

bheisler commented 5 years ago

Thanks for your patience.

There's no example of this, no. Using the launch! macro is the only supported way to launch kernels in RustaCUDA.

If you want to use the undocumented method, you can use stream.launch like so:

stream.launch(&function, grid, block, shared,
    &[
      &arg1 as *const _ as *mut ::std::ffi::c_void,
      &arg2 as *const _ as *mut ::std::ffi::c_void,
      &arg3 as *const _ as *mut ::std::ffi::c_void
    ]
)

Keep in mind that this function may change without warning, so this is fine for debugging but I wouldn't recommend using it in production. Another option might be to use something like cargo-expand to expand the macro, then you don't have to write it yourself.
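For anyone puzzled by the casts in the snippet above, here is a minimal, std-only sketch of what that argument array is: a slice of untyped pointers, one per kernel argument, each pointing at a local value that has to stay alive until the launch call has consumed it. The `read_args` function is a hypothetical stand-in for the receiving side, not anything from RustaCUDA, and the argument types (`u64`, `f32`, `i32`) are made up for illustration.

```rust
use std::ffi::c_void;

/// Hypothetical stand-in for the receiving side of a launch call:
/// it reads each untyped pointer back as the type the "kernel" expects.
/// Safety: every pointer must point at a live value of the matching type.
unsafe fn read_args(args: &[*mut c_void]) -> (u64, f32, i32) {
    unsafe {
        let a = *(args[0] as *const u64);
        let b = *(args[1] as *const f32);
        let c = *(args[2] as *const i32);
        (a, b, c)
    }
}

/// Packs three arguments the same way the snippet above does,
/// then reads them back through the untyped pointers.
fn pack_and_read(arg1: u64, arg2: f32, arg3: i32) -> (u64, f32, i32) {
    // Each entry is the address of a local variable, with its type erased.
    let args: [*mut c_void; 3] = [
        &arg1 as *const _ as *mut c_void,
        &arg2 as *const _ as *mut c_void,
        &arg3 as *const _ as *mut c_void,
    ];
    // arg1..arg3 are still in scope here, so the pointers are valid.
    unsafe { read_args(&args) }
}

fn main() {
    let out = pack_and_read(7, 1.5, -3);
    println!("{:?}", out);
}
```

The pointers carry no type information, so nothing checks that the types and order match what the kernel declares. This is presumably one of the reasons the macro is the supported path.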

zeroexcuses commented 5 years ago

@bheisler

  1. Thanks for the detailed example.

  2. I figured out my problem. The problem was not launch! The problem was that I was not calling stream.synchronize()?. It turns out the Driver API, when using a non-default stream, is ASYNC :-)

This was quite fun to debug: a kernel that "worked" when the input size was small and "failed" when the number of input elements crossed a certain size.
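To illustrate the pitfall for later readers, here is a rough, std-only sketch. `MockStream` is a hypothetical toy, not RustaCUDA's `Stream`: its `launch` only queues the work and `synchronize` runs it. A real GPU starts the kernel in the background rather than deferring it, so an early read may see complete, partial, or stale results depending on timing. The mock defers everything so that the behaviour is deterministic.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical toy stream: `launch` returns immediately after queueing,
/// and the queued work only runs when `synchronize` is called.
struct MockStream {
    pending: Vec<Box<dyn FnOnce()>>,
}

impl MockStream {
    fn new() -> Self {
        MockStream { pending: Vec::new() }
    }

    /// Enqueue work and return without running it.
    fn launch(&mut self, work: impl FnOnce() + 'static) {
        self.pending.push(Box::new(work));
    }

    /// Run everything queued so far, in order.
    fn synchronize(&mut self) -> Result<(), String> {
        for work in self.pending.drain(..) {
            work();
        }
        Ok(())
    }
}

/// Returns the output buffer as seen before and after synchronizing.
fn demo() -> (Vec<i32>, Vec<i32>) {
    let out = Rc::new(RefCell::new(vec![0; 4]));
    let mut stream = MockStream::new();

    let out_for_kernel = Rc::clone(&out);
    stream.launch(move || {
        // The "kernel": write 2 * index into each slot.
        for (i, x) in out_for_kernel.borrow_mut().iter_mut().enumerate() {
            *x = (i as i32) * 2;
        }
    });

    let before = out.borrow().to_vec(); // read without synchronizing
    stream.synchronize().unwrap();
    let after = out.borrow().to_vec(); // read after synchronizing
    (before, after)
}

fn main() {
    let (before, after) = demo();
    println!("before synchronize: {:?}", before);
    println!("after synchronize:  {:?}", after);
}
```

In the mock, the read before `synchronize` still sees the initial zeros and the read after sees the written values. With a real stream, a small input may simply finish before the host reads it back, which would explain why the missing call only showed up at larger sizes.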

Anyway, everything works now. I've successfully switched from Rust-Accel to rustacuda. Thanks for all your hard work!