Open fwyzard opened 1 year ago
My gut feeling is that we should mirror / mimic the atomic operations defined by the C++ standard. I guess that would require an implementation of `std::atomic_ref` at some point, though.
> My gut feeling is that we should mirror / mimic the atomic operations defined by the C++ standard.
Uhhmm, ..., let's do a smaller subset :)
Given @fwyzard's table, I think almost all of the listed columns and rows should be supported. The only weird one IMO is the inc/dec with range. But since we can emulate any atomic operation with a CAS, it's basically just a matter of development effort.
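To illustrate the CAS fallback, here is a rough sketch (not alpaka code, just the general pattern on CUDA) of an atomic max for `double`, which CUDA does not provide natively, built on the native 64-bit `atomicCAS`; the `atomicMaxDouble` name is made up for the example:

```cuda
// Sketch of the "emulate any atomic with a CAS" pattern: an atomic max
// for double, implemented with a CAS loop over the 64-bit bit pattern.
__device__ double atomicMaxDouble(double* address, double val) {
  auto* addr = reinterpret_cast<unsigned long long*>(address);
  unsigned long long old = *addr;
  unsigned long long assumed;
  do {
    assumed = old;
    if (__longlong_as_double(assumed) >= val)
      break;  // another thread already stored a value that is large enough
    // try to replace the old bit pattern; retry if another thread changed it
    old = atomicCAS(addr, assumed, __double_as_longlong(val));
  } while (old != assumed);
  return __longlong_as_double(old);  // value observed before the update
}
```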
> Uhhmm, ..., let's do a smaller subset :)
The `[u]int8_t` and `[u]int16_t` would be annoying to implement, because none of the GPU runtimes (CUDA, HIP, SYCL) has 8-bit or 16-bit atomics, so even the CAS loop would need some extra bit masking.
But in turn that means that there shouldn't be any GPU code that relies on them, so it should be safe enough to leave them out.
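For illustration, this is roughly what the extra bit masking would look like: a sketch of a 16-bit atomic add emulated on top of CUDA's 32-bit `atomicCAS` (the `atomicAdd16` helper is hypothetical, not an existing API):

```cuda
#include <cstdint>

// Hypothetical 16-bit atomic add emulated with the native 32-bit atomicCAS:
// locate the 32-bit word containing the 16-bit value, update only the
// relevant half, and retry until no other thread has touched the word.
__device__ std::uint16_t atomicAdd16(std::uint16_t* address, std::uint16_t val) {
  // align down to the containing 32-bit word
  auto* base = reinterpret_cast<unsigned int*>(
      reinterpret_cast<std::uintptr_t>(address) & ~std::uintptr_t{3});
  // position of the 16-bit half within the 32-bit word: 0 or 16 bits
  unsigned int shift = (reinterpret_cast<std::uintptr_t>(address) & 2u) * 8u;
  unsigned int old = *base;
  unsigned int assumed;
  do {
    assumed = old;
    std::uint16_t current = (assumed >> shift) & 0xffffu;
    std::uint16_t updated = current + val;
    unsigned int replacement =
        (assumed & ~(0xffffu << shift)) |
        (static_cast<unsigned int>(updated) << shift);
    old = atomicCAS(base, assumed, replacement);
  } while (old != assumed);
  return (old >> shift) & 0xffffu;  // previous 16-bit value
}
```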
> The only weird one IMO is the inc/dec with range.
Those are native to CUDA and HIP, so I think we should keep them, at least for `uint32_t`, which is the only type supported by CUDA and HIP.
We have some use for them, though most of the time we set the range to `0xffffffff`. But they can be implemented more efficiently than `atomicAdd(ptr, 1)`, so it's good to keep them.
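For reference, the CUDA semantics being discussed: `atomicInc(ptr, limit)` wraps the counter back to 0 once it would exceed the limit, and passing `0xffffffff` effectively turns it into a plain increment. A minimal usage sketch (the kernel name and usage are invented for the example):

```cuda
// atomicInc performs, atomically:
//   old = *counter; *counter = (old >= limit) ? 0 : old + 1; return old;
// With limit = 0xffffffff it behaves like a plain wrapping increment.
__global__ void claimSlots(unsigned int* counter, unsigned int limit) {
  unsigned int slot = atomicInc(counter, limit);
  // ... use 'slot' as an index into a ring buffer of size limit + 1 ...
}
```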
IMO, as @fwyzard said, we should make sure that the 32-bit and 64-bit methods are available, because the others are not supported on HIP/CUDA. Increment and decrement with range are nice to have, but on our side we have never had a use case for these functions.
While reviewing the implementation of the atomic operations in SYCL, I started comparing what operations are available on CUDA, HIP and SYCL:
What operations should be supported by Alpaka?