Open fwyzard opened 2 years ago
No idea, we have to investigate this first. If it is possible, I see no reason against it.
In general, it should be possible. We could store the size as a member of the accelerator.
The question is whether alpaka should provide such an interface. The user sets the size of the dynamic shared memory before submitting the kernel, so if the kernel needs this value, the user should pass it as an argument. A user who is fine with the additional overhead of passing one more variable to the device can do so, and by default no other kernel or application is affected.
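To illustrate the pass-it-as-an-argument approach, here is a minimal sketch in plain C++. It does not use the alpaka API: `launchWithDynShared` and `FillKernel` are hypothetical stand-ins for a launcher and a user kernel, and the "shared memory" is just a host buffer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical user kernel: the dynamic shared memory size arrives as an
// ordinary argument, so the kernel can bounds-check its own accesses.
struct FillKernel
{
    void operator()(unsigned char* dynShared, std::size_t dynSharedBytes, std::size_t requestedBytes) const
    {
        // Never write past the buffer the launcher actually provided.
        std::size_t const n = requestedBytes < dynSharedBytes ? requestedBytes : dynSharedBytes;
        for(std::size_t i = 0; i < n; ++i)
            dynShared[i] = 0xFF;
    }
};

// Hypothetical stand-in for a launcher: it allocates the "dynamic shared
// memory" on the host and forwards its size to the kernel as an extra argument.
template<typename TKernel, typename... TArgs>
std::vector<unsigned char> launchWithDynShared(std::size_t dynSharedBytes, TKernel const& kernel, TArgs... args)
{
    std::vector<unsigned char> shared(dynSharedBytes);
    kernel(shared.data(), dynSharedBytes, args...);
    return shared;
}
```

Only kernels that opt in pay for the extra argument; every other kernel keeps its current signature.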
I do not see a problem with enabling this function only in debug mode, but we should add a static assert in case someone uses the function outside of debug mode. A clear error message will avoid a lot of debugging pain.
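A rough sketch of what such a guarded function could look like, in plain C++. Everything here is an assumption made for illustration: `MockAcc`, `getDynSharedMemSizeBytes` and the `SKETCH_DEBUG_QUERIES` macro are hypothetical names, not alpaka API.

```cpp
#include <cstddef>

// Hypothetical switch; a real implementation would key off the library's own
// debug setting instead. It defaults to "enabled" so the sketch compiles as is.
#ifndef SKETCH_DEBUG_QUERIES
#    define SKETCH_DEBUG_QUERIES 1
#endif

constexpr bool debugQueriesEnabled = (SKETCH_DEBUG_QUERIES != 0);

// Hypothetical accelerator that remembers the dynamic shared memory size
// chosen at launch time.
struct MockAcc
{
    std::size_t dynSharedBytes;
};

template<typename TAcc>
std::size_t getDynSharedMemSizeBytes(TAcc const& acc)
{
    // The condition depends on TAcc, so the assertion is only evaluated when
    // the function is actually instantiated, i.e. when somebody calls it.
    static_assert(
        sizeof(TAcc) != 0 && debugQueriesEnabled,
        "getDynSharedMemSizeBytes() is only available when debug queries are enabled");
    return acc.dynSharedBytes;
}
```

Building with `-DSKETCH_DEBUG_QUERIES=0` leaves code that never calls the function untouched, while any call site fails at compile time with the message above.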
At least for debugging purposes, it would be useful to be able to query the size of the dynamic shared memory from within the device code.
In CUDA this can be done with some inline PTX (see https://stackoverflow.com/questions/40021086/can-i-obtain-the-amount-of-allocated-dynamic-shared-memory-from-within-a-kernel/40021087#40021087):
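A minimal sketch along the lines of the linked answer: it reads the PTX special register `%dynamic_smem_size`. The function name `dynamicSmemSize` and the host fallback in the `#else` branch are assumptions added for illustration, so that the snippet also compiles without `nvcc`. The fallback only shows that a CPU backend could remember the size it allocated.

```cpp
#include <cstddef>

#if defined(__CUDACC__)
// CUDA path: %dynamic_smem_size is a read-only PTX special register holding
// the number of bytes of dynamic shared memory given to the block at launch.
__device__ inline unsigned int dynamicSmemSize()
{
    unsigned int bytes;
    // "%%" escapes the literal '%' of the register name in inline PTX.
    asm volatile("mov.u32 %0, %%dynamic_smem_size;" : "=r"(bytes));
    return bytes;
}
#else
// Host stand-in (assumption): a CPU backend allocates the buffer itself,
// so it can record the size per thread before running the kernel.
namespace sketch
{
    thread_local unsigned int currentDynSmemBytes = 0;
}

inline unsigned int dynamicSmemSize()
{
    return sketch::currentDynSmemBytes;
}
#endif
```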
With ROCm I have no idea :-)
Is this something that Alpaka could wrap, and also provide for the CPU-based backends?