alpaka-group / alpaka

Abstraction Library for Parallel Kernel Acceleration :llama:
https://alpaka.readthedocs.io
Mozilla Public License 2.0

dynamic shared memory trait #993

Open psychocoderHPC opened 4 years ago

psychocoderHPC commented 4 years ago

The trait BlockSharedMemDynSizeBytes is designed with a local view in mind. The number of threads and the number of elements per thread are passed to the trait (ref). If you want to implement a global prefix sum, it could be helpful to know how many blocks are involved. The example does not compile (syntax issues), but it shows a general way to implement it.

IMO there is no reason to limit the trait's knowledge to information about a single block, even if shared memory cannot be accessed by other blocks.

My suggestion: pass the workDiv to the trait instead of the thread extents and number of elements per thread.
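A minimal sketch of what this could look like, not alpaka's actual API: `WorkDiv1D`, `PrefixSumKernel`, `NoSharedMemKernel`, and the `get` signature are hypothetical stand-ins chosen for illustration. The point is only that once the trait receives the whole work division, it can also use the number of blocks when computing the size.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 1D work division (assumption, not alpaka's real type).
struct WorkDiv1D
{
    std::size_t gridBlockExtent;   // number of blocks in the grid
    std::size_t blockThreadExtent; // number of threads per block
    std::size_t threadElemExtent;  // number of elements per thread
};

// Primary trait: by default a kernel requests no dynamic shared memory.
template<typename TKernel>
struct BlockSharedMemDynSizeBytes
{
    static constexpr std::size_t get(TKernel const&, WorkDiv1D const&)
    {
        return 0;
    }
};

// Hypothetical kernels used only to exercise the trait.
struct NoSharedMemKernel {};
struct PrefixSumKernel {};

// Specialization that uses the full work division, including the block count.
template<>
struct BlockSharedMemDynSizeBytes<PrefixSumKernel>
{
    static constexpr std::size_t get(PrefixSumKernel const&, WorkDiv1D const& workDiv)
    {
        // One slot per element processed by the block, plus one slot per
        // block in the grid (e.g. to stage per-block partial sums).
        return (workDiv.blockThreadExtent * workDiv.threadElemExtent
                + workDiv.gridBlockExtent)
            * sizeof(int);
    }
};

// Convenience wrapper that dispatches to the trait.
template<typename TKernel>
constexpr std::size_t getBlockSharedMemDynSizeBytes(TKernel const& kernel, WorkDiv1D const& workDiv)
{
    return BlockSharedMemDynSizeBytes<TKernel>::get(kernel, workDiv);
}

// Small self-check: 8 blocks, 64 threads per block, 4 elements per thread.
inline void checkExample()
{
    WorkDiv1D const workDiv{8, 64, 4};
    assert(getBlockSharedMemDynSizeBytes(PrefixSumKernel{}, workDiv) == (64 * 4 + 8) * sizeof(int));
    assert(getBlockSharedMemDynSizeBytes(NoSharedMemKernel{}, workDiv) == 0);
}
```

With the current interface (thread extent and elements per thread only), the `gridBlockExtent` term in the specialization could not be computed inside the trait.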

sbastrakov commented 4 years ago

I like passing the workDiv.