NVIDIA / cccl

CUDA Core Compute Libraries
https://nvidia.github.io/cccl/

[FEA]: Add more benchmarks for `thrust::transform` #2814

Open bernhardmgruber opened 1 week ago

bernhardmgruber commented 1 week ago

We currently use the BabelStream kernels as memory-bound workloads and a Fibonacci kernel (per thread: read a random index in [0, 42], compute the corresponding Fibonacci number, store the result). The latter kernel is compute-bound and shows high thread divergence.
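For readers unfamiliar with the Fibonacci workload, here is a rough host-only sketch of its shape. It is not the actual benchmark source: it uses `std::transform` as a stand-in for `thrust::transform`, and the names `fib`, `make_indices`, and `run_fibonacci_transform` are made up for illustration. The iterative loop is also an assumption; the real kernel may compute the number differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical per-element operation: iterative Fibonacci.
// The loop trip count depends on n, so on a GPU neighbouring threads
// with different inputs would take different numbers of iterations
// (thread divergence).
inline std::uint32_t fib(std::uint32_t n)
{
  assert(n <= 42); // F(42) = 267914296 still fits in 32 bits
  std::uint32_t a = 0, b = 1;
  for (std::uint32_t i = 0; i < n; ++i)
  {
    const std::uint32_t next = a + b;
    a = b;
    b = next;
  }
  return a;
}

// Hypothetical input generator: uniformly random indices in [0, 42].
inline std::vector<std::uint32_t> make_indices(std::size_t count, std::uint32_t seed)
{
  std::mt19937 gen(seed);
  std::uniform_int_distribution<std::uint32_t> dist(0, 42);
  std::vector<std::uint32_t> indices(count);
  for (auto& i : indices)
  {
    i = dist(gen);
  }
  return indices;
}

// Read index, compute Fibonacci number, store result.
// Host stand-in: std::transform instead of thrust::transform.
inline std::vector<std::uint32_t> run_fibonacci_transform(const std::vector<std::uint32_t>& in)
{
  std::vector<std::uint32_t> out(in.size());
  std::transform(in.begin(), in.end(), out.begin(), fib);
  return out;
}
```

A device version would presumably keep the same structure, with the data in device memory, the operation marked as device-callable, and `thrust::transform` in place of `std::transform`.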

In order to better assess regressions, we should add more benchmarks covering:

bernhardmgruber commented 2 days ago

I think we should actually make them CUB benchmarks, since they should be included in continuous CUB benchmarking and tuning.

bernhardmgruber commented 2 days ago

We can probably postpone this, because `thrust::transform` seems to be performing well.