hpcgarage / spatter

Benchmark for measuring the performance of sparse and irregular memory access.

Remove unnecessary bounds checks in cuda kernels #180

Closed JDTruj2018 closed 8 months ago

JDTruj2018 commented 8 months ago

The CUDA kernels contain unnecessary checks on (j < pattern_length). Since j is calculated from j = total_id % pattern_length, and total_id is non-negative, j is always less than pattern_length.

The only check necessary in all CUDA kernels is if (i < count), since i = total_id / pattern_length can be greater than or equal to count when more threads are launched than there are elements to process.
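
The reasoning above can be checked on the host with a small sketch. It is not Spatter code: check_indices and IndexCheck are hypothetical names, and the sketch assumes a 1-D launch with unsigned indices, so each total_id in [0, total_threads) stands for one CUDA thread.

```cpp
#include <cstddef>

// Outcome of emulating every thread of a hypothetical 1-D launch on the host.
struct IndexCheck {
    std::size_t j_out_of_range;  // threads for which j >= pattern_length
    std::size_t i_out_of_range;  // threads for which i >= count
};

// Applies the index math from the issue to each total_id in [0, total_threads).
// Assumption: pattern_length is non-zero and all indices are unsigned.
IndexCheck check_indices(std::size_t total_threads,
                         std::size_t pattern_length,
                         std::size_t count) {
    IndexCheck result{0, 0};
    for (std::size_t total_id = 0; total_id < total_threads; ++total_id) {
        std::size_t i = total_id / pattern_length;  // which gather/scatter
        std::size_t j = total_id % pattern_length;  // position in the pattern

        // A remainder is always smaller than its divisor, so this never fires.
        if (j >= pattern_length) ++result.j_out_of_range;

        // Threads past count * pattern_length land here, so this check stays.
        if (i >= count) ++result.i_out_of_range;
    }
    return result;
}
```

For example, with pattern_length = 8 and count = 30, only 240 threads do useful work. If the launch is rounded up to 256 threads, check_indices(256, 8, 30) reports 16 threads with i >= count (i is 30 or 31) and none with j >= pattern_length, which matches the claim that only the (i < count) guard is needed.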