NVIDIA / cuda-samples

Samples for CUDA Developers which demonstrates features in CUDA Toolkit

Question on batchSize #302

Open gong-yuan opened 1 month ago

gong-yuan commented 1 month ago

I saw that batchSize appears in multiple places in the repo, e.g.

uint blockCount = batchSize * arrayLength / SHARED_SIZE_LIMIT;

in bitonicSort.cu.

Would it be possible to explain what it means? Given an array of length arrayLength, why would we want to do it batchSize times?

Thank you so much for your consideration.

Best regards, Yuan

ghost commented 4 weeks ago

The expression (batchSize * arrayLength) seems to indicate the total size of the original input, i.e. the number of elements across all batches. So blockCount should be the number of blocks needed to cover that many elements, keeping in mind the size limitation of shared memory per block.

Note that every use of batchSize in the definition of the bitonicSort function appears in an expression of the form (batchSize * arrayLength). If you look at the invocation of bitonicSort, you will see that the argument passed for the batchSize parameter is (N / arrayLength). Also remember that shared memory is allocated per thread block.
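For a concrete picture of the arithmetic, here is a minimal host-side sketch. It is not the sample's code; the values of N and arrayLength are chosen only for illustration, and it assumes the sample's usual SHARED_SIZE_LIMIT of 1024 with SHARED_SIZE_LIMIT / 2 threads per block.

```cpp
// Minimal sketch (illustrative values, not the sample's exact code):
// how batchSize, arrayLength and SHARED_SIZE_LIMIT relate.
#include <cstdio>

#define SHARED_SIZE_LIMIT 1024U  // elements each block can hold in shared memory (assumed)

int main() {
    const unsigned int N = 1048576U;        // total number of keys in the input buffer (assumed)
    const unsigned int arrayLength = 2048U; // length of each independent sub-array to sort (assumed)

    // The host code calls bitonicSort(..., N / arrayLength, arrayLength, dir),
    // so batchSize is the number of independent arrays packed into the buffer.
    const unsigned int batchSize = N / arrayLength;  // 512 batches

    // Each block processes one SHARED_SIZE_LIMIT-sized tile in shared memory,
    // so the grid must cover all batchSize * arrayLength (= N) elements.
    const unsigned int blockCount = batchSize * arrayLength / SHARED_SIZE_LIMIT;  // 1024 blocks
    const unsigned int threadCount = SHARED_SIZE_LIMIT / 2;                       // 512 threads/block

    printf("batchSize=%u blockCount=%u threadCount=%u\n", batchSize, blockCount, threadCount);
    return 0;
}
```

In other words, batchSize does not mean the same array is sorted multiple times; it means the input buffer holds batchSize independent arrays of arrayLength elements each, all sorted in one launch.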

The link explaining the algorithm (in the file) is broken. The updated one is https://hwlang.de/algorithmen/sortieren/bitonic/bitonicen.htm