NVIDIA / cuda-samples

Samples for CUDA Developers which demonstrates features in CUDA Toolkit

Question on batchSize #302

Open gong-yuan opened 1 month ago

gong-yuan commented 1 month ago

I saw that batchSize appears in multiple places in the repo, e.g.

uint blockCount = batchSize * arrayLength / SHARED_SIZE_LIMIT;

in bitonicSort.cu.

Would it be possible to explain what it means? Given an array of length arrayLength, why would we want to do it batchSize times?

Thank you so much for your consideration.

Best regards, Yuan

ghost commented 4 weeks ago

The expression (batchSize * arrayLength) seems to indicate the total size of the original input, i.e. the number of elements across all batches. So blockCount should be the number of blocks needed to cover that many elements, keeping in mind the size limitation of shared memory per block.

Note that every use of batchSize in the definition of the bitonicSort function appears in an expression of the form (batchSize * arrayLength). If you look at the invocation of bitonicSort, you will see that the argument passed for the batchSize parameter is (N / arrayLength). Also remember that shared memory is allocated per thread block.
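For a concrete picture of the arithmetic, here is a minimal host-side sketch. It is not the sample's code; the values of N and arrayLength are chosen only for illustration, and it assumes the sample's usual SHARED_SIZE_LIMIT of 1024 with SHARED_SIZE_LIMIT / 2 threads per block.

```cpp
// Minimal sketch (illustrative values, not the sample's exact code):
// how batchSize, arrayLength and SHARED_SIZE_LIMIT relate.
#include <cstdio>

#define SHARED_SIZE_LIMIT 1024U  // elements each block can hold in shared memory (assumed)

int main() {
    const unsigned int N = 1048576U;        // total number of keys in the input buffer (assumed)
    const unsigned int arrayLength = 2048U; // length of each independent sub-array to sort (assumed)

    // The host code calls bitonicSort(..., N / arrayLength, arrayLength, dir),
    // so batchSize is the number of independent arrays packed into the buffer.
    const unsigned int batchSize = N / arrayLength;  // 512 batches

    // Each block processes one SHARED_SIZE_LIMIT-sized tile in shared memory,
    // so the grid must cover all batchSize * arrayLength (= N) elements.
    const unsigned int blockCount = batchSize * arrayLength / SHARED_SIZE_LIMIT;  // 1024 blocks
    const unsigned int threadCount = SHARED_SIZE_LIMIT / 2;                       // 512 threads/block

    printf("batchSize=%u blockCount=%u threadCount=%u\n", batchSize, blockCount, threadCount);
    return 0;
}
```

In other words, batchSize does not mean the same array is sorted multiple times; it means the input buffer holds batchSize independent arrays of arrayLength elements each, all sorted in one launch.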

The link explaining the algorithm (in the file) is broken. The updated one is https://hwlang.de/algorithmen/sortieren/bitonic/bitonicen.htm