Make Alpaka workdiv (gridsize, blocksize) more flexible

erikzenker commented 9 years ago

The current design is tightly coupled to a gridsize of 200 blocks and a blocksize of 128 threads:

Because of the bit shift in our block-based ray scheduling, the number of threads per block must be a power of two. A multiply has the same throughput as a bit shift and could be used instead to allow arbitrary block sizes.
The mapPrefixSumToPrisms function assumes that a large number of threads is spawned; otherwise not all values of the prefix sum are mapped.
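
To illustrate the first point, here is a minimal sketch. It assumes the shift is used to compute the index of the first ray handled by a block; the function names and parameters are hypothetical and not taken from the haseongpu source. The shift variant only works for power-of-two block sizes, while the multiply variant gives the same result for those sizes and also works for any other size.

```cpp
#include <cstddef>

// Hypothetical helper (not from the code base): offset of the first ray of a
// block, computed with a shift. Only correct when the number of threads per
// block equals 1 << log2ThreadsPerBlock, i.e. is a power of two.
constexpr std::size_t firstRayShift(std::size_t blockIdx,
                                    unsigned log2ThreadsPerBlock) {
    return blockIdx << log2ThreadsPerBlock;
}

// Same offset computed with a multiply. Works for any block size.
constexpr std::size_t firstRayMul(std::size_t blockIdx,
                                  std::size_t threadsPerBlock) {
    return blockIdx * threadsPerBlock;
}
```

For a block size of 128 (shift by 7) both variants agree; for a block size of 100 only the multiply variant can be used.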

Are there other places in the source code that force a particular grid or block size?

We should eliminate such places as much as we can!

erikzenker commented 9 years ago

Propagation of rays from prism centers to sample points (inside the preimportance calculation) also assumes a large number of threads (see here).
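
One common way to remove this kind of assumption is a strided loop, where each thread processes several elements if fewer threads than elements were launched. The following is only a host-side sketch of that pattern, not the fix that was actually applied: `stridedMap`, `launch`, and the doubling used as per-element work are hypothetical placeholders, and the "launch" is emulated serially.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a kernel body: each "thread" starts at its own
// id and advances by the total number of threads, so every element is
// visited no matter how many threads were launched.
void stridedMap(std::size_t threadId, std::size_t totalThreads,
                const std::vector<int>& in, std::vector<int>& out) {
    for (std::size_t i = threadId; i < in.size(); i += totalThreads) {
        out[i] = in[i] * 2;  // placeholder for the real per-element work
    }
}

// Serial emulation of a launch with gridSize * blockSize threads.
std::vector<int> launch(std::size_t gridSize, std::size_t blockSize,
                        const std::vector<int>& in) {
    std::vector<int> out(in.size(), 0);
    const std::size_t totalThreads = gridSize * blockSize;
    for (std::size_t t = 0; t < totalThreads; ++t) {
        stridedMap(t, totalThreads, in, out);
    }
    return out;
}
```

With this pattern, `launch(2, 3, in)` and `launch(200, 128, in)` produce the same output for the same input; only the amount of work per thread changes.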

erikzenker commented 9 years ago

Fixed in topic-alpaka

ComputationalRadiationPhysics / haseongpu