eyalroz / cuda-kat

CUDA kernel author's tools
BSD 3-Clause "New" or "Revised" License

Consider unifying the constexpr and non-constexpr math functions #60

Open eyalroz opened 4 years ago

eyalroz commented 4 years ago

Some of our math functions have two implementations: a more efficient one for runtime use (e.g. using PTX instructions), and a less efficient one which is constexpr. I keep them separate both by placing them in two header files and by having a constexpr_ namespace, so that nobody calls the slower implementation by mistake at runtime.

However, in this StackOverflow question, several methods are suggested for detecting and choosing, at compile time, which variation of the function to run. Perhaps we could use that to just have a single ug
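
To make the idea concrete, here is a minimal host-side sketch of one way such a unified function could look. It relies on `std::is_constant_evaluated()` (C++20), falling back to the `__builtin_is_constant_evaluated()` builtin that GCC and Clang provide when the standard function is unavailable. Whether this is one of the methods from the linked question is not something this sketch assumes. All names here (`sketch`, `popcount_runtime`, `popcount_constexpr`, `in_constant_evaluation`) are hypothetical and are not part of cuda-kat; the "fast" path is plain C++ standing in for a PTX-backed implementation, and support for this technique in device code depends on the compiler version.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {

// Hypothetical stand-in for the fast, runtime-only implementation.
// In real device code this is where an intrinsic or PTX instruction
// would be used; it is deliberately NOT constexpr.
inline std::uint32_t popcount_runtime(std::uint32_t x) noexcept
{
    x = x - ((x >> 1) & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    return (x * 0x01010101u) >> 24;
}

// Slower but constexpr-friendly implementation: clears the lowest
// set bit once per iteration.
constexpr std::uint32_t popcount_constexpr(std::uint32_t x) noexcept
{
    std::uint32_t count = 0;
    while (x != 0) {
        x &= x - 1;
        ++count;
    }
    return count;
}

// True when the enclosing call is being evaluated at compile time.
constexpr bool in_constant_evaluation() noexcept
{
#if defined(__cpp_lib_is_constant_evaluated)
    return std::is_constant_evaluated();
#else
    return __builtin_is_constant_evaluated(); // GCC/Clang builtin (assumption)
#endif
}

// Single entry point: picks the constexpr variant during constant
// evaluation and the fast variant otherwise.
constexpr std::uint32_t popcount(std::uint32_t x) noexcept
{
    if (in_constant_evaluation()) {
        return popcount_constexpr(x);
    }
    return popcount_runtime(x);
}

} // namespace sketch

// Forces the compile-time path.
static_assert(sketch::popcount(0xF0u) == 4, "compile-time path");
```

With something along these lines, callers would no longer need to choose between a regular and a `constexpr_` namespace. Note that the check must be an ordinary `if`, not `if constexpr`, since the latter is always evaluated as a constant expression and would always select the slow path.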