Open Klummel69 opened 2 years ago
Hi @Klummel69. I can see the value in supporting those platforms. I'd indeed prefer one of the "bit hack" solutions over the naive loop, but I'd rather avoid adding preprocessor macros, so a) I'd want to benchmark the difference (it's possible the bit hacks are faster), and b) I'd like the code to be generalized to the templated type. I'll have to think about it, actually understand the bit hacks and experiment a bit.
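To make the "bit hack generalized to the templated type" idea concrete, here is a minimal sketch of a branch-halving search that needs neither intrinsics nor preprocessor macros. The name highest_bit_index is hypothetical and not part of the library. The sketch assumes an unsigned integer type whose bit width is a power of two, and a non-zero input.

```cpp
#include <limits>
#include <type_traits>

// Hypothetical helper (not the library's API): returns the zero-based index
// of the highest set bit of `value` by halving the search window each step.
// Precondition: value != 0 (for 0 this sketch returns 0, which is meaningless).
template <typename T>
inline int highest_bit_index(T value) noexcept
{
    static_assert(std::is_unsigned<T>::value, "T must be an unsigned integer type");
    static_assert((std::numeric_limits<T>::digits & (std::numeric_limits<T>::digits - 1)) == 0,
                  "this sketch assumes a power-of-two bit width");

    int result = 0;
    // For a 64-bit type the shifts are 32, 16, 8, 4, 2, 1.
    for (int shift = std::numeric_limits<T>::digits / 2; shift > 0; shift /= 2) {
        if ((value >> shift) != 0) {
            // The highest bit lies in the upper half: discard the lower half.
            value = static_cast<T>(value >> shift);
            result += shift;
        }
    }
    return result;
}
```

The loop runs log2(width) times regardless of the input, so it takes 6 iterations for a 64-bit value where the naive loop may take up to 64. Whether it comes close to the intrinsic would still have to be benchmarked.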
When calling the function sqrt() I stumbled across a message from my compiler: Error: Unsupported intrinsic: llvm.ctlz.i64
Reason: the function find_highest_bit() uses intrinsic functions (for performance reasons). I use a modified gcc here which does not support intrinsic calls.
I have adapted my code as follows; perhaps this would also be an option for the main branch:
Defining the preprocessor switch FPM_NO_INTRINSIC disables the intrinsic calls, so the code also works on many compilers that are not supported so far.
(Admittedly, the code is slow. If necessary, I can provide an optimized version using bitmasks.)
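For readers following along, a preprocessor switch of this kind could look roughly like the sketch below. It is an illustration under assumptions, not the code from this report: the signature of find_highest_bit() is assumed from the discussion, the intrinsic branch assumes the GCC/Clang builtin __builtin_clzll, and the fallback is the naive shift loop.

```cpp
#include <limits>

// Sketch only: the signature is an assumption based on this thread.
// Returns the zero-based index of the highest set bit; value must be non-zero.
inline long find_highest_bit(unsigned long long value) noexcept
{
#if !defined(FPM_NO_INTRINSIC) && (defined(__GNUC__) || defined(__clang__))
    // Fast path: count leading zeros with the compiler builtin.
    return std::numeric_limits<unsigned long long>::digits - 1 - __builtin_clzll(value);
#else
    // Portable fallback: shift right until the value is exhausted.
    long result = -1;
    while (value != 0) {
        value >>= 1;
        ++result;
    }
    return result;
#endif
}
```

Compiling with -DFPM_NO_INTRINSIC selects the loop even on compilers that advertise __GNUC__ but cannot lower the builtin.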
One question: is there a reason why find_highest_bit() returns long instead of int?