Closed kimwalisch closed 4 years ago
Actually our signed branchfree algorithm already supported +1 and -1 dividers, we were just unaware of it. I have now removed the checks that disallow using +1 and -1 as signed branchfree dividers in libdivide.h
and enabled +1 and -1 testing for signed branchfree dividers.
Currently the branchfree divider cannot be +1 and -1 (our current implementation calls
exit(-1)
if the user tries to generate a branchfree divider for +-1). When the branchfree divider was first implemented 3 years ago we did not find a way to support +-1 dividers and our branchfree algorithm currently returns 0 if the divider is 1.But the recent pull request https://github.com/ridiculousfish/libdivide/pull/49 from @yb303 contains proof of concept code that shows that it is possible to support +-1 branchfree dividers. The basic idea is to add the numerator to the result if the divider is 1 (in this case the result is 0).
In order to efficiently and cleanly implement +-1 branchfree support the best solution seems to be to add a new
int8_t one
bitmask to the branchfree divider structs that is set to -1 if the divider is 1 and to 0 otherwise. Using this newone
bitmask it is possible to conditionally add the numerator to the result using just 3 additional assembly instructions on x86.If the algorithm is implemented as shown above there is a performance slowdown of about 20%. If the algorithm is implemented without adding a new variable to the branchfree structs there is a performance slowdown of 35 to 40% (and the implementation must use hacks because there are no bits left in the
more
variable). Adding a new field to the branchfree structs is not ideal from a clean code perspective since we cast branchfree dividers to ordinary dividers inlibdivide.h
.Personally I am using the
u64
branchfree divider in two of my projects and I don't need +-1 support. Hence for my personal use case not supporting +-1 is actually a feature since my code will run faster without it. Hence for the time being I will not implement +-1 branchfree divider support.If you would like to use the branchfree divider in your project and you need +-1 support, please let us know and leave a comment below. You can also leave a comment if you don't care about +-1 branchfree divider support and prefer better performance.