As discussed in https://github.com/ridiculousfish/libdivide/issues/12 I have made a modification to `libdivide_u*_branchfree_do()` so that it uses 1 instruction less. There are 2 possible ways to implement this:

1. Make minor modifications in `libdivide_u*_branchfree_gen()`, `libdivide_u*_branchfree_recover()` and `libdivide_u*_branchfree_do()`.
2. Implement a separate `libdivide_internal_u*_branchfree_gen` (probably some other functions need to be duplicated as well).

I chose the first solution because it requires less code and is much easier to implement. Though I have to agree that the second solution would be cleaner at the expense of duplicating some code.

Please let me know what you think about these code changes.
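
For readers who have not seen the branchfree scheme, here is a minimal standalone sketch of the general idea for 32-bit unsigned division. It is not the patch itself and does not use libdivide's types: `bf_u32_t`, `bf_u32_gen` and `bf_u32_do` are hypothetical names. It assumes the instruction is saved by storing the final shift amount ready to use at gen time, so that the per-division function needs no masking or adjustment. Whether that matches this change exactly is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type for illustration; not libdivide's actual struct. */
typedef struct {
    uint32_t magic; /* low 32 bits of the 33-bit magic number */
    uint8_t shift;  /* final shift amount, stored ready to use */
} bf_u32_t;

/* Precompute the divider for d >= 2 (d == 1 is not representable here). */
static bf_u32_t bf_u32_gen(uint32_t d) {
    bf_u32_t r;
    uint32_t k = 0; /* k = floor(log2(d)) */
    uint32_t tmp = d;
    assert(d >= 2);
    while (tmp >>= 1) k++;

    if ((d & (d - 1)) == 0) {
        /* Power of two: magic 0 makes q == 0, so t == n >> 1 below.
           One bit of the shift is already spent, hence k - 1. */
        r.magic = 0;
        r.shift = (uint8_t)(k - 1);
    } else {
        /* m = floor(2^(33+k) / d), built from 2^(32+k) to stay in 64 bits. */
        uint64_t pow = (uint64_t)1 << (32 + k);
        uint64_t m = pow / d;
        uint64_t rem = pow % d;
        m = 2 * m + (2 * rem >= d ? 1 : 0);
        /* Truncation drops the implicit 2^32 bit of the 33-bit magic. */
        r.magic = (uint32_t)(m + 1);
        r.shift = (uint8_t)k;
    }
    return r;
}

/* Divide n by the precomputed divider without branches. */
static uint32_t bf_u32_do(uint32_t n, const bf_u32_t *den) {
    uint32_t q = (uint32_t)(((uint64_t)den->magic * n) >> 32); /* mulhi */
    uint32_t t = ((n - q) >> 1) + q; /* (n + q) / 2 without overflow */
    return t >> den->shift;          /* shift used as stored, no masking */
}
```

With this layout, `bf_u32_gen(7)` followed by `bf_u32_do(100, &den)` yields 14. The cost is that any function recovering the original divisor has to account for the adjusted shift, which is the kind of change the first option makes in the `gen`, `recover` and `do` functions.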