Open Kmeakin opened 2 weeks ago
So do you mean this unrolling works for u16?
> So do you mean this unrolling works for u16?

No, I mean if `n /= 16` instead of `n /= 10`.
That's expected, can you try after enabling epilogue vectorization? I am not sure if it is enabled by default.
Consider a function that calculates the number of digits in `n`'s base-10 representation (e.g. as part of a formatting library) by repeatedly dividing `n` by 10 in a loop.

Since `n` is in the range 0-255, the loop will run exactly 1, 2 or 3 times, so the function can be optimized by unrolling the loop manually, and the unrolled form then simplifies further.
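A minimal sketch of these three versions, assuming the function is written in Rust and takes a `u8` (which matches the stated range 0-255). The function names and the exact loop shape are illustrative assumptions, not taken from the report:

```rust
/// Loop form: divide by 10 until nothing is left, counting iterations.
/// For a `u8` the body runs exactly 1, 2 or 3 times.
pub fn digits_loop(mut n: u8) -> u32 {
    let mut count = 0;
    loop {
        count += 1;
        n /= 10;
        if n == 0 {
            break;
        }
    }
    count
}

/// Manually unrolled form: at most three iterations are possible,
/// so the third one needs no exit check.
pub fn digits_unrolled(mut n: u8) -> u32 {
    n /= 10;
    if n == 0 {
        return 1;
    }
    n /= 10;
    if n == 0 {
        return 2;
    }
    3
}

/// Simplified form: the divisions collapse into range comparisons.
pub fn digits_simplified(n: u8) -> u32 {
    if n < 10 {
        1
    } else if n < 100 {
        2
    } else {
        3
    }
}
```

All three functions should agree for every `u8` input, so an exhaustive check over 0..=255 is cheap enough to confirm the rewrite is behavior-preserving.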
LLVM is smart enough to do this optimization if the base is 16, but not for any other base.
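For comparison, the base-16 variant would presumably differ from the base-10 loop only in the divisor. The sketch below assumes the same `u8` input and uses hypothetical function names; it shows the loop form next to the form one would hope the optimizer reduces it to:

```rust
/// Base-16 loop form: identical in shape to the base-10 loop,
/// but dividing by 16 (for unsigned integers, a right shift by 4).
pub fn hex_digits_loop(mut n: u8) -> u32 {
    let mut count = 0;
    loop {
        count += 1;
        n /= 16;
        if n == 0 {
            break;
        }
    }
    count
}

/// Simplified base-16 form: a `u8` has either 1 or 2 hex digits.
pub fn hex_digits_simplified(n: u8) -> u32 {
    if n < 16 {
        1
    } else {
        2
    }
}
```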