Open avg-I opened 6 years ago
Maybe this can be fixed in reassociate or instcombine.
Two zext instructions get in the way:
define dso_local i32 @combine(unsigned short, unsigned short)(i16 zeroext, i16 zeroext) local_unnamed_addr #0 {
  %3 = zext i16 %0 to i32
  %4 = or i32 %3, 1515847680
  %5 = xor i16 %1, 23130
  %6 = zext i16 %5 to i32
  %7 = shl nuw i32 %6, 16
  %8 = xor i32 %7, %4
  ret i32 %8
}
If you change the function argument to be:
uint32_t combine(uint32_t x, uint32_t y)
it produces optimal code, I think:
define dso_local i32 @combine(unsigned int, unsigned int)(i32, i32) local_unnamed_addr #0 {
  %3 = shl i32 %1, 16
  %4 = and i32 %0, 65535
  %5 = or i32 %4, %3
  ret i32 %5
}
or equivalently, in asm:
combine(unsigned int, unsigned int):    # @combine(unsigned int, unsigned int)
        shl     esi, 16
        movzx   eax, di
        or      eax, esi
        ret
Extended Description
It appears that in some code sequences Clang fails to recognize that x xor C xor C == x where C is a compile-time constant.
For instance:
uint32_t combine(uint16_t x, uint16_t y) { uint32_t r = 0x5a5a7777;
}
In this code the initial value of r is not important and the code should be equivalent to:
uint32_t combine(uint16_t x, uint16_t y) { return (x | ((uint32_t)y << 16)); }
But even with -O3 Clang produces the following assembler code:
It's easy to see that 0x5A5A will always cancel out through double application via xor, but Clang does not recognize that.
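The cancellation can be checked mechanically. The sketch below is not the original test case (its body is cut off in this report). It is a line-by-line C transcription of the unoptimized IR quoted in this report, compared against the simplified form. The names combine_ir_transcription, combine_expected and count_mismatches are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: literal transcription of the unoptimized IR
   (1515847680 == 0x5A5A0000, 23130 == 0x5A5A). */
uint32_t combine_ir_transcription(uint16_t x, uint16_t y) {
    uint32_t lo = (uint32_t)x | 0x5A5A0000u;      /* %3 = zext, %4 = or  */
    uint16_t ym = (uint16_t)(y ^ 0x5A5Au);        /* %5 = xor i16        */
    uint32_t hi = (uint32_t)ym << 16;             /* %6 = zext, %7 = shl */
    return hi ^ lo;                               /* %8 = xor            */
}

/* The simplified form the report says the code should reduce to. */
uint32_t combine_expected(uint16_t x, uint16_t y) {
    return x | ((uint32_t)y << 16);
}

/* Hypothetical check: every y, and every 257th x (0, 257, ..., 65535).
   Returns the number of inputs where the two forms disagree. */
unsigned long count_mismatches(void) {
    unsigned long bad = 0;
    for (uint32_t x = 0; x <= 0xFFFFu; x += 257u) {
        for (uint32_t y = 0; y <= 0xFFFFu; ++y) {
            if (combine_ir_transcription((uint16_t)x, (uint16_t)y) !=
                combine_expected((uint16_t)x, (uint16_t)y)) {
                ++bad;
            }
        }
    }
    return bad;
}
```

The two forms agree because the upper 16 bits of the zero-extended x are zero, so the or with 0x5A5A0000 behaves like an xor there, and xoring the shifted (y ^ 0x5A5A) into it cancels the constant.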