llvm / llvm-project


Power-of-two checks use popcnt even if underlying architecture does not support it #94829

Open msinilo opened 3 months ago

msinilo commented 3 months ago

Best illustrated by this Compiler Explorer example: https://gcc.godbolt.org/z/qGzWo39b6 (the original repro used libcxx unordered_map::find). A power-of-two check written as x & (x - 1) may be converted to popcnt, which works fine when compiling for SSE4. When the target does not support popcnt, however, the compiler does not fall back to a simple LEA/TEST sequence; it implements the popcount with bit-twiddling hacks instead.
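For readers without the Compiler Explorer link at hand, here is a minimal sketch of the two forms involved. The function names are made up for illustration and are not taken from the repro. `__builtin_popcountll` is the GCC/Clang builtin, used here only as a stand-in for the popcount form the optimizer may produce.

```cpp
#include <cassert>
#include <cstdint>

// Source-level idiom: clear the lowest set bit and test for zero.
// Note that this is also true for x == 0.
// (Hypothetical name, for illustration only.)
inline bool at_most_one_bit_and(std::uint64_t x) {
    return (x & (x - 1)) == 0;
}

// Popcount form of the same predicate: popcount(x) < 2.
// Uses the GCC/Clang builtin; an assumption about the toolchain.
inline bool at_most_one_bit_popcount(std::uint64_t x) {
    return __builtin_popcountll(x) < 2;
}

// Checks that both forms agree on a small exhaustive range
// and on every single-bit value (with and without bit 0 set).
inline void check_forms_agree() {
    for (std::uint64_t x = 0; x < 4096; ++x) {
        assert(at_most_one_bit_and(x) == at_most_one_bit_popcount(x));
    }
    for (int s = 0; s < 64; ++s) {
        std::uint64_t bit = std::uint64_t{1} << s;
        assert(at_most_one_bit_and(bit) == at_most_one_bit_popcount(bit));
        assert(at_most_one_bit_and(bit | 1) == at_most_one_bit_popcount(bit | 1));
    }
}
```

Both functions compute the same predicate, so on a target without popcnt the first form (a subtract or LEA, an AND, and a test) is the cheaper lowering.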

topperc commented 3 months ago

It appears that before CodeGenPrepare we had

  %3 = tail call range(i64 0, 65) i64 @llvm.ctpop.i64(i64 %1), !dbg !60          
  %4 = icmp ult i64 %3, 2, !dbg !60  

%4 was used by a branch in its own basic block and also in another basic block.

CodeGenPrepare duplicated the branch into the other basic block, causing the ctpop to become live-out of its basic block. Because instruction selection works on one basic block at a time, the full ctpop+icmp pattern was not visible in the second basic block, so the ctpop gets fully expanded.
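To give a sense of the cost difference, here is a rough C++ sketch of a generic popcount expansion next to the cheap form. This is the classic SWAR bit-counting sequence, not necessarily the exact instruction sequence LLVM emits, and the function names are hypothetical.

```cpp
#include <cstdint>

// Classic SWAR population count; needs no POPCNT instruction.
// Illustrative only: the backend's actual expansion may differ.
inline int popcount_swar(std::uint64_t x) {
    // Count bits within each 2-bit field.
    x = x - ((x >> 1) & 0x5555555555555555ULL);
    // Sum adjacent 2-bit counts into 4-bit fields.
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    // Sum adjacent 4-bit counts into bytes.
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    // Add all byte counts together into the top byte.
    return static_cast<int>((x * 0x0101010101010101ULL) >> 56);
}

// What "ctpop(x) < 2" amounts to when the ctpop is expanded on its own.
inline bool at_most_one_bit_expanded(std::uint64_t x) {
    return popcount_swar(x) < 2;
}

// The equivalent test available when ctpop and icmp are matched together.
inline bool at_most_one_bit_cheap(std::uint64_t x) {
    return (x & (x - 1)) == 0;
}
```

The two predicates return the same result for every input, but the expanded version needs a chain of shifts, masks, adds and a multiply, whereas the combined pattern needs only a decrement, an AND and a compare against zero.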

CC: @RKSimon