Open msinilo opened 3 months ago
It appears that before CodeGenPrepare we had:
%3 = tail call range(i64 0, 65) i64 @llvm.ctpop.i64(i64 %1), !dbg !60
%4 = icmp ult i64 %3, 2, !dbg !60
%4 was used by a branch in its own basic block and in another basic block.
CodeGenPrepare duplicated the branch into the other basic block, causing the ctpop to become live-out of its basic block. Because the full ctpop+icmp pattern was no longer visible in the second basic block, the ctpop gets fully expanded.
CC: @RKSimon
Best illustrated by this Compiler Explorer example: https://gcc.godbolt.org/z/qGzWo39b6 (the original repro used libcxx unordered_map::find). x&(x-1) might be converted to popcnt, which works OK when compiling for SSE4. If not, rather than just emitting LEA/TEST, the backend implements the popcount using bit-twiddling hacks.
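For readers who want the two forms side by side, here is a minimal C++ sketch. It is not taken from the godbolt link or from libcxx, and the function names (`is_pow2_or_zero_cheap`, `popcount_swar`, `is_pow2_or_zero_via_popcount`, `agree_up_to`) are hypothetical. It assumes the "bit twiddling hacks" expansion looks like the classic SWAR popcount; the exact sequence LLVM emits may differ. The point is only that `ctpop(x) < 2` and `(x & (x - 1)) == 0` give the same answer, while the second needs far fewer operations when no popcnt instruction is available.

```cpp
#include <cstdint>

// Cheap form: clears the lowest set bit and tests for zero.
// True for 0 and for any power of two.
constexpr bool is_pow2_or_zero_cheap(std::uint64_t x) {
    return (x & (x - 1)) == 0;
}

// Classic SWAR popcount, used here as a stand-in (an assumption) for the
// generic expansion of ctpop on targets without a popcnt instruction.
constexpr std::uint64_t popcount_swar(std::uint64_t x) {
    x = x - ((x >> 1) & 0x5555555555555555ULL);                           // 2-bit sums
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL); // 4-bit sums
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;                           // 8-bit sums
    return (x * 0x0101010101010101ULL) >> 56;                             // add all bytes
}

// The canonical IR form from the comment above: icmp ult (ctpop x), 2.
constexpr bool is_pow2_or_zero_via_popcount(std::uint64_t x) {
    return popcount_swar(x) < 2;
}

// Checks that both predicates agree on 0..n-1 and on every single-bit value.
constexpr bool agree_up_to(std::uint64_t n) {
    for (std::uint64_t x = 0; x < n; ++x) {
        if (is_pow2_or_zero_cheap(x) != is_pow2_or_zero_via_popcount(x)) return false;
    }
    for (int bit = 0; bit < 64; ++bit) {
        const std::uint64_t x = std::uint64_t{1} << bit;
        if (!is_pow2_or_zero_cheap(x) || !is_pow2_or_zero_via_popcount(x)) return false;
    }
    return true;
}

static_assert(popcount_swar(0) == 0, "no bits set");
static_assert(popcount_swar(0xFF) == 8, "one full byte");
static_assert(popcount_swar(~std::uint64_t{0}) == 64, "all bits set");
```

The popcount route costs roughly a dozen shifts, masks, and a multiply, against one subtract (or LEA) and one AND/TEST for the cheap form, which is why losing sight of the ctpop+icmp pair hurts on targets without popcnt.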