llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.47k stars 11.77k forks source link

Wrong code at -O1 on x86_64-linux_gnu since LLVM-13 #76789

Closed shao-hua-li closed 9 months ago

shao-hua-li commented 9 months ago

Clang at -O1 produces the wrong code since LLVM-13.

Bisected to 8f3d16905d75b07a933d01dc29677fe5867c1b3e, which was committed by @efriedma-quic

Compiler explorer: https://godbolt.org/z/35no4xxfK

% cat a.c
int printf(const char *, ...);
char a;
short b;
static short *c = &b;
static short **f = &c;
int g;
int h(char *j, long k) {
  int d = 0;
  char *e = j + k;
  for (; j < e; j++)
    d = (d << 4) + *j;
  return d;
}
int l(char j, long k) {
  int i = h(&j, k);
  return i;
}
int m(void);
void n() { m(); }
int m() {
  int o;
  char p = b = 4;
  for (;;) {
    g = 0;
    for (; g <= 4; g++) {
      p = 0;
      for (; p <= 5; p++)
        o = l(1, **f - 3);
      a = (6 || 0) & o;
    }
    break;
  }
  short ***s = &f, ***q = s;
  return &s != &q;
}
int main() {
  n();
  printf("%d\n", a);
}
%
% clang -O0 a.c && ./a.out
1
% clang -O1 a.c && ./a.out
0
%
nikic commented 9 months ago

This basically reduces to:

define i8 @test() {
  %a = alloca i8
  store i8 42, ptr %a
  %a.int = ptrtoint ptr %a to i64
  %a.int.1 = add i64 %a.int, 1
  %gep = getelementptr i8, ptr inttoptr (i64 -1 to ptr), i64 %a.int.1
  %load1 = load i8, ptr %gep
  store i8 1, ptr %a
  %load2 = load i8, ptr %gep
  %sub = sub i8 %load1, %load2
  ret i8 %sub
}

-passes=gvn optimizes this to 0. (It actually happens in LICM, but basically the same issue.)

It's probably due to this check: https://github.com/llvm/llvm-project/blob/54378a7c2fd7f0ed0a3ea7ef08bc24896700e2c5/llvm/lib/Analysis/BasicAliasAnalysis.cpp#L1507-L1510

I'm not sure whether this is a miscompile or not. It depends on the precise semantics of provenance around ptrtoint/inttoptr. It would be a miscompile under exposing ptrtoint + angelic inttoptr.

(I haven't checked where the inttoptr pattern comes from, possibly it shouldn't be produced in the first place.)

nikic commented 9 months ago

The inttoptr is introduced by InstCombine like this: https://llvm.godbolt.org/z/nYdKKo7Ez

I'm inclined to say that the bug is in BasicAA. I believe our current "model" is to be conservative with inttoptr and assume it can access all escaped objects.