chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

LLVM vs C backend inconsistencies #22288

Open jabraham17 opened 1 year ago

jabraham17 commented 1 year ago

While investigating #22250, I found some differences in the code generation for our C and LLVM backends that may cause issues in extreme cases. This was further expanded on in #22274. I will summarize it here.

In C, there are a number of cases where integer promotion may occur. Integer promotion is the process of converting a smaller integral type (like short or char) to int. For a number of binary operators (additive, multiplicative, and some bitwise), integer promotion of the operands always occurs as part of the usual arithmetic conversions. This is specified in the C standard (C99 6.3.1.8).

For example, this function:

char foo(char a, char b) {
  return a + b;
}

is actually

char foo(char a, char b) {
  return (char)((int)a + (int)b);
}
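A quick way to confirm the promotion is to ask the compiler for the type of the sum. This is a minimal C11 sketch, not anything from the Chapel code base; the helper names char_sum_is_int and short_sum_is_int are made up for illustration, and it assumes a mainstream platform where int is wider than char and short.

```c
#include <assert.h>

/* The sum of two chars has the size of int, not char (checked at compile time). */
static_assert(sizeof((char)1 + (char)1) == sizeof(int),
              "char + char is computed at int width");

/* Hypothetical helper: returns 1 if the expression a + b has type int.
   The controlling expression of _Generic is not evaluated, only typed. */
int char_sum_is_int(char a, char b) {
    return _Generic(a + b, int: 1, default: 0);
}

/* Same check for short operands. */
int short_sum_is_int(short a, short b) {
    return _Generic(a + b, int: 1, default: 0);
}
```

Both helpers should return 1, since the operands are promoted before the addition and the narrowing back to char only happens at the return in foo.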

Using clang, this is converted to the following LLVM IR:

define signext i8 @foo(i8 noundef signext %0, i8 noundef signext %1) {
  %3 = sext i8 %0 to i32
  %4 = sext i8 %1 to i32
  %5 = add nsw i32 %3, %4
  %6 = trunc i32 %5 to i8
  ret i8 %6
}

Under optimization, this goes away as the compiler realizes these sext and trunc are not needed to preserve the original semantics. But for certain cases, like srem, they are needed to preserve the original semantics and prevent undefined behavior (see the comments in the linked posts about this).

Ultimately, this means that the Chapel C and LLVM backends are generating code that could mean different things. The C backend generates (a%b), which the C compiler evaluates after promotion, while the LLVM backend generates srem a, b at the original width, and these do not always map to the same assembly.
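To make the difference concrete, here is a small C sketch of what the promoted form computes for 8-bit operands. The input pair (-128, -1) is my assumption about the kind of extreme case meant here, and rem_promoted and quot_promoted are hypothetical helpers, not Chapel-generated code. With promotion, the operation happens at int width, where -128 / -1 is representable, so the remainder is well defined. An srem on i8 with the same operands overflows, which the LLVM language reference describes as undefined behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors what a C compiler does for `a % b` on 8-bit operands:
   both are promoted to int before the remainder is taken. */
int8_t rem_promoted(int8_t a, int8_t b) {
    assert(b != 0); /* a zero divisor is undefined at any width */
    return (int8_t)((int)a % (int)b);
}

/* The promoted quotient, kept as int so the out-of-range value is visible. */
int quot_promoted(int8_t a, int8_t b) {
    assert(b != 0);
    return (int)a / (int)b;
}
```

For example, rem_promoted(INT8_MIN, -1) evaluates to 0, and quot_promoted(INT8_MIN, -1) evaluates to 128, a value that does not fit in int8_t. That out-of-range quotient is why the same operation performed directly at 8 bits is a problem.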

Is this an issue that concerns people? I think in most cases users will not run into issues because of this, but it is a curious result of using C.

bradcray commented 1 year ago

I agree that this is a general concern, yes. Arguably one of the goals of our testing is to have the two backends behave as consistently as possible, but that's only as good as the tests themselves. There's also the potential for differences between different back-end C compilers (if we rely on areas of the language where the implementation has a choice of behaviors), though hopefully those will be even more minimal and less impactful.