LLVM vs C backend inconsistencies

While investigating #22250, I found some differences in the code generation for our C and LLVM backends that may cause issues in extreme cases. This was further expanded on in #22274. I will summarize it here.

In C, there are a number of cases where integer promotion may occur. Integer promotion is the process of converting a smaller integral type (like short or char) to int. For a number of binary operators (additive, multiplicative, and some bitwise), operands narrower than int are always promoted as part of the usual arithmetic conversions. This is specified in the C standard (C99 6.3.1.1 for the promotions themselves, C99 6.3.1.8 for the usual arithmetic conversions).

For example, this function:

char foo(char a, char b) {
  return a + b;
}

is actually equivalent to

char foo(char a, char b) {
  return (char)((int)a + (int)b);
}
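The promotion can be observed directly from C. The following is a minimal sketch, not taken from the linked issues; the function names are made up for illustration, and it assumes an 8-bit char and an int wider than char:

```c
#include <stddef.h>

/* Size of the *expression* a + b. Both operands are promoted to int
   before the addition, so this is sizeof(int), not sizeof(char).
   (sizeof does not evaluate its operand.) */
size_t sum_expr_size(char a, char b) {
  return sizeof(a + b);
}

/* The addition happens in int, so 200 + 100 gives 300 with no wraparound. */
int sum_as_int(unsigned char a, unsigned char b) {
  return a + b;
}

/* Only the conversion back to the narrow type reduces the value
   (300 mod 256 == 44, assuming an 8-bit unsigned char). */
unsigned char sum_as_uchar(unsigned char a, unsigned char b) {
  return (unsigned char)(a + b);
}
```

Calling sum_as_int(200, 100) and sum_as_uchar(200, 100) side by side shows that the narrowing happens only at the final conversion, not during the addition itself.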

Using clang, this is compiled to the following LLVM IR:

define signext i8 @foo(i8 noundef signext %0, i8 noundef signext %1) {
  %3 = sext i8 %0 to i32
  %4 = sext i8 %1 to i32
  %5 = add nsw i32 %3, %4
  %6 = trunc i32 %5 to i8
  ret i8 %6
}

Under optimization, the widening goes away for this function, as the compiler realizes the sext and trunc are not needed to preserve the original semantics: the low 8 bits of the sum are the same whether the add is done at 8 or 32 bits. But for certain operations, like srem, they are needed to preserve the original semantics and prevent undefined behavior (see the comments in the linked posts about this).

Ultimately, this means that the Chapel C and LLVM backends are generating code that could mean different things. With the C backend, Chapel generates (a%b), which the C compiler evaluates after integer promotion; with the LLVM backend, Chapel generates srem a, b at the operands' original width. These do not always map to the same assembly.
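To make the srem case concrete, here is a small sketch in C of the one input pair where the two forms differ for 8-bit operands. It is an illustration only, not code from either backend: rem_promoted and rem_i8_is_defined are hypothetical helpers, and the second one simply encodes the LLVM rule that srem is undefined on division by zero and on overflow (the minimum value divided by -1).

```c
#include <stdint.h>

/* What a C compiler computes for a % b on 8-bit operands: both are
   promoted to int first, so INT8_MIN % -1 is an ordinary int remainder
   and is well defined (the quotient 128 fits in int). */
int8_t rem_promoted(int8_t a, int8_t b) {
  return (int8_t)((int)a % (int)b);
}

/* Hypothetical check: would a remainder performed directly at 8-bit
   width (as in `srem i8 a, b`) be well defined for these inputs?
   It is not for b == 0 or for INT8_MIN with -1, since the quotient
   128 does not fit in 8 bits. */
int rem_i8_is_defined(int8_t a, int8_t b) {
  return b != 0 && !(a == INT8_MIN && b == -1);
}
```

For a = INT8_MIN and b = -1, rem_promoted returns 0, while rem_i8_is_defined reports that the same operation at 8-bit width has no defined result. For all other nonzero divisors the two forms agree.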

Is this an issue that concerns people? I think in most cases users will not run into issues because of this, but it is a curious result of using C.

chapel-lang / chapel

LLVM vs C backend inconsistencies #22288