llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

[Clang] SIGSEGV with clang-trunk at -Os; Unexpected Behavior at -O1 vs Expected Infinite Loop at -O0 and gcc -O2 #117300

Open wangbo15 opened 1 week ago

wangbo15 commented 1 week ago

The following code triggers a SIGSEGV when compiled with clang-trunk and clang 19.1.0 using the -Os optimization level. With -O1, the program exits with a return value of 64. However, when compiled with clang -O0 or gcc -O2, the program behaves as expected and enters an infinite loop.

Additionally, the behavior also differs on the ARM backends.

#include <memory> 
unsigned long long a;
void b(unsigned long long *, int) {}
class {
public:
  short c[10];
} e;
int main() {
  for (size_t d;;)
    b(&a, e.c[d]);
}

Please see: https://godbolt.org/z/79fzqePEv

nikic commented 6 days ago

This looks like the usual "side-effect free infinite loops are UB", but cc @AaronBallman as I'm not sure whether that applies to for (;;) as well or not.

AaronBallman commented 6 days ago

This is undefined behavior because the variable used to index the array is uninitialized, but let's pretend that's initialized so we're back to just looking at the loop.

https://eel.is/c++draft/basic.exec#intro.progress-1.6 says the program has to make forward progress if the infinite loop is trivial. https://eel.is/c++draft/stmt.iter.general#3 says this infinite loop is not trivial (because the loop has a non-empty body).

So I believe the loop is also UB.

wangbo15 commented 6 days ago

Based on the standard you referenced, I believe this is not UB. If we modify the program as shown below, making the loop non-trivial by having the function b produce side effects, the behavior still differs across optimization levels: clang -O0, clang -Os, and clang -O1 each behave differently.

#include <memory> 
unsigned long long a;
unsigned long long g;

void b(unsigned long long *a, int b) {  g++; }
class {
public:
  short c[10];
} e;
int main() {
  for (size_t d = 0;;)
    b(&a, e.c[d]);
}

nikic commented 6 days ago

b does not produce "side effects" in the sense of the standard. You'd have to make g volatile for that (or similar).

wangbo15 commented 6 days ago

If g is changed to volatile unsigned long long g, the return values are the same.

AaronBallman commented 6 days ago

That's insufficient to produce the side effects though. Here's a more complete example without UB: https://godbolt.org/z/fvvK4KhYc
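
For readers without access to the link, here is a rough sketch of the two fixes discussed in this thread. It is not the code behind the godbolt link above, which is not reproduced here. It assumes that the index d is initialized and that b performs an access through a volatile glvalue, which is one of the operations [intro.progress] lists as forward progress. The helper names iterate_once and run_forever are hypothetical and only serve to make one iteration callable on its own.

```cpp
#include <cassert>
#include <cstddef>

unsigned long long a;
// volatile: every read or write of g is an access through a volatile glvalue.
volatile unsigned long long g;

struct {
  short c[10];
} e;

// Variant of b() from the thread. The explicit read-then-write avoids the
// increment operator on a volatile object, which is deprecated since C++20.
void b(unsigned long long *, int) { g = g + 1; }

// Hypothetical helper: one loop iteration, so it can be exercised in isolation.
void iterate_once(std::size_t d) {
  assert(d < 10);  // the index is initialized and within the bounds of e.c
  b(&a, e.c[d]);
}

// Hypothetical helper: the infinite loop itself. Under the assumptions above,
// each iteration performs a volatile access, so the loop should not be
// removable on forward-progress grounds.
[[noreturn]] void run_forever() {
  for (std::size_t d = 0;;)
    iterate_once(d);
}
```

Calling run_forever() from main is expected to spin at every optimization level, matching the -O0 behavior described in the original report.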