llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.99k stars 11.95k forks source link

Over-aggressively optimization on infinite loops #60622

Open FrankHB opened 1 year ago

FrankHB commented 1 year ago

Case

// a.cc
#include <iostream>

int main()
{
    while(true)
        ; 
}

void unreachable()
{
    std::cout << "Hello world!" << std::endl;
}

Reproduction

clang++ -O1 a.cc
./a.out

This will show "Hello world!".

Tested on Godbolt. This issue is reproducible with x86-64 Clang++ since 13.0.0.

Discussions

It might be well-known that ISO C++ permit an implementation to remove infinite loop without any further reasoning and proof of the termination under the abstraction machine semantics. However, there is no wording about assuming it having undefined behavior, so the transformation is limited. The empty loop can be removed, but it should not introduce any differences on observable behavior besides the removal of the loop and such difference (if any; not here because there is no code after the loop) should be considered unspecifed but not undefined. In this case, it is valid to transform the translation unit as if it consists of int main(){}, but not with the call to unreachable.

Related

A similar issue #60419 was reported. Besides not that "minimal" in its case, the root cause seems the same. However, there are some other subtleties.

The reply for that issue is technically incorrect. First, it refers to a proposal of WG14, not the standard or any of its draft; second, it is about C, but C and C++ are still different here, even the original paper (WG14 N1528) do have concerns on liaison to WG21.

C and C++ are different in the meta level here. In ISO C, if something the language rules should have specified is actually underspecified, it is undefined, as per Clause 4 (same in many editions):

If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a constraint or runtime-constraint is violated, the behavior is undefined. Undefined behavior is otherwise indicated in this International Standard by the words ‘‘undefined behavior’’ or by the omission of any explicit definition of behavior. There is no difference in emphasis among these three; they all describe ‘‘behavior that is undefined’’.

As WG14 N1528 has clarified, the permission of removal the empty loops is granted, any further side effects of the unexpected change of the transfomation should not be depend on. Otherwise, it is undefined by "omission of any explicit definition or behavior" in mind. Despite the belief in the recommendation section of WG14 N1528, it is still confusing and potentionlly debatable, especially when compared to ISO C++ rules. Note the transformation on the original case in WG14 N1509 as well as this case is permitted, but it does not have to rely any undefined behavior in C, because the remained program does not contribute to any observable behaviors.

There is no such equivalent rules in ISO C++. Moreover, it should be clear the meaning of "may" in the C++ rule is clear as per Table 5 in ISO/IEC directives, part 2. That is, there is nothing to do with voiding the other requirements on a conforming implementation. As a result, the compiler cannot assume anything as it is undefined. Hence, it is a bug of Clang++.

60419 should not be marked as "invalid". And it can be marked "duplicate" in the sense that the case in that issue is also well-formed in C++.

Currently G++ does not have the same issue. It simply does not do the transformation. This is about missing-of-optimization, but not conformance.

inclyc commented 1 year ago

The problem here may not related to "infinite loops" at all. Just replace the loop to "__builtin_unreachable();" you can see the same result.

#include <iostream>

int main()
{
    __builtin_unreachable();
}

void unreachable()
{
    std::cout << "Hello world!" << std::endl;
}

In this case, the binary GCC-generated also prints "Hello world!". So we now have two problems:

  1. Is the IR transformation correct? infinite loop -> __builtin_unreachable()
  2. As for unreachable IR, is the generated code correct?

  1. Is the IR transformation correct? infinite loop -> __builtin_unreachable()

I think it is legal because that's how optimizing compilers work. However, GCC does not do the same thing as LLVM IR optimizer.

  1. As for unreachable IR, is the generated code correct?

I'm not sure about this. But looks like GCC agreed with us. https://gcc.godbolt.org/z/Whfr7fv4a. We leave an empty label "main:" and instructions defined in "void unreachable()" executed.

Fare9 commented 1 year ago

I have tried to emulate this optimization with opt, but for what I've seen it doesn't happen applying the optimizations in that way:

$ clang++ -S -emit-llvm test_infinite_loop.cpp -o test_infinite_loop.ll
$ opt -passes='default<O1>' test_infinite_loop.ll -o test_infinite_loop2.bc
$ llc test_infinite_loop2.bc -o test_infinite_loop2.s
$ clang++ -no-pie test_infinite_loop2.s -o test_infinite_loop2
$ ./test_infinite_loop2  
^C # stop the infinite loop

$ opt --O1 test_infinite_loop.ll -o test_infinite_loop2.bc
$ llc test_infinite_loop2.bc -o test_infinite_loop2.s
$ clang++ -no-pie test_infinite_loop2.s -o test_infinite_loop2
$ ./test_infinite_loop2 
^C # stop the infinite loop

Could be the order that optimization passes are applied in Clang++ when compiling with -O1?

gregbedwell commented 1 year ago

It's worth mentioning that there is a mechanism in the compiler to trap in these situations. I remember that we enabled it for PS4 in f81836bd18be6fb8bc6fe9942053f418dbb13d98 in addition to when it was previously just enabled on Win64. I'm not sure of the arguments for or against enabling it more generally, but if you specify the PS4 triple you'll see

main:                                   # @main
        ud2
unreachable():                       # @unreachable()
        ...
inclyc commented 1 year ago

I have tried to emulate this optimization with opt, but for what I've seen it doesn't happen applying the optimizations in that way:

opt just simplify the loop into IR unreachable. The behavior here may introduced by blank-label and a fall-through path for Machine-spec instructions.

main:
unreachable:

They are the same label and "unreachable" is the fall-through path for "main".

shafik commented 1 year ago

Note recent bug report: https://github.com/llvm/llvm-project/issues/60588 and related bug report although in this case it was signed overflow being replaced by unreachable: https://github.com/llvm/llvm-project/issues/48943#issuecomment-981039958

inclyc commented 1 year ago

Note recent bug report: #60588 and related bug report although in this case it was signed overflow being replaced by unreachable: #48943 (comment)

I think this issue duplicates to #48943 due to unreachables "fallthrough" next function. The transformation (loop -> unreachable) is correct.

FrankHB commented 1 year ago

The problem here may not related to "infinite loops" at all. Just replace the loop to "__builtin_unreachable();" you can see the same result.

It is. The core problem is that infinite loops are not the same to evaluation __builtin_unreachable();, given that the latter is intended as the violation of [[noreturn]] which has undefined behavior definitely.

#include <iostream>

int main()
{
    __builtin_unreachable();
}

void unreachable()
{
    std::cout << "Hello world!" << std::endl;
}

In this case, the binary GCC-generated also prints "Hello world!". So we now have two problems:

  1. Is the IR transformation correct? infinite loop -> __builtin_unreachable()
  2. As for unreachable IR, is the generated code correct?
  1. Is the IR transformation correct? infinite loop -> __builtin_unreachable()

I think it is legal because that's how optimizing compilers work. However, GCC does not do the same thing as LLVM IR optimizer.

It isn't. They are different cases.

  1. As for unreachable IR, is the generated code correct?

I'm not sure about this. But looks like GCC agreed with us. https://gcc.godbolt.org/z/Whfr7fv4a. We leave an empty label "main:" and instructions defined in "void unreachable()" executed.

This is not suspicous to me.

inclyc commented 1 year ago

I think it is legal because that's how optimizing compilers work. However, GCC does not do the same thing as LLVM IR optimizer.

It isn't. They are different cases.

Dead loops without side-effects are UB. Why couldn't we leave unreachable here?

FrankHB commented 1 year ago

I think it is legal because that's how optimizing compilers work. However, GCC does not do the same thing as LLVM IR optimizer.

It isn't. They are different cases.

Dead loops without side-effects are UB. Why couldn't we leave unreachable here?Teh

There are no such rules in the standard AFAIK.

FrankHB commented 1 year ago

It's worth mentioning that there is a mechanism in the compiler to trap in these situations. I remember that we enabled it for PS4 in f81836b in addition to when it was previously just enabled on Win64. I'm not sure of the arguments for or against enabling it more generally, but if you specify the PS4 triple you'll see

main:                                   # @main
        ud2
unreachable():                       # @unreachable()
        ...

You may specify -mllvm -trap-unreachable. The generated code is similar.

This does not address the issue that the loop should not be just transformed to unreachable blindly, though.

inclyc commented 1 year ago

This does not address the issue that the loop should not be just transformed to unreachable blindly, though.

@shafik Do you think it is legal to transform the loop here according to standards?

shafik commented 1 year ago

This does not address the issue that the loop should not be just transformed to unreachable blindly, though.

@shafik Do you think it is legal to transform the loop here according to standards?

I believe there is intent to address this through a paper or DR: https://twitter.com/jfbastien/status/1623465321997963264

I believe the prevailing sentiment was this was a valid optimization for C++ but this could change upon review by Core and or EWG.

CC @jfbastien

kerzol commented 1 year ago

The following code segfaults.

// compile with
// clang++ -Wall -O1 test.cpp -o test
// Tested with clang version 14.0.6

#include <iostream>

int main() {
  std::cout << "Why segfault?" << std::endl;
  while(1)
   ;
  std::cout << "Never ici." << std::endl;
}

void reachable() {
  std::cout << "Hello World" << std::endl;
}

Output:

Why segfault?
Hello World
[1]    194766 segmentation fault (core dumped)  ./test
inclyc commented 1 year ago

Recently many new issues filed here duplicated. Lets talk about this problem in #48943.

60637

60622

60588

48943

FrankHB commented 1 year ago

Recently many new issues filed here duplicated. Lets talk about this problem in #48943.

60637 #60622 #60588 #48943

Probably better to discuss there, but I disagree to treat this issue duplicate before a CWG defect report is filed with a proposed resolution to normatively treat this UB.

At least for now, this case is still revealing a conformance bug specific to the frontend. Neither of the remaining is focusing on this point.

pbo-linaro commented 1 year ago

Indeed, the problem is that the loop gets optimized out, and the main function does not have a ret instruction that was generated, thus leaving an empty body, and falling in next symbol.

Regarding the loop being optimized out, bisect found it started from https://github.com/llvm/llvm-project/commit/6c3129549374c0e81e28fd0a21e96f8087b63a78.

frederick-vs-ja commented 1 year ago

The normative wording says

The implementation may assume that any thread will eventually do one of the following: ...

This program violates such a permitted assumption. I don't know which requirements the standard still imposes.

I reported cplusplus/draft#5377, and the current CWG Chair replied me without processing it furtherly.

FrankHB commented 1 year ago

The normative wording says

The implementation may assume that any thread will eventually do one of the following: ...

This program violates such a permitted assumption. I don't know which requirements the standard still imposes.

I reported cplusplus/draft#5377, and the current CWG Chair replied me without processing it furtherly.

Violation of a permission does not imply undefined behavior, at least with current rules of ISO C++. Let's discuss the issue of the standard there.

inclyc commented 1 year ago

Let's discuss the issue of the standard there.

Fine. I reopened the issue to raise some attention. Let's discuss about the optimization in #48943, and the standard changes in this issue.

inclyc commented 1 year ago

CC @AaronBallman @erichkeane

AaronBallman commented 1 year ago

I think it is legal because that's how optimizing compilers work. However, GCC does not do the same thing as LLVM IR optimizer.

It isn't. They are different cases.

Dead loops without side-effects are UB. Why couldn't we leave unreachable here?Teh

There are no such rules in the standard AFAIK.

C2x 6.8.5p5:

An iteration statement may be assumed by the implementation to terminate if its controlling expression is not a constant expression198), and none of the following operations are performed in its body, controlling expression or (in the case of a for statement) its expression-3199): — input/output operations — accessing a volatile object — synchronization or atomic operations. 198)An omitted controlling expression is replaced by a nonzero constant, which is a constant expression. 199)This is intended to allow compiler transformations such as removal of empty loops even when termination cannot be proven.

C++2b [intro.progress]p7:

For a thread of execution providing concurrent forward progress guarantees, the implementation ensures that the thread will eventually make progress for as long as it has not terminated. [Note 5: This is required regardless of whether or not other threads of execution (if any) have been or are making progress. To eventually fulfill this requirement means that this will happen in an unspecified but finite amount of time. — end note]

and p8:

It is implementation-defined whether the implementation-created thread of execution that executes main ([basic.start.main]) and the threads of execution created by std​::​thread ([thread.thread.class]) or std​::​jthread ([thread.jthread.class]) provide concurrent forward progress guarantees. General-purpose implementations should provide these guarantees.

inmaldrerah commented 1 year ago

C2x 6.8.5p5:

An iteration statement may be assumed by the implementation to terminate if its controlling expression is not a constant expression198), and none of the following operations are performed in its body, controlling expression or (in the case of a for statement) its expression-3199):

But true is a constant expression, so not by this rule at least. Also, this issue focuses on C++, not C.

For the C++ part, I think @FrankHB has already mentioned it and discussed why it does not mean an undefined can be generated in the "Discussions" and "Related" part.

AaronBallman commented 1 year ago

But true is a constant expression, so not by this rule at least.

Agreed, but note the C footnote about the expectation of being able to remove empty loops even if you can't prove they terminate. This is intended to point out that the as-if rule applies (which is really what I should have quoted anyway, sorry for the confusion):

C2x 5.1.2.3p4:

In the abstract machine, all expressions are evaluated as specified by the semantics. An actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no needed side effects are produced (including any caused by calling a function or through volatile access to an object).

An empty loop has no needed side effects, so it's fine to remove under as-if.

Also, this issue focuses on C++, not C.

C++ also has the as-if rule, C++2b [intro.abstract]p1:

The semantic descriptions in this document define a parameterized nondeterministic abstract machine. This document places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below. 7) 7) This provision is sometimes called the “as-if” rule, because an implementation is free to disregard any requirement of this document as long as the result is as if the requirement had been obeyed, as far as can be determined from the observable behavior of the program. For instance, an actual implementation need not evaluate part of an expression if it can deduce that its value is not used and that no side effects affecting the observable behavior of the program are produced.

We are definitely allowed to make the transformation, but my feeling is "just because we can does not mean we should." I had mentioned on one of the other duplicated reports: IMO, we should never optimize code such that it results in execution flowing off the end of a function unless someone has concrete performance numbers justifying the shocking behaviors that come out of the optimization. It may be a legal transformation, but it also breaks expectations that are not entirely unreasonable.

kerzol commented 1 year ago

Any optimization should not transform a code that works to something that segfaults, it is crazy !

erichkeane commented 1 year ago

Any optimization should not transform a code that works to something that segfaults, it is crazy !

That is about 99% of what the optimizer does, is transition undefined behavior to a segfault. Consider:

 void f(int i, int* j) {
       j->foo = 100/i;     
   }

OPT/codegen assumes that 'i' is not zero (as dividing with it is UB), and that 'j' is non-null (as dereferencing it is also null). the code we 'transform' this to segfaults, because the person did something that is UB.

I take [intro.progress]p1 to say that infinite loops are UB. We can assume that any thread is going to terminate. Thus, if we find some code that would cause us to NOT terminate (such as a while(1) loop), it is clearly not unreachable.

inmaldrerah commented 1 year ago

I take [intro.progress]p1 to say that infinite loops are UB.

I don't think so. [intro.progress]p1 says that you may (are permitted to) assume that every loop will eventually terminate, but it does not comment on whether the code shall be so, i.e. you may assume so, but the code that does not do so is not (by this rule) undefined behavior. Meanwhile, the standard doesn't say what the implementation should do if the assumption does not hold for the code, so it is an unspecified behavior.

inmaldrerah commented 1 year ago

We are definitely allowed to make the transformation

Yes. As FrankHB stated, it is valid to remove the loop (as if the code consists of int main() {}. However, as long as having an infinite loop is not an undefined behavior, other restrictions still applies and we cannot put the unreachable there. Putting an unreachable means that we believe the loop is not going to terminate, while removing the loop means that we assume the loop must eventually terminate. We may choose either one of the two, but they shouldn't be allowed to happen together, otherwise it would mean that we believe the loop is both going to terminate eventually and, at the same time, not going to terminate at all.

AaronBallman commented 1 year ago

Yes. As FrankHB stated, it is valid to remove the loop (as if the code consists of int main() {}. However, as long as having an infinite loop is not an undefined behavior, other restrictions still applies and we cannot put the unreachable there.

I've always believed that violating "may be assumed" is UB by omission. We can assume all loops terminate. If a loop doesn't terminate, it's UB because the standard doesn't say what should happen instead -- it tells us explicitly we can assume all loops terminate (or perform side effects/etc).

We may choose either one of the two, but they shouldn't be allowed to happen together, otherwise it would mean that we believe the loop is both going to terminate eventually and, at the same time, not going to terminate at all.

That seems "reasonable" when you view UB as removing all behavioral guarantees from the program; it can lead to nonsense like that.

(Again, this is not me arguing that we should make this optimization. I'm only stating that I'm not yet convinced there's a conformance issue.)

FrankHB commented 1 year ago

Yes. As FrankHB stated, it is valid to remove the loop (as if the code consists of int main() {}. However, as long as having an infinite loop is not an undefined behavior, other restrictions still applies and we cannot put the unreachable there.

I've always believed that violating "may be assumed" is UB by omission. We can assume all loops terminate. If a loop doesn't terminate, it's UB because the standard doesn't say what should happen instead -- it tells us explicitly we can assume all loops terminate (or perform side effects/etc).

Technically, no, at least in ISO C++. The definition of UB also does not help.

We may choose either one of the two, but they shouldn't be allowed to happen together, otherwise it would mean that we believe the loop is both going to terminate eventually and, at the same time, not going to terminate at all.

That seems "reasonable" when you view UB as removing all behavioral guarantees from the program; it can lead to nonsense like that.

(Again, this is not me arguing that we should make this optimization. I'm only stating that I'm not yet convinced there's a conformance issue.)

It is also questionable, though it is another problem.

lhmouse commented 1 year ago

The normative wording says

The implementation may assume that any thread will eventually do one of the following:

...

This program violates such a permitted assumption. I don't know which requirements the standard still imposes.

@frederick-vs-ja That's an ignorant interpretation. The program does not violate anything, because the thread executing an infinite loop may access violative objects in signal handlers.

frederick-vs-ja commented 1 year ago

The normative wording says

The implementation may assume that any thread will eventually do one of the following:

...

This program violates such a permitted assumption. I don't know which requirements the standard still imposes.

@frederick-vs-ja That's an ignorant interpretation. The program does not violate anything, because the thread executing an infinite loop may access violative objects in signal handlers.

Signal handling is not mentioned in [intro.progress], so I don't think your interpretation is valid here.

lhmouse commented 1 year ago

The normative wording says

The implementation may assume that any thread will eventually do one of the following:

...

This program violates such a permitted assumption. I don't know which requirements the standard still imposes.

@frederick-vs-ja That's an ignorant interpretation. The program does not violate anything, because the thread executing an infinite loop may access violative objects in signal handlers.

Signal handling is not mentioned in [intro.progress], so I don't think your interpretation is valid here.

Signed integers are not mentioned in [intro.progress] either. Does it mean that the signness of integers is not valid?

frederick-vs-ja commented 1 year ago

The normative wording says

The implementation may assume that any thread will eventually do one of the following:

...

This program violates such a permitted assumption. I don't know which requirements the standard still imposes.

@frederick-vs-ja That's an ignorant interpretation. The program does not violate anything, because the thread executing an infinite loop may access violative objects in signal handlers.

Signal handling is not mentioned in [intro.progress], so I don't think your interpretation is valid here.

Signed integers are not mentioned in [intro.progress] either. Does it mean that the signness of integers is not valid?

No. But that is already mentioned elsewhere. [intro.progress] doesn't say signal handling is something that shoudn't be optimized out.

And which signal handler can be involved in this example?

Reverier-Xu commented 1 year ago

In @kerzol's case it will be translated to this (I changed iostream to cstdio which have less noises in asm result):

;int main (int argc, char **argv, char **envp);
push    rax
lea     rdi, str.Why_segfault      ; 0x2004 ; "Why segfault?" ; const char *format
xor     eax, eax
call    sym.imp.printf             ; int printf(const char *format)
nop
;sym.reachable ();
lea     rdi, str.Hello_World       ; 0x2012 ; "Hello World" ; reachable()
xor     eax, eax
jmp     sym.imp.printf
add     byte [rax], al
  ;-- section..fini:
;sym._fini ();
...

When I removed the infinite loop, the result was this:

;int main (int argc, char **argv, char **envp);
push    rax
lea     rdi, str.Why_segfault      ; 0x2004 ; "Why segfault?" ; const char *format
xor     eax, eax
call    sym.imp.printf             ; int printf(const char *format)
lea     rdi, str.Never_ici.        ; 0x2012 ; "Never ici." ; const char *format
xor     eax, eax
call    sym.imp.printf             ; int printf(const char *format)
xor     eax, eax
pop     rcx
ret
nop     word cs:[rax + rax]
;sym.reachable ();
lea     rdi, str.Hello_World       ; 0x201d ; "Hello World" ; reachable()
xor     eax, eax
jmp     sym.imp.printf
add     byte [rax], al
  ;-- section..fini:
;sym._fini ();
...

Apparently, the compiler not only removes the infinite loop with all the code behind the loop, it also removes the stack pointer reduction operation at the end of the function, which caused the segfault. Is this reasonable? It is mentioned in the standard that the compiler can remove loops that have no observable behavior, but I think the compiler should not break the integrity of the function.

kerzol commented 1 year ago

Here is another code example with signal handlers

// Works and terminates: clang++ -O0 t.cpp -o t; ./t
// Segfaults: clang++ -O1 t.cpp -o t; ./t

#include <iostream>
#include <csignal>
#include <thread>

void catchme ( int signum ) {
   printf ("Signal received !\n");
   exit(3);
}

void stp ()
{
  sleep(1);
  printf ("fint!\n");
  kill(getpid(), SIGUSR1);
}

int main () {
  signal(SIGUSR1, catchme);  
  std::thread t(stp);
  while(1) {}
  t.join();
  return 0;
}

A couple of questions to keep the discussion going:

  1. Can the optimizer just throw a compiler-time error when encountering an infinite loop instead of run-time segfault ? Optimizer knows what it removes, isn't it ?

  2. C standard in the section "6.8.5 Iteration statements" says:

    An iteration statement may be assumed by the implementation to terminate if its controlling expression is not a constant expression and none of the following operations are performed in its body, controlling expression or (in the case of a for statement) its expression-3:

    • input/output operations
    • accessing a volatile object
    • synchronization or atomic operations.

    158) An omitted controlling expression is replaced by a nonzero constant, which is a constant expression. 159) This is intended to allow compiler transformations such as removal of empty loops even when termination cannot be proven.

    While C++ [intro.progress] says:

    The implementation may assume that any thread will eventually do one of the following: (1.1) terminate, (1.2) make a call to a library I/O function, (1.3) perform an access through a volatile glvalue, or (1.4) perform a synchronization operation or an atomic operation. [Note 1: This is intended to allow compiler transformations such as removal of empty loops, even when termination cannot be proven. — end note]

    They sound for me not really compatible. Is it ok for you ? Why there is no mention of "constant expression" exception in C++ version ?

  3. Should we forbid infinite loops in c++ in order to offer forward progress guarantee by default ?

  4. Should any program with infinite loop be compiled directly into a single instruction "throw segfault" ?

  5. [intro.progress] says

    The implementation may assume that any thread will eventually do one of the following: (1.1) terminate, ....

    Can we consider that all real-world programs eventually terminate ?

Btw, an interesting blog post about infinite loops in C/C++ https://stefansf.de/post/non-termination-considered-harmful/ I liked the "transfinite iteration" passage :)

lhmouse commented 1 year ago

Here is another code example with signal handlers


// Works and terminates: clang++ -O0 t.cpp -o t; ./t
// Segfaults: clang++ -O1 t.cpp -o t; ./t

#include <iostream>
#include <csignal>
#include <thread>

void catchme ( int signum ) {
   printf ("Signal received !\n");
   exit(3);

This is not correct. Neither printf() nor exit() is async-signal-safe. You should call write() followed by _Exit() instead.

  1. [intro.progress] says

    The implementation may assume that any thread will eventually do one of the following: (1.1) terminate, ....

    Can we consider that all real-world programs eventually terminate ?

Do you think that a program, which you know will be terminated by a signal, either by the default handler or calling _Exit() inside a custom handler, is one that eventually terminates? Otherwise?

inclyc commented 1 year ago

In reply to @kerzol

Can the optimizer just throw a compiler-time error when encountering an infinite loop instead of run-time segfault ? Optimizer knows what it removes, isn't it

Try

-Rpass=loop-delete

https://gcc.godbolt.org/z/hjeET4bdM.

example.cpp:5:5: remark: Loop deleted because it is invariant [-Rpass=loop-delete]
    while(true)
    ^
Compiler returned: 0
kerzol commented 1 year ago

You should call write() followed by _Exit() instead.

well, even without any prints and exit replaced by _Exit the program segfaults when compiled with clang++ -O1

Do you think that a program, which you know will be terminated by a signal, either by the default handler or calling _Exit() inside a custom handler, is one that eventually terminates?

Why we cannot suppose that all programs will be eventually terminated by a signal or some external event? :)

Seriously, I do not think that compiler should silently force (concurrent) forward progress guarantees, especially in the cases where it considerably changes the original program conceived by the programmer. And if someone wrote an infinite loop, he does this with a clear purpose to make an infinite loop. If compiler changes this, removing the loop or producing nonsense code, doomed to segfault... I would say that it will be better to a produce a kind of compilation error not a code that segfaults.

If I'm not mistaken the exception "... is not a constant expression ..." has been added deliberately to C standard to allow such things as while(1), because they often occur in practice. Should we do this for C++ also?

kerzol commented 1 year ago

example.cpp:5:5: remark: Loop deleted because it is invariant [-Rpass=loop-delete]

It should be changed to "Not only loop deleted but other asm code of the function including ret itself." :)

lhmouse commented 1 year ago

Do you think that a program, which you know will be terminated by a signal, either by the default handler or calling _Exit() inside a custom handler, is one that eventually terminates?

Why we cannot suppose that all programs will be eventually terminated by a signal or some external event? :)

Because there ARE programs that never terminate, for example, an operating system kernel, and bare metal firmware. These are known as 'freestanding implementations', which usually have no concept about signals; but they do have to respond to interrupts, which are very close to external signals.

FrankHB commented 1 year ago
  1. C standard in the section "6.8.5 Iteration statements" says:

    An iteration statement may be assumed by the implementation to terminate if its controlling expression is not a constant expression and none of the following operations are performed in its body, controlling expression or (in the case of a for statement) its expression-3:

    • input/output operations
    • accessing a volatile object
    • synchronization or atomic operations.

    158) An omitted controlling expression is replaced by a nonzero constant, which is a constant expression. 159) This is intended to allow compiler transformations such as removal of empty loops even when termination cannot be proven.

    While C++ [intro.progress] says:

    The implementation may assume that any thread will eventually do one of the following: (1.1) terminate, (1.2) make a call to a library I/O function, (1.3) perform an access through a volatile glvalue, or (1.4) perform a synchronization operation or an atomic operation. [Note 1: This is intended to allow compiler transformations such as removal of empty loops, even when termination cannot be proven. — end note]

    They sound for me not really compatible. Is it ok for you ?

ISO C's original wording is based from a previous version of C++0x's, which is never up-to-date since 2009.

The adoption of UK 77 makes it start to diverge.

The adoption of US 38 in WG21 N3225 makes it look totally differently.

Notice nothing of them treat the infinite loop UB.

Why there is no mention of "constant expression" exception in C++ version ?

This is obviously a compromise. See WG14 N1541 6.4.

It does not solve the root problem of unclearity.

  1. Should we forbid infinite loops in c++ in order to offer forward progress guarantee by default ?

Infinite loops cannot be effectively forbidden, since a program can perform something oberservable. Otherwise, it is just Turing-incomplete.

Interestingly, non-termination itself is a kind of side effects. I see no reason to specifically reject its existence in an impure language.

Or introducing and eliminating UB back and forth? Please, no. There are enough compatibility headaches by shortsighted changes, which should already been enough, even without UB.

  1. Should any program with infinite loop be compiled directly into a single instruction "throw segfault" ?

No.

  1. [intro.progress] says

    The implementation may assume that any thread will eventually do one of the following: (1.1) terminate, ....

    Can we consider that all real-world programs eventually terminate ?

It depends. How do you define real-world programs? Does a program terminate after the machine running it hibernated?

jnk0le commented 1 year ago

one more #56819

The biggest issue here is that this UB compiles in silently, even though it is already known to break code.

ie. this: https://godbolt.org/z/xbq818rq4 should fail to happily compile

adah1972 commented 1 year ago

The words in the standard are:

This is intended to allow compiler transformations such as removal of empty loops, even when termination cannot be proven.

So I believe it is valid transformation to remove empty loops (according to the current standard), though it is another thing whether we should do it. After removing a loop, the main function should continue to execute and return. It looks insane to me to fall through to the next function.

inmaldrerah said it well: you can assume it will terminate and get rid of the useless loop, or you can assume it won't terminate and mark it unreachable, but you cannot do both.

jnk0le commented 1 year ago

arises in clang and can lead developers to trigger a security vulnerability. Hence, we are reopening the issue.

@gbdngb12 is there a CVE related to this already? Or just the hypothetical scenarios?

It looks insane to me to fall through to the next function.

inmaldrerah said it well: you can assume it will terminate and get rid of the useless loop, or you can assume it won't terminate and mark it unreachable, but you cannot do both.

This is how c/c++ UBs work. Not "do the (architecturally) most obvious thing" but "do the biggest BS possible". Same case as the returning functions with missing explicit return statement. Those at least have a warning.

dwblaikie commented 1 year ago

(please ease up on the language ("BS", "insane", etc) here)

@AaronBallman @erichkeane is this bug going anywhere constructive? Or can/should we close it with your/our current understanding, and revisit the issue when there's changes to the spec?

erichkeane commented 1 year ago

Aaron should answer too, but IMO, what we are doing is permitted per the standard (albeit perhaps user-hostile), thus the language standards organizations are the correct place to hash this out.

I think we have 1 of 2 ways out of this, but the standards organizations need papers to change the standard as well.

1- We wait until the standards organizations make a decision, and change our implementation. 2- We change our implementation NOW in anticipation/in sympathy with users.

I lean towards 1 here, the examples we've seen so far are sufficiently contrived to get very opinionated folks heated up about it, but I haven't seen sufficient real-world code to make me lean towards 2.

So IMO if we choose 1, we should close this until the committees make a change, else we can leave this open to solicit real-world examples being affected by this or close until we get sufficient evidence.

dwblaikie commented 1 year ago

Yeah - for the loop part, I'm happy to (1), for the "UB causes fallthrough" part (tracked on another bug) I'd be happy to do (2) and switch to trap-unreachable by default/as the only option perhaps. I had planned to do that years ago, but ended up stuck in my own quagmire of trying to optimize it too much rather than using the existing behavior (that Sony and Apple use by default already - so it's likely fine/robustly exercised, etc). Mostly should require changing the default, fixing tests, and maybe turning it back on for some targets that don't have trap/have weird ways of doing traps (GPUs were something I tripped over when trying to do that work). I mention all that only because it might mitigate some of the frustration users experience when they read that (1) is the answer to this issue - it's a lot less drastic if (2) is the answer to the function-fallthrough situation. (it still leaves the "we deleted all the code after the inf loop, and the unconditional code before it too... " which is pretty rough - but I'm OK with that)

AaronBallman commented 1 year ago

Aaron should answer too, but IMO, what we are doing is permitted per the standard (albeit perhaps user-hostile), thus the language standards organizations are the correct place to hash this out.

I agree that I think this is permitted per spec, and I agree that it would be good to involve standards committees in fixing that. However, I think it's perfectly reasonable for us to improve this under QoI -- just because we can optimize some way doesn't mean we have to, and the union of these two optimization decisions has incredibly surprising behavior.

I think these examples are reduced to make it easier to reason about the issue, which gives the appearance of being contrived. I think it's still an issue that occurs naturally in the wild, just with really hard-to-debug symptoms. (I have nothing to back this thinking up beyond "I've seen some stuff in my years".)

So I'd prefer we change the behavior of "UB causes fallthrough" to instead be a trap where possible (which is hopefully everywhere, without significant loss of performance).

jnk0le commented 1 year ago

I think it's still an issue that occurs naturally in the wild, just with really hard-to-debug symptoms. (I have nothing to back this thinking up beyond "I've seen some stuff in my years".)

(1) generating a sound warning (like a returning function with missing return statement) would be enough IMO. Especially that the first thing during debug is to check compiler warnings and production codebases are protected with -Werror on CIs.

lhmouse commented 1 year ago

I wish I wouldn't have had to repeat myself:

A signal handler may modify a volatile object or terminate the program by calling std::_Exit(). As a consequence, the implementation must ensure that, a handler, which has been installed in a compliant way, shall be invoked eventually upon delivery of a corresponding signal.

Given an arbitrary loop, a signal may arrive:

  1. before the loop;
  2. within the loop;
  3. after the loop.

If the loop is proved to not have any observable behavior, then it is unsequenced with a signal handler. The compiler is allowed to delete the loop, provided the signal handler will be invoked eventually, so a signal may arrive

  1. before the deleted loop;
  2. not within the deleted loop as it no longer exists;
  3. after the deleted loop.

However, if the loop is obviously infinite, then all code after the loop will be unreachable, so a signal may arrive:

  1. before the infinite loop;
  2. within the infinite loop;
  3. but not after the infinite loop.

In this case it is invalid to delete the loop, as instead of proving that a signal handler will be invoked, the compiler transforms the program to undefined behavior. So it's a bug for you. Please fix it.