Generated C code does not obey forward progress assumption

konsoletyper / teavm

Compiles Java bytecode to JavaScript, WebAssembly and C

https://teavm.org

Apache License 2.0

2.55k stars 260 forks source link

Generated C code does not obey forward progress assumption #885

Closed Maxdamantus closed 4 months ago

Maxdamantus commented 4 months ago

Example program:

public class CTest {
    public static void main(String[] argv) {
        System.out.println("search(3) = " + search(3));
        System.out.println("search(42) = " + search(42));
    }

    public static int search(int n) {
        int a = 0;
        while (a != n)
            a ^= 3;
        return a;
    }
}

TeaVM generates something like the following code corresponding with the while loop:

    while ((teavm_local_2 != teavm_local_1)) {
        teavm_local_2 = (teavm_local_2 ^ INT32_C(3));
    }

Invoking this loop using search(42) produces undefined behaviour according to C11 and later, which assumes that the loop terminates: https://www.iso-9899.info/n1570.html#6.8.5p6

According to Java semantics, the program should print search(3) = 3 then go into an infinite loop, but compiling using clang-14 -O2 prints both search(3) = 3 and search(42) = 42.

If the statement a ^= 3; is changed to a += 0; and the search(3) line is removed, the program instead segfaults since the compiler happens to omit the illegal infinite loop and the remainder of the main function, so control continues outside of the function.

Note that this used to be an issue for various compilers involving LLVM, including rustc: https://github.com/rust-lang/rust/issues/28728

In C this can be solved by generating loops such as for (;;) { if (!CONDITION) break; BODY } instead of while (CONDITION) { BODY }. Otherwise proof of termination or "side effects" would be required for loops with non-constant conditions.

Tested using clang 14.0.6 (x86_64-pc-linux-gnu) and TeaVM 0.9.2.

konsoletyper commented 4 months ago

That's strange spec and strange clang behaviour. I mean, I would expect such spec to allow compiler to remove infinite loop if it does not have side effect AND does not calculate values which are dependencies of further calculations. Missing the second part makes simply no sense. I mean, if it was

       public static int search(int n) {
        int a = 0;
                int b = 0;
        while (b != n)
            b ^= 3;
        return a;
    }

I could agree, that there's different behaviour in Java and C, but C behaviour at least expected and makes sense. But just throwing away a loop that has phi after it, which in turn used to produce result of a function is a nonsense! For me it looks like a flaw in the spec that clang authors use to create excessively aggressive optimizer.

BTW, issue is not reproducible with gcc.

konsoletyper commented 4 months ago

Tried with clang 17.0.6, and it produces segfault for xor operation as well. I'm not sure why, but I suspect it produces something like unreachable opcode in IR and then compiles it into native instruction that raises segfault. In this case I would say it's more expectable than simply printing wrong result.

Maxdamantus commented 4 months ago

For me it looks like a flaw in the spec that clang authors use to create excessively aggressive optimizer.

Indeed, personally I don't like that they added that to the spec, but as far as I can tell it kind of happened the other way around. clang and icc both had a tendency to miscompile infinite loops like that, so I think it was added to the spec in response (effectively specifying a bug imo).

Because of rustc (and maybe other compilers), the behaviour was eventually fixed in LLVM, so now the UB only occurs strictly according to the C specification (and eg, while (CONDITION) { BODY } and while (1) { if (!CONDITION) break; BODY } have different behaviours).

The clang developers have also made it so that when using -std=c99, the infinite loops behave as expected (this was not the case in previous versions, before the fixes in LLVM).

Maxdamantus commented 4 months ago

Tried with clang 17.0.6, and it produces segfault for xor operation as well. I'm not sure why, but I suspect it produces something like unreachable opcode in IR and then compiles it into native instruction that raises segfault. In this case I would say it's more expectable than simply printing wrong result.

I don't think it will try to emit an instruction that intentionally raises a segfault. I suspect the optimiser ultimately assumes that the infinite loop is unreachable (since executing such a loop would be UB anyway) and in turn considers the rest of the function (including an implicit return) to also be unreachable, so program execution will continue into some code that was not actually called, which could segfault for various reasons.

Here's a concise example of a program where an "uncalled" function seems to get executed as the result of UB:

#include <stdio.h>

int main(void) {
    int a = 0;
    while (a != 42)
        a ^= 3;
    return a;
}

int uncalled(int *p) {
    puts("oops");
    return *p;
}

$ clang-16 -O2 ub.c
$ ./a.out 
oops
Segmentation fault

If return 0; is used instead of return *p;, the program simply prints from the uncalled function and exits successfully.

tryone144 commented 4 months ago

I don't think it will try to emit an instruction that intentionally raises a segfault. I suspect the optimiser ultimately assumes that the infinite loop is unreachable (since executing such a loop would be UB anyway) and in turn considers the rest of the function (including an implicit return) to also be unreachable, so program execution will continue into some code that was not actually called, which could segfault for various reasons.

Inspecting the generated code for the example in godbolt you can see that the main function is considered unreachable. This emitted IR looks "fine" with clang versions before 15.0 (and gcc).

Maxdamantus commented 4 months ago

Inspecting the generated code for the example in godbolt you can see that the main function is considered unreachable.

Indeed, so the final output has no instructions generated (hence no intentional segfault instructions) for the main function. Since main is 0 bytes long, it ends up having the same address as uncalled, though confusingly godbolt only shows the uncalled label—the example would probably be clearer if it printed something before the invalid (assumedly unreachable) loop.

This can be seen easily using objdump -x -d a.out, since it shows the same address for main and uncalled in the symbol table, though like in godbolt it only shows one symbol as a label in the disassemby:

...
0000000000001140 g     F .text  0000000000000014              uncalled
0000000000004018 g       .bss   0000000000000000              __bss_start
0000000000001140 g     F .text  0000000000000000              main
...
0000000000001140 <uncalled>:
    1140:       53                      push   %rbx
...

Maxdamantus commented 4 months ago

FYI, after a bit of experimentation I've managed to come up with a program that breaks in gcc as well:

public class CTest {
    public static void main(String[] argv) {
        printInt(42);
        int counter = 0;
        for (;;) {
            if (counter >= 2)
                continue;
            printInt(counter++);
        }
    }

    public static void printInt(int i) {
        System.out.println(i);
    }
}

$ gcc-12 -O2 build/generated/teavm/c/all.c
$ ./a.out
42
0
1
Segmentation fault

Though in this case I think it's due to a bug in gcc, since the relevant loop generated here already uses while(1) { .. }, so the compiler is not allowed to assume termination (since that assumption can only be made for loops using "an iteration statement whose controlling expression is not a constant expression"). I'll probably try to report it there later, if it's not already a known bug.

Since gcc seems to be violating the specification here, I don't think there's a feasible workaround (afaik gcc in C mode at least has been trying to avoid making the forward progress assumptions that clang makes and that the C11 standard allows, so there's no alternative behaviour using eg, -std=c99).

~EDIT: actually, this probably isn't a gcc bug. This probably occurs due to the undefined behaviour of signed overflow in C. I guess this is something that could be addressed by TeaVM by ensuring that signed arithmetic operations are performed in such a way that is consistent with Java, where signed overflow is well-defined. It might be worth raising a separate issue for this, since it's somewhat distinct from the forward progress issue I've raised here.~

ANOTHER EDIT: actually, after thinking about this some more, the integer overflow never occurs semantically, so I think this is again a gcc bug.

YET ANOTHER EDIT: for future reference, in case anyone's interested: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114052