llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29k stars 11.95k forks source link

Wrong optimizations for pointers: `if (q == p) use p` -> `if (q == p) use q` #43658

Open llvmbot opened 4 years ago

llvmbot commented 4 years ago
Bugzilla Link 44313
Version trunk
OS Linux
Reporter LLVM Bugzilla Contributor
CC @dwblaikie,@efriedma-quic,@fhahn,@dobbelaj-snps,@nunoplopes,@pogo59

Extended Description

It seems clang's optimizer changes if (q == p) use p to if (q == p) use q. This is quite natural but unfortunately is wrong when p and q are pointers with different provenance (similar to -0. and +0. in floating point numbers).

Most examples of equal pointers differing in provenance involve a past-the-end pointer which usually leads discussions into wrong directions. For a change, below are two somewhat different examples.

Example with restrict:

#include <stdio.h>

static void g(int *p, int *q)
{
    if (q == p)
        *p = 2;
}

static int f(int *restrict p, int *restrict q)
{
    *p = 1;
    g(p, q);
    return *p;
}

int main()
{
    int x;
    printf("%d\n", f(&x, &x));
}
$ clang -std=c11 -Weverything test.c && ./a.out
2
$ clang -std=c11 -Weverything -O3 test.c && ./a.out
1

Example with dead allocated memory:

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

__attribute__((__optnone__)) // imagine it in a separate TU
static void expose(int *p)
{
    (void)p;
}

int main()
{
    char w[sizeof(int *)];

    int *q = malloc(sizeof(int));
    expose(q);
    memcpy(&w, &q, sizeof(int *));
    free(q);

    int *p = malloc(sizeof(int));
    expose(p);
    *p = 1;

    int been_there = 0;
    if (memcmp(&w, &p, sizeof(int *)) == 0) {
        been_there = 1;
        *p = 2;
    }
    printf("%d %d\n", been_there, *p);
}
$ clang -std=c11 -Weverything test.c && ./a.out
1 2
$ clang -std=c11 -Weverything -O3 test.c && ./a.out
1 1
clang x86-64 version: clang version 10.0.0 (https://github.com/llvm/llvm-project.git 755a66ebdeda38669f5498565cbc6af331b47bad)

Please note that these examples are designed not to involve any controversial parts of the C standard, e.g., all accesses are via p, there are no accesses via the "wrong" pointer q.

llvmbot commented 4 years ago

AFAICT the wrong optimization in the provided examples is applied in Global Value Numbering.

There is a variation of this bug: the optimizer sometimes changes if (p == q) use p to if (p == q) use q if it can track the provenance of q but not of p. This affectes gcc too and there are 3 testcases for it in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93051. With clang, all of them go wrong in Global Value Numbering too, so I guess it's enough to paste only one example here.

Example with a past-the-end pointer:

#include <stdio.h>

__attribute__((noipa,optnone)) // imagine it in a separate TU
static void *opaque(void *p) { return p; }

int main()
{
    int x[5];
    int y[1];

    int *p = x;
    int *q = y + 1;

    int *r = opaque(p); // hide provenance of p
    if (r == q) {
        *p = 1;
        *r = 2;
        printf("result: %d\n", *p);
    }

    opaque(q);
}
$ clang -std=c11 -Weverything -Wno-unknown-attributes test.c && ./a.out
result: 2
$ clang -std=c11 -Weverything -Wno-unknown-attributes -O3 test.c && ./a.out
result: 1
clang x86-64 version: clang version 10.0.0 (https://github.com/llvm/llvm-project.git fccac1ec16951e9a9811abf19e2c18be147854fc)

Based on the example from Harald van Dijk in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502#c4.

llvmbot commented 4 years ago

I've seen the bug 34548 but it's not entirely clear what it is about. Initial testcase is quite complex and involves, at least, the following issues:

So my idea was to file separate bugs about issues like the optimization p == q ? p : q -> q with various testcases:

OTOH I don't remember ever seeing the specific variant of application of conditional equivalence that is reported here (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502#c4 and https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502#c18 are close but not the same). So not sure how useful to link it with bug 34548. But if you prefer, e.g., to see everything related to conditional equivalence in the bug 34548 please let me know.

efriedma-quic commented 4 years ago

This is essentially bug 34548.

llvmbot commented 4 years ago

Probably it's better to add an example with past-the-end pointer too, for completeness.

#include <stdint.h>
#include <string.h>
#include <stdio.h>

__attribute__((__optnone__)) // imagine it in a separate TU
static void expose(int *p)
{
    (void)p;
}

static int been_there = 0;

static void f(int *p, int *q)
{
    // all variants work the same
    //if (memcmp(&q, &p, sizeof(int *)) == 0) {
    //if ((uintptr_t)(void *)q == (uintptr_t)(void *)p) {
    if (q == p) {
        been_there = 1;
        *p = 2;
    }
}

int main()
{
    int x[3];
    int y[1];

    int *p = x;
    int *q = y + 1;
    expose(p);
    expose(q);

    *p = 1;
    f(p, q);
    printf("%d %d\n", been_there, *p);
}
$ clang -std=c11 -Weverything test.c && ./a.out
1 2
$ clang -std=c11 -Weverything -O3 test.c && ./a.out
1 1
clang x86-64 version: clang version 10.0.0 (https://github.com/llvm/llvm-project.git c7492fbd4e85632a05428bd0281fcfd06f1fff6c)
llvmbot commented 4 years ago

IIUC in the restrict example, the call f(&x, &x); violates the terms of restrict so the program has undefined behavior.

Perhaps something like that was intended at some point but I don't see it in the text of the standard. Whether two restricted pointers could be equal or not depends on actual accesses. Even better, Example 3 in the formal definition of restrict (C11, 6.7.3.1p10) specifically illustrates the case of two equal restricted pointers.

pogo59 commented 4 years ago

IIUC in the restrict example, the call f(&x, &x); violates the terms of restrict so the program has undefined behavior.

llvmbot commented 4 years ago

Not sure about C, but at least in C++ I /think/ comparisons with dead pointers like this are invalid.

Even with memcmp? Could you please elaborate?

dwblaikie commented 4 years ago

Not sure about C, but at least in C++ I /think/ comparisons with dead pointers like this are invalid.