davidgiven / ack

The Amsterdam Compiler Kit
http://tack.sf.net

in C, "char *g = (char *)f + 11;" loses the 11 #38

Open kernigh opened 7 years ago

kernigh commented 7 years ago

The following program prints "pass" with gcc but "fail" with ack:

#include <stdio.h>
void f(void) {}
char *g = (char *)f + 11;

int main(void) {
    char *h = g - 11;
    if (h == (char *)f) {
        puts("pass");
        return 0;
    } else {
        puts("fail");
        return 1;
    }
}

In OpenBSD/amd64: ack -mlinuxppc -o cka ckpointer.c

In Debian Linux/powerpc:

$ gcc -o ckg ckpointer.c
$ ./cka
fail
$ ./ckg
pass

The bug is in char *g = (char *)f + 11;, where g is a global variable and f is a function. gcc handles this correctly, but ack silently drops the 11 and effectively compiles char *g = (char *)f;. After running ack -c.e ckpointer.c, I can see g defined as

g
 con $f

To get this bug, f must be a function, and I must cast it to another pointer type like char *. If I don't cast f, or if I cast f to an integer type, then ack gives an error for an illegal initializer.

davidgiven commented 7 years ago

I think that's not technically a bug --- casting a function pointer to an object pointer in C invokes undefined behaviour.

The EM spec has different encodings for reference-to-data-plus-offset and reference-to-procedure; there are different opcodes for loading them onto the EM stack (lae for data, lpi for procedures). Presumably this is to support architectures where pointer-to-data and pointer-to-function are different sizes.

Unfortunately the reference-to-procedure encoding doesn't have an offset field, so (char*)f+5 simply can't be expressed as a single EM value. It's still possible to calculate this in code, by loading the procedure reference and doing maths on it, but that can't be done in an initialiser.
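A minimal sketch of that run-time approach, assuming a target where casting a function pointer to char * behaves as it does with gcc (this is not portable ISO C). The helper names init_g and g_round_trips are made up for illustration; only f and g come from the original report.

```c
void f(void) {}

/* Left without an initialiser at file scope, so no static initialiser
 * has to combine a procedure reference with an offset. */
char *g;

/* Compute (char *)f + 11 in code instead of in an initialiser.
 * Assumption: the function-to-object pointer cast works on this target. */
void init_g(void)
{
    g = (char *)f + 11;
}

/* Returns 1 if g - 11 points back at f, mirroring the check in the
 * original test program, and 0 otherwise. */
int g_round_trips(void)
{
    return (g - 11) == (char *)f;
}
```

Calling init_g() at the start of main and then testing g_round_trips() reproduces the pass/fail check from the report without relying on the global initialiser.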

I don't think this should be silently dropping the offset, however --- it's surprising behaviour. I'd expect an invalid initialiser error because from EM's point of view (char*)f+5 isn't constant.

kernigh commented 5 years ago

The code in ival.g, check_ival(), at line 542 looks suspicious, because it ignores expr->VL_VALUE when the identifier is a function:

            else    /* e.g., int f(); int p = f; */
            if (idf->id_def->df_type->tp_fund == FUNCTION)
                C_con_pnam(idf->id_text);
            else    /* e.g., int a; int *p = &a; */
                C_con_dnam(idf->id_text, expr->VL_VALUE);

Not sure whether a new check for expr->VL_VALUE != 0 would go here or somewhere else.
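To make the proposed check concrete, here is a toy model of the decision, not ACK's actual code. The types addr_init and init_action and the function classify_init are hypothetical stand-ins; the comments note which real call from the snippet above each outcome would correspond to. It only illustrates the rule "function plus nonzero offset is rejected rather than silently truncated".

```c
/* Hypothetical stand-ins; these are not ACK's real types. */
enum init_action {
    EMIT_PROC_REF,  /* would correspond to C_con_pnam(name) */
    EMIT_DATA_REF,  /* would correspond to C_con_dnam(name, offset) */
    REJECT_INIT     /* would report an illegal initializer */
};

struct addr_init {
    const char *name;  /* identifier being referenced */
    int is_function;   /* nonzero if the identifier names a function */
    long offset;       /* constant offset folded into the expression */
};

/* Decide how an address-valued static initialiser would be handled. */
enum init_action classify_init(const struct addr_init *e)
{
    if (e->is_function) {
        /* A procedure reference has no offset field, so a nonzero
         * offset cannot be emitted and should not be dropped. */
        return e->offset != 0 ? REJECT_INIT : EMIT_PROC_REF;
    }
    /* Data references carry their offset along. */
    return EMIT_DATA_REF;
}
```

Under this model, char *g = (char *)f + 11; would be classified as REJECT_INIT, while int p = f; and int *p = &a; keep their current behaviour.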