TrustInSoft / tis-interpreter

An interpreter for finding subtle bugs in programs written in standard C

No warnings: treatment of bit-fields differs between compilers #123

Open ch3root opened 8 years ago

ch3root commented 8 years ago

Compilers differ in how they treat bit-fields, both those narrower than int and those wider than int.

Source code:

#include <stdio.h>

int main()
{
  struct {
    int i : 3;
    unsigned long x : 33;
  } s = {0, 0};

  printf("%zu\n", sizeof ((void)0, s.i));
  printf("%lu\n", (unsigned long)(s.x - 1));
}

tis-interpreter (21f4c7a763b4601d723ea5749185c97115c9c98a) output:

[value] Analyzing a complete application starting at main
[value] Computing initial state
[value] Initial state computed

4

18446744073709551615

[value] done for function main

gcc (GCC) 7.0.0 20160707 (experimental):

$ gcc -std=c11 -pedantic -Wall -Wextra -O3 -fsanitize=undefined test.c && ./a.out
1
8589934591

clang version 3.9.0 (trunk 274757):

$ clang -std=c11 -Weverything -Wno-padded -O3 -fsanitize=undefined test.c && ./a.out
4
18446744073709551615

pascal-cuoq commented 8 years ago

GCC's behavior for the second printf looks like a bug to me(*). “bit-field of width 33” is not a type. The x86-64 platform does not have any integer type in which the result of 0 - 1 is 8589934591.

(*) or at least a seriously misguided extension: any type other than _Bool, int, signed int, or unsigned int when defining a bit-field means the program uses a compiler extension, I guess.
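
For reference, the two results reported above are what 0 - 1 gives modulo 2^33 and modulo 2^64. The sketch below reproduces that arithmetic with ordinary 64-bit integers. The function names are hypothetical, and the idea that GCC evaluates s.x - 1 in a 33-bit unsigned type is one reading consistent with the printed outputs, not something the compilers' documentation is quoted on here.

```c
#include <assert.h>
#include <stdint.h>

/* Keep only the low `width` bits of v, i.e. reduce v modulo 2^width.
   Hypothetical helper; valid for 1 <= width <= 64. */
static uint64_t wrap_to_width(uint64_t v, unsigned width)
{
    assert(width >= 1 && width <= 64);
    return width == 64 ? v : v & ((UINT64_C(1) << width) - 1);
}

/* Assumed model of the GCC result: the subtraction wraps at the
   bit-field width (33 bits in the test case). */
static uint64_t sub_in_bitfield_width(uint64_t x, uint64_t y, unsigned width)
{
    return wrap_to_width(x - y, width);
}

/* Assumed model of the Clang result: the subtraction wraps at the width
   of the declared type (unsigned long, 64 bits on x86-64). */
static uint64_t sub_in_declared_type(uint64_t x, uint64_t y)
{
    return x - y;
}
```

With these helpers, sub_in_bitfield_width(0, 1, 33) gives 8589934591 and sub_in_declared_type(0, 1) gives 18446744073709551615, matching the GCC and Clang lines respectively.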

ch3root commented 8 years ago

As it turned out, the question is not that easy and has a long history. Have you seen these links (from Twitter): http://open-std.org/jtc1/sc22/wg14/www/docs/dr_315.htm, http://open-std.org/jtc1/sc22/wg14/www/docs/n1260.htm?

pascal-cuoq commented 8 years ago

I had read DR 315, and not realized its significance. The second link, which makes the issue clearer, I had not seen before.

mirabilos commented 8 years ago

Even the same compiler differs: GCC on, say, i386 treats a bit-field with 15 bits as 32 bits wide, while GCC on m68k (exact same version) treats it as 16 bits wide, due to implicit alignment assumptions. No small amount of fun trying to port crap like WebKit or GLib.
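
A small probe along these lines can show what a given compiler and target do with a 15-bit bit-field. This is only a sketch: struct s15 and the two helper names are made up for illustration, and no expected values are given because they depend on the target ABI.

```c
#include <stddef.h>

/* Hypothetical struct for illustration: a single 15-bit bit-field. */
struct s15 {
    unsigned int f : 15;
};

/* Size in bytes of the struct as laid out by the current compiler/target. */
static size_t s15_size(void)
{
    return sizeof(struct s15);
}

/* Alignment requirement of the struct on the current target (C11 _Alignof). */
static size_t s15_align(void)
{
    return _Alignof(struct s15);
}
```

Printing s15_size() and s15_align() with %zu on each target of interest makes the layout difference visible without relying on any particular expected number.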

pascal-cuoq commented 8 years ago

tis-interpreter only tries to emulate one set of implementation-defined choices at a time. GCC's i386 and m68k targets clearly are different compilers and it is not a goal to emulate both at the same time. However, it seems possible to emulate both GCC and Clang targeting the same architecture, even if that means emitting some rare warnings for programs that venture into the areas where they do not coincide. Bit-fields defined over types other than int, signed int, unsigned int, or _Bool are one place where tis-interpreter can simply emit a warning and continue with Clang's choices. Enum members that do not fit the int type are another.
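
The two warning conditions named above could be sketched as simple predicates. This is not tis-interpreter's code: the enum tags and function names are hypothetical, and the checks only encode the rule as stated in this thread (bit-field declared with a type other than _Bool, int, signed int, or unsigned int; enumeration constant outside the range of int).

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical tags for the declared type of a bit-field. */
enum decl_type {
    DT_BOOL,
    DT_INT,
    DT_SIGNED_INT,
    DT_UNSIGNED_INT,
    DT_OTHER          /* e.g. unsigned long, as in the test case */
};

/* True when the declared type is one of the four types listed above;
   anything else would trigger the warning. */
static bool bitfield_type_is_standard(enum decl_type t)
{
    return t == DT_BOOL || t == DT_INT ||
           t == DT_SIGNED_INT || t == DT_UNSIGNED_INT;
}

/* True when an enumeration constant's value is representable as int;
   a value outside this range would trigger the warning. */
static bool enum_constant_fits_int(long long v)
{
    return v >= INT_MIN && v <= INT_MAX;
}
```

Under this sketch, the field x in the test case (declared unsigned long) maps to DT_OTHER and would be flagged, while the field i (declared int) would not.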