dimitriv / Ziria

A domain-specific-language and compiler for low-level bitstream processing.
92 stars 18 forks source link

LUT generation bug #123

Closed ghost closed 8 years ago

ghost commented 8 years ago

Running make testRX6Mbps.perf in the performance tests with --autolut yields the following C code:

void clut_gen582()

{
    for (unsigned long _lidx = 0; _lidx < 256; _lidx++) {
        uint8 idx581 = (uint8) _lidx;
        calign unsigned char scrmbl_st583[1] = {0};
        calign Bit x584 = 0;
        uint8 orig_idx585;
        uint8 dbg_idx586 = 0;
...

        orig_idx585 = idx581;
        bitArrWrite((BitArrPtr) &orig_idx585, 0, 7, (BitArrPtr) scrmbl_st583);
...
        dbg_idx586 |= (uint8) (*(uint8*) &((BitArrPtr) scrmbl_st583)[0] & 127);
        dbg_idx586 |= ((uint8) x584 & 1) << 7;
        if (idx581 != dbg_idx586) {
            printf("Fatal bug in LUT generation: packIdx/unpackIdx mismatch.\n");
            printf("Location: %s\n",
                   "blocks/../../transmitter/scramble.blk:33:5-14");
            exit(-1);
        }
...

This test does pass, but there's undefined behaviour here. Picking out the relevant details:

{
calign unsigned char scrmbl_st583[1] = {0};
uint16 orig_idx585 = 0; // _lidx
bitArrWrite((BitArrPtr) &orig_idx585, 0, 7, (BitArrPtr) scrmbl_st583);
}

bitArrWrite is trying to access memory from 0-7 while scrmbl_st583 can only be accessed 0-1. If the compiler chooses to optimize memory blocks 2-7, the behaviour would be undefined, most likely causing packIdx/unpackIdx mismatch.

@mainland This is the clang bug.

dimitriv commented 8 years ago

I am not sure I understand. bitArrWrite is bit-indexed and we are representing bit-arrays in packed form, that is scramble state is really a bit array of 7 bits, represented as a single byte (with the high-bit unused)

ghost commented 8 years ago

I think I misunderstood. I haven't looked exactly at the bit-shifts yet, which I plan to do soon. I have a simpler example which demonstrates this. Attaching required files to reproduce this bug. If this bug does occur, then packIdx/unpackIdx mismatch happens (only under clang with optimizations on).

I can force the correct behaviour by forcing clang to trace the bits for printing, twice. At an initial glance, I suspect pointer aliasing to be causing some undefined behaviour. But it might just be a clang bug.

Source code:

#include <cstdio>
#include <cstdlib>
#include <stdint.h>

typedef unsigned char Bit ;
typedef unsigned char * BitArrPtr ;

/* Effectively implements: tgt[vstart...(vstart+vlen-1)] := src 
 * Precondition: tgt has allocated (vlen+7)/8 bytes already 
 **************************************************************************************/
void bitArrWrite(BitArrPtr src, unsigned int vstart, unsigned int vlen, BitArrPtr tgt)
{
// copy from bit.c
}

void print_bits(uint16_t x)
{
      for(int i=8*sizeof(x)-1; i>=0; i--) {
          (x & (1 << i)) ? putchar('1') : putchar('0');
      }
      printf("\n");
}

void print(unsigned char * state) {
    printf("Bits; ");
    print_bits(state[0]);

// printing each bit twice apparently makes it magically work
#ifdef DOUBLE_PRINT
    printf("Double print; ");
    print_bits(state[0]);
#endif
}

int main() {
    for (unsigned long _lidx = 0; _lidx < 3; _lidx++) {
        unsigned char scrmbl_st[1] = {0};
        uint8_t orig_id = (uint8_t) _lidx;
        bitArrWrite((BitArrPtr) &orig_id, 0, 7, (BitArrPtr) scrmbl_st);
        orig_id = orig_id >> 7;
        print(scrmbl_st);
    }
}

Example output:

$ clang++ bitarrwrite.cpp -w -O3 -o bitarrwrite && ./bitarrwrite
Bits; 0000000000000000
Bits; 0000000000000000
Bits; 0000000000000000

$ clang++ bitarrwrite.cpp -w -D DOUBLE_PRINT -O3 -o bitarrwrite && ./bitarrwrite
Bits; 0000000000000000
Double print; 0000000000000000
Bits; 0000000000000001
Double print; 0000000000000001
Bits; 0000000000000010
Double print; 0000000000000010

$ clang++ bitarrwrite.cpp -w -O0 -o bitarrwrite && ./bitarrwrite
Bits; 0000000000000000
Bits; 0000000000000001
Bits; 0000000000000010

$ clang++ bitarrwrite.cpp -w -D DOUBLE_PRINT -O0 -o bitarrwrite && ./bitarrwrite
Bits; 0000000000000000
Double print; 0000000000000000
Bits; 0000000000000001
Double print; 0000000000000001
Bits; 0000000000000010
Double print; 0000000000000010

$ # I'm using nix-shell -p clang
$ clang -v
clang version 3.6.2 (tags/RELEASE_362/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
Found candidate GCC installation: /nix/store/0pw39ww2z0hmr1y0idw4hqn0jflg908n-gcc-4.9.3/lib/gcc/x86_64-unknown-linux-gnu/4.9.3
Found candidate GCC installation: /nix/store/0pw39ww2z0hmr1y0idw4hqn0jflg908n-gcc-4.9.3/lib64/gcc/x86_64-unknown-linux-gnu/4.9.3
Selected GCC installation: /nix/store/0pw39ww2z0hmr1y0idw4hqn0jflg908n-gcc-4.9.3/lib64/gcc/x86_64-unknown-linux-gnu/4.9.3
Candidate multilib: .;@m64
Selected multilib: .;@m64
ghost commented 8 years ago

@dimitriv I think the precondition tgt has allocated (vlen+7)/8 bytes already might not be covering all cases. Maybe allowing an extra byte, by modifying it to tgt has allocated ((vlen+7)/8 + 1) bytes already would work?

mainland commented 8 years ago

Sid and I just looked at this, and bitArrWrite seems to be the culprit. The problem is that it assumes it can cast a BitArrPtr to a short pointer even when a bit array has only 1 byte of storage. A quick fix is to always allocate an even number of bytes for bit arrays.

Small note...we should be using types with explicit widths, e.g., uint16 instead of unsigned short.

ghost commented 8 years ago

Results of adding runtime checks with the clang sanitizer:

bit.c:128:16: runtime error: load of misaligned address 0x7ffcef5c8b6d for type 'unsigned short', which requires 2 byte alignment
0x7ffcef5c8b6d: note: pointer points here
 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  e0 0c fb bd a4 7f 00 00  d7
             ^ 
bit.c:128:4: runtime error: store to misaligned address 0x7ffcef5c8b6d for type 'unsigned short', which requires 2 byte alignment
0x7ffcef5c8b6d: note: pointer points here
 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  e0 0c fb bd a4 7f 00 00  d7
             ^ 
bit.c:129:8: runtime error: load of misaligned address 0x7ffcef5c8b4f for type 'unsigned short', which requires 2 byte alignment
0x7ffcef5c8b4f: note: pointer points here
 00 a4 7f 00 00  01 00 00 00 00 00 00 00  50 00 00 00 01 00 00 00  00 00 00 00 00 00 00 00  00 00 00
             ^ 
bit.c:89:15: runtime error: unsigned integer overflow: 0 - 1 cannot be represented in type 'unsigned int'
dimitriv commented 8 years ago

Yep does look serious ..

ghost commented 8 years ago

I've pushed my clang changes to master, which should help catching these types of minor quirks easier. There's a lot of stuff going on, such as renames and what not, but basically it just adds two compilers, clang and safe.

safe is theoretically (supposed to be) a compiler for an abstract machine based off of the C/C++ abstract machine; the only difference being that it includes runtime checks and crashes on noticing undefined behavior. Of course this is not perfect. Right now, it just uses clang with additional runtime checks. Hopefully there will be a better alternative in future. Also, given how hard it is to accurately detect undefined behavior, safe doesn't need to pass all the tests. It should however be a useful tool for debugging.

For reproducing this issue, you should be able to use:

$ # For actual performance tests
$ COMPILER=clang make testRX6Mbps.perf
$
$ # For debugging purposes
$ COMPILER=safe make testRX6Mbps.perf

I wouldn't recommend throwing print statements around (or using a step-through debugger) to debug clang since the clang frontend is doing some sort of lazy evaluation (probably by changing the scope of automatic variables) in the optimizer. I've noticed debugging clang directly does cause behavior to change.

The last time I checked, clang passed all tests, and does surprisingly good on the RXCCA performance test, but faces LUT bugs in the other performance tests.

ghost commented 8 years ago

Closed via https://github.com/dimitriv/Ziria/commit/35a86e5d324124ba044cc988258a96a5522ab3a4