morganstanley / hobbes

A language and an embedded JIT compiler
http://hobbes.readthedocs.io/
Apache License 2.0
1.16k stars 105 forks source link

WIP: build and test debug binaries #410

Closed smunix closed 3 years ago

smunix commented 3 years ago

do not merge

smunix commented 3 years ago

@mo-xiaoming - Evetntually, it looks like debug build tests have no specific issues on Ubuntu.

mo-xiaoming commented 3 years ago

@smunix I noticed that as well.

At first I tried to run github action on my own fork, nothing failed. Then I tried specific version combination (gcc9/llvm10) which failed on my own VM. And just to be sure, even ran ./result/bin/hobbes-test after nix build.

Eveything passed.

nix build uses different optimization level

Out of curiosity, I checked what enableDebugging does. Here is the compilation flags used by nix build. Got this by adding a objdump -g ./result/bin/hobbes-test command in build.yaml

 GNU C++14 9.3.0 -mtune=generic -march=x86-64 -ggdb -O2 -O2 -Og -std=gnu++14 -fstack-protector-strong -fno-strict-overflow -fPIC -frandom-seed=6qanrna2rw -fcommon --param ssp-buffer-size=4

Base on gcc manual,

If you use multiple -O options, with or without level numbers, the last such option is the one that is effective.

So -Og is used by nix build, and I was using -ggdb3 -O0 -fno-omit-frame-pointer to make it sanitizers friendly, and the unmodified CMakeLists.txt doesn't specify optimization level explicitly, so it uses -O0 on debug build as well

-O0 Reduce compilation time and make debugging produce the expected results. This is the default.

some experiment

They are all built by cmake on my ubuntu20.04 VM, with gcc9 and llvm-10-dev, installed by apt install

opt level -Og -O0 -O1 -O2 -O3 -Os -Ofast
test result :heavy_check_mark: :x: :heavy_check_mark: :heavy_check_mark: :heavy_check_mark: :heavy_check_mark: :heavy_check_mark:

-O0 fails tests, github action passed because of different flags. If somehow, nix build uses -O0, I'm rather confident that it will fail as well

This result makes me nervous, gonna continue to investigate this problem

mo-xiaoming commented 3 years ago

Mystery solved,

EXPECT_TRUE(c.compileFn<bool()>("(1,2) matches (1,2)")());

c.compileFn<bool()>("(1,2) matches (1,2)")() returns a boolean with the value 255, then in EXPECT_TRUE, its value get tested, and a boolean value 255 doesn't behave like a boolean at all

Everything boils down to this code

If bool is not either 0 or 1, it is considered UB by both clang and gcc implementations, which can be confirmed by -fsanitize=undefined

However, clang always optimizes it away. And gcc on the other hand, has this "wrong" behavior all the way back to 4.X, it is only manifest itself with -O0 build

│   0x555555555184 <main+27>        movb   $0x0,-0xa(%rbp)              -> bool r = false                                                                                                                                        │
│   0x555555555188 <main+31>        movb   $0xff,-0x9(%rbp)             -> char b = 255                                                                                                                                       │
│   0x55555555518c <main+35>        movzbl -0x9(%rbp),%eax                                                                                                                                                        │
│   0x555555555190 <main+39>        mov    %al,-0xa(%rbp)               -> r = 255 :: bool                                                                                                                                     │
│   0x555555555193 <main+42>        movzbl -0xa(%rbp),%eax              -> %eax = 255                                                                                                                                        │
│   0x555555555197 <main+46>        test   %al,%al                                                                                                                                                                 │
│   0x555555555199 <main+48>        je     0x5555555551a7 <main+62>     -> if %al is zero, then not puts                                                                                                                                               │
│   0x55555555519b <main+50>        lea    0xe62(%rip),%rdi                                                                                                                            │
│   0x5555555551a2 <main+57>        callq  0x555555555060 <puts@plt>    -> %al is 255, is not zero, puts('r')                                                                                                                                              │
│   0x5555555551a7 <main+62>        movzbl -0xa(%rbp),%eax              -> %eax = 255                                                                                                                                     │
│   0x5555555551ab <main+66>        xor    $0x1,%eax                    -> %eax = 254                                                                                                                            │
│   0x5555555551ae <main+69>        test   %al,%al                                                                                                                                                                 │
│   0x5555555551b0 <main+71>        je     0x5555555551be <main+85>     -> if %al is zero, then no puts                                                                                                                                           │
│   0x5555555551b2 <main+73>        lea    0xe4d(%rip),%rdi                                                                                                                            │
│   0x5555555551b9 <main+80>        callq  0x555555555060 <puts@plt>    -> %al is 254, is not zero, puts('!r')

To make it well behave, CreateRet -> bool must return either 0 or 1, nothing else

WIP

smunix commented 3 years ago

To make it well behave, CreateRet -> bool must return either 0 or 1, nothing else

Noted, thanks! Can you fix this test? I suspect it's related, although it only manifested itself on LLVM post version 8.

mo-xiaoming commented 3 years ago

For failed test EXPECT_TRUE(c().compileFn<bool()>("(1,2) matches (1,2)")());, the generated IR is correct

define i1 @"/.t19639"() {
entry:
  %0 = load i32, i32* getelementptr inbounds (<{ i32, i32 }>, <{ i32, i32 }>* @6, i32 0, i32 0)
  %1 = load i32, i32* getelementptr inbounds (<{ i32, i32 }>, <{ i32, i32 }>* @6, i32 0, i32 1)
  %t = icmp eq i32 %0, 1
  %t1 = icmp eq i32 %1, 2
  %2 = select i1 %t, i1 %t1, i1 false
  %. = select i1 %2, i1 true, i1 false
  ret i1 %.
}

They are dump() from jitcc::getMachineCode, the return value is correctly set to i1

Must be something wrong after this stage

WIP

mo-xiaoming commented 3 years ago

For an hobbes expression (41,42) matches (41,42), it generates the IR code after optimization step

@0 = internal constant <{ i32, i32 }> <{ i32 41, i32 42 }>, align 1

define i1 @"/.t19709"() {
entry:
  %0 = load i32, i32* getelementptr inbounds (<{ i32, i32 }>, <{ i32, i32 }>* @0, i32 0, i32 0), align 4
  %1 = load i32, i32* getelementptr inbounds (<{ i32, i32 }>, <{ i32, i32 }>* @0, i32 0, i32 1), align 4
  %t = icmp eq i32 %0, 41
  %t1 = icmp eq i32 %1, 42
  %2 = select i1 %t, i1 %t1, i1 false
  %. = select i1 %2, i1 true, i1 false
  ret i1 %.
}

Which seems to have the correct return type i1

However, LLVM generates the following binary code

   0x00007ffff79b2000:  endbr64
   0x00007ffff79b2004:  movabs $0x7ffff79b3000,%rax
   0x00007ffff79b200e:  mov    (%rax),%ecx                # ecx = 41
   0x00007ffff79b2010:  mov    0x4(%rax),%eax             # eax = 42
   0x00007ffff79b2013:  xor    $41,%ecx                   # ecx = 0
   0x00007ffff79b2016:  xor    $42,%eax                   # eax = 0
   0x00007ffff79b2019:  or     %ecx,%eax                  # eax = 0
   0x00007ffff79b201b:  sete   %al                        # al = 1
   0x00007ffff79b201e:  neg    %al                        # al = (0-1) = 255
   0x00007ffff79b2020:  retq

Whenever the evaluation is true, the return is 255


To be more confusing, this behavior was introduced from LLVM9

Below code is generated by LLVM8, that is correct, without neg

"/.t19709":                             # @"/.t19709"
        mov     eax, dword ptr [rip + __unnamed_1]
        mov     ecx, dword ptr [rip + __unnamed_1+4]
        xor     eax, 41
        xor     ecx, 42
        or      ecx, eax
        sete    al
        ret
__unnamed_1:
        .long   41                      # 0x29
        .long   42                      # 0x2a

See precise code comparison on godbolt

WIP

mo-xiaoming commented 3 years ago

fixed in 90518246d929c1b6dc30a6812912b2f745d42903

smunix commented 3 years ago

fixed in #414