junxzm1990 / x86-sok


Wrong basic block boundaries in several binaries #31

Open aeflores opened 1 month ago

aeflores commented 1 month ago

I've noticed a few places where the ground truth seems to be wrong. This is, I believe, a different case from https://github.com/junxzm1990/x86-sok/issues/28, where capstone was to blame. In the following cases, the extract_gt script reports errors.

intel_executables/cpu2006/clang_Os/dealII_base.amd64-m64-ccr-Os

There is a fragment of code that looks like this:

__cxx_global_var_init.8:

          404027:   push RAX
          404028:   cmp BYTE PTR [RIP+_ZGVN4Data5SetUpINS_12Exercise_2_3ILi3EEELi3EE15right_hand_sideE],0
          40402f:   jne .L_40407d
          404031:   movsd XMM0,QWORD PTR [RIP+.L_57a5c8]
          404039:   mov EDI,OFFSET _ZN4Data5SetUpINS_12Exercise_2_3ILi3EEELi3EE15right_hand_sideE
          40403e:   mov ESI,1
          404043:   call _ZN16ConstantFunctionILi3EEC1Edj
...

But 404043: call _ZN16ConstantFunctionILi3EEC1Edj is not part of the ground truth, even though 40403e belongs to it. The extract_gt script produces the following output:

  BBL#29709 ( 18B) @0x00404031 - 0x00404043, BaseOff: 0x4031, SecOff:0x0621, Fixups: 0 , Type: BBL, Padding: 0x0, Fallthrough: Y
    BBL#29710 ( 10B) @0x00404027 - 0x00404031, BaseOff: 0x4027, SecOff:0x0617, Fixups: 0 , Type: FUN&&OBJ, Padding: 0x0, Fallthrough: N
    BBL#29711 ( 18B) @0x00404031 - 0x00404043, BaseOff: 0x4031, SecOff:0x0621, Fixups: 0 , Type: FUN&&OBJ, Padding: 0x0, Fallthrough: N
...
INFO:Found Gaps#1 in section .text, between 0x404043 - 0x4040d9, size: 150
...
INFO:Terminator 0x40403e:   mov esi, 1 is not JUMP , RET or JUMP
ERROR:291:successor addr 404043 not in basic block list
INFO:block 0x404031 to 0x404043, type is 0
WARNING:Terminator inst of BasicBlock 404027 does not have a fixup?
INFO:block 0x404027 to 0x404031, type is 0
INFO:Terminator 0x40403e:   mov esi, 1 is not JUMP , RET or JUMP
ERROR:The basic block 404031-404043 does not end of JUMP or RET, and is not FALLTHROUGH!. its instruction is 0x40403e:  mov esi, 1
INFO:block 0x404031 to 0x404043, type is 0

What could cause this instruction to be missing?

milc_base.aarch64-ccr-O2

dclock:
          4168bc:   fmov d0,xzr
          4168c0:   ret 

The 4168c0: ret instruction is missing from the ground truth (even though 4168bc: fmov d0,xzr is present).

The extract_gt log:

   BBL#2621 (  4B) @0x004168bc - 0x004168c0, BaseOff: 0x168bc, SecOff:0x15934, Fixups: 0 , Type: FUN, Padding: 0x0, Fallthrough: N
    BBL#2622 (  4B) @0x004168bc - 0x004168c0, BaseOff: 0x168bc, SecOff:0x15934, Fixups: 0 , Type: FUN, Padding: 0x0, Fallthrough: N
... 
INFO:Found Gaps#0 in section .text, between 0x4168c0 - 0x4168c4, size: 4
...
ERROR:The basic block 4168bc-4168c0 does not end of JUMP or RET, and is not FALLTHROUGH!. its instruction is 0x4168bc:  fmov    d0, xzr
INFO:block 0x4168bc to 0x4168c0, type is 0
INFO:Terminator 0x4168bc:   fmov    d0, xzr is not JUMP , RET or JUMP
ERROR:The basic block 4168bc-4168c0 does not end of JUMP or RET, and is not FALLTHROUGH!. its instruction is 0x4168bc:  fmov    d0, xzr
INFO:block 0x4168bc to 0x4168c0, type is 0

Same question: what could be going wrong here? One thing I notice is that the basic block seems to be duplicated in these two examples. E.g. in the latter, we have BBL#2621 and BBL#2622 with the same boundaries. Could that have something to do with it?
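To check how widespread the duplication is, one could scan the extract_gt output for basic blocks listed more than once with the same boundaries. This is only a minimal sketch: the regular expression is inferred from the log lines quoted above, and `find_duplicate_blocks` is a hypothetical helper, not part of the x86-sok scripts.

```python
import re
from collections import defaultdict

# Pattern inferred from the log lines quoted in this issue, e.g.
#   BBL#2621 (  4B) @0x004168bc - 0x004168c0, BaseOff: ...
# (an assumption about the format, not taken from the extract_gt source).
BBL_RE = re.compile(
    r"BBL#\s*(\d+)\s*\(\s*\d+B\)\s*@0x([0-9a-fA-F]+)\s*-\s*0x([0-9a-fA-F]+)"
)

def find_duplicate_blocks(log_text):
    """Return {(start, end): [bbl ids]} for ranges that appear more than once."""
    by_range = defaultdict(list)
    for line in log_text.splitlines():
        m = BBL_RE.search(line)
        if m:
            bbl_id = int(m.group(1))
            start, end = int(m.group(2), 16), int(m.group(3), 16)
            by_range[(start, end)].append(bbl_id)
    return {rng: ids for rng, ids in by_range.items() if len(ids) > 1}

# The two log lines from the milc example above.
SAMPLE = """\
   BBL#2621 (  4B) @0x004168bc - 0x004168c0, BaseOff: 0x168bc, SecOff:0x15934, Fixups: 0 , Type: FUN, Padding: 0x0, Fallthrough: N
    BBL#2622 (  4B) @0x004168bc - 0x004168c0, BaseOff: 0x168bc, SecOff:0x15934, Fixups: 0 , Type: FUN, Padding: 0x0, Fallthrough: N
"""

if __name__ == "__main__":
    for (start, end), ids in find_duplicate_blocks(SAMPLE).items():
        print(f"0x{start:x}-0x{end:x} listed as BBL ids {ids}")
```

Running this over the full log of a binary would list every range that shows up under more than one BBL id.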

soplex_base.arm32-gcc81-mthumb_final-O2

Here we have the following snippet:

          18232:   movs r3, #0
          18234:   ldr r3, [r3]
          18236:   udf #255

The udf instruction is missing from the ground truth, even though 18234: ldr r3, [r3] is present.

This pattern is generated when there is a null pointer access (see e.g. https://embedded.fm/blog/2017/3/6/exceptional-code). The extract_gt log contains the following:

 BBL#4282 (  4B) @0x00018232 - 0x00018236, BaseOff: 0x10232, SecOff:0xe932, Fixups: 0 , Type: TFUN, Padding: 0x0, Fallthrough: N
    BBL#4283 ( 12B) @0x00018238 - 0x00018244, BaseOff: 0x10238, SecOff:0xe938, Fixups: 0 , Type: TBBL, Padding: 0x0, Fallthrough: Y
...
INFO:Found Gaps#69 in section .text, between 0x18236 - 0x18238, size: 2
...
INFO:Terminator 0x18234:    ldr r3, [r3] is not JUMP , RET or JUMP
ERROR:The basic block 18232-18236 does not end of JUMP or RET, and is not FALLTHROUGH!. its instruction is 0x18234: ldr r3, [r3]
INFO:block 0x18232 to 0x18236, type is 0

This pattern in particular happens in a lot of the thumb binaries and the udf instruction seems to be missing every time (in close to 200 binaries). I would guess this is a different issue than the previous two.

bin2415 commented 1 month ago

Hello, the first two bugs arose from improper handling of duplicate sections.

The __cxx_global_var_init.8 was introduced following the compilation of step-14.cc, and the disassembly results are presented below:

Disassembly of section .text.startup:

0000000000000000 <__cxx_global_var_init.7>:
   0:   50                      push   %rax
   1:   80 3d 00 00 00 00 00    cmpb   $0x0,0x0(%rip)        # 8 <__cxx_global_var_init.7+0x8>
   8:   75 2e                   jne    38 <__cxx_global_var_init.7+0x38>
   a:   bf 00 00 00 00          mov    $0x0,%edi
   f:   be 01 00 00 00          mov    $0x1,%esi
  14:   e8 00 00 00 00          callq  19 <__cxx_global_var_init.7+0x19>
  19:   bf 00 00 00 00          mov    $0x0,%edi
  1e:   be 00 00 00 00          mov    $0x0,%esi
  23:   ba 00 00 00 00          mov    $0x0,%edx
  28:   e8 00 00 00 00          callq  2d <__cxx_global_var_init.7+0x2d>
  2d:   48 c7 05 00 00 00 00    movq   $0x1,0x0(%rip)        # 38 <__cxx_global_var_init.7+0x38>
  34:   01 00 00 00
  38:   58                      pop    %rax
  39:   c3                      retq

Disassembly of section .text.startup:

0000000000000000 <__cxx_global_var_init.8>:
   0:   50                      push   %rax
   1:   80 3d 00 00 00 00 00    cmpb   $0x0,0x0(%rip)        # 8 <__cxx_global_var_init.8+0x8>
   8:   75 4c                   jne    56 <__cxx_global_var_init.8+0x56>
   a:   f2 0f 10 05 00 00 00    movsd  0x0(%rip),%xmm0        # 12 <__cxx_global_var_init.8+0x12>
  11:   00
  12:   bf 00 00 00 00          mov    $0x0,%edi
  17:   be 01 00 00 00          mov    $0x1,%esi
  1c:   e8 00 00 00 00          callq  21 <__cxx_global_var_init.8+0x21>
  21:   48 c7 05 00 00 00 00    movq   $0x0,0x0(%rip)        # 2c <__cxx_global_var_init.8+0x2c>
  28:   00 00 00 00
  2c:   48 c7 05 00 00 00 00    movq   $0x0,0x0(%rip)        # 37 <__cxx_global_var_init.8+0x37>
  33:   00 00 00 00
  37:   bf 00 00 00 00          mov    $0x0,%edi
  3c:   be 00 00 00 00          mov    $0x0,%esi
  41:   ba 00 00 00 00          mov    $0x0,%edx
  46:   e8 00 00 00 00          callq  4b <__cxx_global_var_init.8+0x4b>
  4b:   48 c7 05 00 00 00 00    movq   $0x1,0x0(%rip)        # 56 <__cxx_global_var_init.8+0x56>
  52:   01 00 00 00
  56:   58                      pop    %rax
  57:   c3                      retq

The ground truth is correct after compiling step-14.cc:

       BBL# 293 ( 10B) [BBL] - Off:0x0000, Fixups:  2, padding:  0, FallThrough: Y (@Sec .text.startup)  [DUP]
        BBL# 294 ( 46B) [BBL] - Off:0x000a, Fixups:  7, padding:  0, FallThrough: Y (@Sec .text.startup)
        BBL# 295 (  2B) [FUN&&OBJ] - Off:0x0038, Fixups:  0, padding:  0, FallThrough: N (@Sec .text.startup)
        BBL# 296 ( 10B) [BBL] - Off:0x0000, Fixups:  2, padding:  0, FallThrough: Y (@Sec .text.startup)  [DUP]
        BBL# 297 ( 76B) [BBL] - Off:0x000a, Fixups: 12, padding:  0, FallThrough: Y (@Sec .text.startup)  [DUP]
        BBL# 298 (  2B) [FUN&&OBJ] - Off:0x0056, Fixups:  0, padding:  0, FallThrough: N (@Sec .text.startup)  [DUP]
        BBL# 299 ( 10B) [BBL] - Off:0x0000, Fixups:  2, padding:  0, FallThrough: Y (@Sec .text.startup)  [DUP]
        BBL# 300 ( 46B) [BBL] - Off:0x000a, Fixups:  7, padding:  0, FallThrough: Y (@Sec .text.startup)  [DUP]
        BBL# 301 (  2B) [FUN&&OBJ] - Off:0x0038, Fixups:  0, padding:  0, FallThrough: N (@Sec .text.startup)  [DUP]

However, step-14.o contains duplicate sections (several named .text.startup), and the linker fails to handle them appropriately.

Note that there are four special sections that need to be handled. The compiled dealII.zip after fixing is attached.

bin2415 commented 1 month ago

Hello, for the third case, we identified the root cause as incorrect handling of the .inst 0xdeff pseudo-op. Instead of generating the instruction udf #255, the GCC compiler emits the .inst pseudo-op. Below is an example:

        bl      bfd_assert
        movs    r3, #0
        ldr     r3, [r3, #360]
        .bbInfo_BE 0
        .inst   0xdeff
        .bbInfo_FUNE

We are planning to handle this corner case in the gas assembler.
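For readers wondering why .inst 0xdeff shows up as udf #255 in the disassembly: in the 16-bit Thumb encoding of UDF, the upper byte is 0xDE and the lower byte is the immediate. A small sketch of that decoding (the function name `decode_thumb_udf` is made up for illustration):

```python
def decode_thumb_udf(halfword):
    """Return the imm8 of a 16-bit Thumb UDF instruction, or None otherwise.

    Thumb UDF (encoding T1) is 1101 1110 imm8, i.e. 0xDE00 | imm8.
    """
    if (halfword & 0xFF00) == 0xDE00:
        return halfword & 0xFF
    return None

if __name__ == "__main__":
    # The value emitted by GCC via `.inst 0xdeff`.
    print(decode_thumb_udf(0xDEFF))
```

So the two bytes left out of the ground truth at 0x18236 are exactly this halfword.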

aeflores commented 1 month ago

Thanks a lot for the quick response! I guess that means some of the binaries need to be rebuilt. Right? Do you have by any chance scripts for rebuilding the complete datasets (https://zenodo.org/record/6566082/)? I've looked around but all the scripts seem to assume the binaries have been prebuilt.

aeflores commented 1 month ago

For completeness (for other people using this dataset), some ARM (non-Thumb) binaries also have problems with udf, in that case with udf #0 instead of udf #255. This affects most versions of dwp and one version of ls.gold.
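For anyone who wants to look for these in raw ARM (A32) code, the 32-bit UDF encoding can be recognized with a mask. This is a sketch with a hypothetical helper name, and it assumes the instruction word has already been read with the right endianness:

```python
def decode_arm_udf(word):
    """Return the imm16 of a 32-bit ARM (A32) UDF instruction, or None otherwise.

    A32 UDF (encoding A1) is 1110 0111 1111 imm12 1111 imm4,
    i.e. 0xE7F000F0 with imm12 in bits 19:8 and imm4 in bits 3:0.
    """
    if (word & 0xFFF000F0) == 0xE7F000F0:
        imm12 = (word >> 8) & 0xFFF
        imm4 = word & 0xF
        return (imm12 << 4) | imm4
    return None

if __name__ == "__main__":
    # udf #0, the form seen in the dwp and ls.gold binaries mentioned above.
    print(decode_arm_udf(0xE7F000F0))
```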

For each binary, I provide a snippet of assembly where "GOOD" instructions are instructions present in the ground truth, and "BAD" instructions are absent. In all of these, "GOOD" instructions must fall through to "BAD" instructions, which tells me something is wrong with the ground truth.
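One rough way to find such cases from the extract_gt log alone is to pair each "does not end of JUMP or RET" error with a reported gap that starts exactly where the block ends. The sketch below assumes the log format quoted earlier in this issue; `blocks_falling_into_gaps` is a hypothetical helper, not an existing script in the repository.

```python
import re

# Patterns inferred from the log excerpts in this issue (assumed format).
GAP_RE = re.compile(
    r"Found Gaps#\d+ in section \S+ between 0x([0-9a-fA-F]+) - 0x([0-9a-fA-F]+)"
)
ERR_RE = re.compile(
    r"ERROR:The basic block ([0-9a-fA-F]+)-([0-9a-fA-F]+) does not end"
)

def blocks_falling_into_gaps(log_text):
    """Return sorted (block_start, block_end, gap_end) triples where a block
    flagged as lacking a terminator is immediately followed by a gap."""
    gap_end_by_start = {}
    flagged = set()
    for line in log_text.splitlines():
        g = GAP_RE.search(line)
        if g:
            gap_end_by_start[int(g.group(1), 16)] = int(g.group(2), 16)
        e = ERR_RE.search(line)
        if e:
            # A set, because the same error can be logged more than once.
            flagged.add((int(e.group(1), 16), int(e.group(2), 16)))
    return sorted(
        (start, end, gap_end_by_start[end])
        for start, end in flagged
        if end in gap_end_by_start
    )

# Log lines from the soplex Thumb example above.
SAMPLE_LOG = """\
INFO:Found Gaps#69 in section .text, between 0x18236 - 0x18238, size: 2
INFO:Terminator 0x18234:    ldr r3, [r3] is not JUMP , RET or JUMP
ERROR:The basic block 18232-18236 does not end of JUMP or RET, and is not FALLTHROUGH!. its instruction is 0x18234: ldr r3, [r3]
"""

if __name__ == "__main__":
    for start, end, gap_end in blocks_falling_into_gaps(SAMPLE_LOG):
        print(f"block 0x{start:x}-0x{end:x} falls into gap ending at 0x{gap_end:x}")
```

This only matches what the log reports; it does not disassemble the gap, so each hit still needs to be checked by hand.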

Here is a list of all the other binaries where I have found inconsistent ground truth. Some of these might be due to the duplicate section problem or maybe there is something else going on.

aarch64

x86