GaloisInc / macaw

Open source binary analysis tools.
BSD 3-Clause "New" or "Revised" License
208 stars 21 forks source link

symbolic: `Could not identify block with external resolutions` on PLT stubs #447

Closed langston-barrett closed 1 month ago

langston-barrett commented 1 month ago

I have a binary that calls libc's assert function. When I symbolically execute it, I see:

 Goal failed:
    0x1000020c: error: in 0x10000200
    Could not identify block at 0x10000200 with external resolutions: []
#include <assert.h>
#include <stdbool.h>
void test() { assert(false); }
powerpc-linux-musl-gcc -fno-stack-protector -Wl,--unresolved-symbols=ignore-all -nostartfiles -no-pie test.c

bin.zip

powerpc-linux-musl-objdump -d tests/refine/bug/assert_false/test.ppc32.elf

10000200 <__assert_fail@plt>:
10000200:       3d 60 10 02     lis     r11,4098
10000204:       81 6b 00 00     lwz     r11,0(r11)
10000208:       7d 69 03 a6     mtctr   r11
1000020c:       4e 80 04 20     bctr

You can see that this message appears in the Crucible CFG using macaw-dump (#446):

Macaw and Crucible CFGs ``` cabal run exe:macaw-ppc-dump -- --crucible test.ppc32.elf 0x10000200: ; r1 = stack_frame + 0 # 0x10000200 0x10000200: lis r11, 4098 # 0x10000200: lis r11, 4098 # 0x10000200: {r11 => 0x10020000 :: [32];ip => 0x10000204} # 0x10000204 0x10000204: lwz r11, 0(r11) # 0x10000204: lwz r11, 0(r11) r6 := read_mem (0x10020000 :: [32]) (bvbe4) # 0x10000204: {r11 => r6;ip => 0x10000208} # 0x10000208 0x10000208: mtctr r11 # 0x10000208: mtctr r11 # 0x10000208: {ip => 0x1000020c;ctr => r6} # 0x1000020c 0x1000020c: bctr # 0x1000020c: bctr r7 := (bv_shr r6 (0x2 :: [32])) r8 := (trunc r7 30) r9 := (uext r8 32) r10 := (bv_shl r9 (0x2 :: [32])) # 0x1000020c: {ip => r10} classify failure { r11 = r6 , ip = r10 , ctr = r6 } Branch: IP is not an mux: let r6 := read_mem (0x10020000 :: [32]) (bvbe4) r7 := (bv_shr r6 (0x2 :: [32])) r8 := (trunc r7 30) r9 := (uext r8 32) r10 := (bv_shl r9 (0x2 :: [32])) in r10 No return call: Noreturn target r10 is not a valid address. Call: Call classifier failed. Return: Pattern match failure in 'do' block at src/Data/Macaw/Discovery/Classifier.hs:280:5-18 Jump table: Absolute jump table: Pattern match failure in 'do' block at src/Data/Macaw/Discovery/Classifier/JumpTable.hs:267:3-48 Jump table: Relative jump table: Unaligned IP not a mem offset: r6 PLT stub: Pattern match failure in 'do' block at src/Data/Macaw/Discovery/Classifier/PLT.hs:120:5-24 Jump: Jump value r10 is not a valid address. Tail call classification failed. %0 % 10000200 jump %1() % no postdom %1 % 10000200 $0 = (macawInstructionStart 0x10000200 0x0 "0x10000200: lis r11, 4098") % 10000200 $1 = bVLit(32, BV 268566528) % 10000200 $2 = extensionApp((bits_to_ptr_32 $1)) % 10000200 $3 = (global 0x10000204) % 10000200 $4 = (macawArchStateUpdate 0x10000200 {r11 => $2;ip => $3}) % 10000204 $5 = (macawInstructionStart 0x10000200 0x4 "0x10000204: lwz r11, 0(r11)") % 10000204 $6 = (macawReadMem bvbe4 $2) % 10000204 $7 = (global 0x10000208) % 10000204 $8 = (macawArchStateUpdate 0x10000204 {r11 => $6;ip => $7}) % 10000208 $9 = (macawInstructionStart 0x10000200 0x8 "0x10000208: mtctr r11") % 10000208 $10 = (global 0x1000020c) % 10000208 $11 = (macawArchStateUpdate 0x10000208 {ip => $10;ctr => $6}) % 1000020c $12 = (macawInstructionStart 0x10000200 0xc "0x1000020c: bctr") % 1000020c $13 = extensionApp((ptr_to_bits_32 $6)) % 1000020c $14 = bVLit(32, BV 2) % 1000020c $15 = extensionApp((bits_to_ptr_32 $14)) % 1000020c $16 = bVLshr(32, $13, $14) % 1000020c $17 = extensionApp((bits_to_ptr_32 $16)) % 1000020c $18 = (ptr_trunc 30 $17) % 1000020c $19 = (ptr_uext 32 $18) % 1000020c $20 = extensionApp((ptr_to_bits_32 $19)) % 1000020c $21 = bVShl(32, $20, $14) % 1000020c $22 = extensionApp((bits_to_ptr_32 $21)) % 1000020c $23 = (macawArchStateUpdate 0x1000020c {ip => $22}) % 1000020c $24 = stringLit("Could not identify block at 0x10000200 with external resolutions: []") % 1000020c error $24 % no postdom ```
langston-barrett commented 1 month ago

So it looks like macaw-symbolic is executing the PLT stub for __assert_fail, which winds up attempting to jump to this address:

    let r6 := read_mem (0x10020000 :: [32]) (bvbe4)
        r7 := (bv_shr r6 (0x2 :: [32]))
        r8 := (trunc r7 30)
        r9 := (uext r8 32)
        r10 := (bv_shl r9 (0x2 :: [32]))
     in r10

I think this is probably the corresponding GOT entry? Here's what readelf says about that address:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [14] .plt              PROGBITS        10020000 010000 000004 00  WA  0   0  4

Dynamic section at offset 0xff44 contains 17 entries:
  Tag        Type                         Name/Value
 0x00000003 (PLTGOT)                     0x10020000

Relocation section '.rela.plt' at offset 0x1b4 contains 1 entry:
 Offset     Info    Type            Sym.Value  Sym. Name + Addend
10020000  00000115 R_PPC_JMP_SLOT    00000000   __assert_fail + 0

Regardless of whether or not Macaw figures out that that address is a constant, it should probably recognize that it's performing code discovery at the beginning of a PLT stub, and somehow bail. PLT stubs are very special in that they invoke the runtime linker, and there's really no way Macaw would be able to meaningfully translate that code.

langston-barrett commented 1 month ago

Updated title because I realized this isn't at all PPC specific. I get the same message for the same reason on ARM and x86_64 versions of this program. The shared problem is just that executing a PLT stub gives this weird message.

RyanGlScott commented 1 month ago

PLT stubs are difficult for macaw's code discovery to handle, and this is one of the reasons why we have made an effort to offer other capabilities to discover the addresses of PLT stubs (e.g., this module) so that we can avoid performing code discovery on them at all. It is unclear to me whether we would be able to improve macaw's code discovery here without teaching it a lot more about the types of dynamic relocations that PLT stubs typically jump to (e.g., R_PPC_JMP_SLOT).

langston-barrett commented 1 month ago

Yeah, fair enough.