fkie-cad / cwe_checker

cwe_checker finds vulnerable patterns in binary executables
https://docs.cwe-checker.io
GNU Lesser General Public License v3.0
1.1k stars 115 forks source link

panic in graph.rs because unwrap() of None #461

Closed ElDavoo closed 4 months ago

ElDavoo commented 5 months ago
docker run --rm -e RUST_BACKTRACE=1 -v ./ip:/input ghcr.io/fkie-cad/cwe_checker /input
thread 'main' panicked at src/cwe_checker_lib/src/analysis/graph.rs:339:85:
called `Option::unwrap()` on a `None` value
stack backtrace:
   0: rust_begin_unwind
             at ./rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
   1: core::panicking::panic_fmt
             at ./rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
   2: core::panicking::panic
             at ./rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:144:5
   3: cwe_checker_lib::analysis::graph::GraphBuilder::add_jump_edge
   4: cwe_checker_lib::analysis::graph::get_program_cfg_with_logs
   5: cwe_checker::main
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

ip.tar.gz

vobst commented 5 months ago

Thanks for the report, I will take a look at it in the next few days.

vobst commented 5 months ago

Thanks for bringing this up, looks like a genuine bug.

tl;dr: This is a problem with the normalization passes that we perform on the intermediate representation before building the final CFG. The 'propagate control flow' pass removes a basic block without retargeting all references to this block.

In particular, the relevant bits of unopimized IR look like this:

  BLK [blk_0040a8e8]
    DEF [instr_0040a8e8_0] $U1700:1(temp) = (v0:4 != 0x0:4)
    DEF [instr_0040a8e8_1] $U100:4(temp) = (sp:4 + 0x18:4)
    DEF [instr_0040a8e8_2] $U180:4(temp) = 0x0:4
    DEF [instr_0040a8e8_3] $U180:4(temp) = $U100:4(temp)
    DEF [instr_0040a8e8_4] v0:4 := Load from $U180:4(temp)
    JMP [instr_0040a8e8_5] If $U1700:1(temp) jump to blk_0040a9cc
    JMP [instr_0040a8e8_6] Jump to blk_0040a8f0
  BLK [blk_0040a8f0]
    DEF [instr_0040a8f0_0] $U1200:1(temp) = (v0:4 == 0x0:4)
    DEF [instr_0040a8f0_1] $U5500:1(temp) = (v0:4 < 0x101:4)
    DEF [instr_0040a8f0_2] v1:4 = IntZExt($U5500:1(temp))
    JMP [instr_0040a8f0_3] If $U1200:1(temp) jump to blk_0040a9cc
    JMP [instr_0040a8f0_4] Jump to blk_0040a8f8
...
  BLK [blk_0040a9c4]
    DEF [instr_0040a9c4_0] ra:4 = 0x40a9cc:4
    JMP [instr_0040a9c4_1] call sub_00403f80 ret blk_0040a9cc
  BLK [blk_0040a9cc]
    DEF [instr_0040a9cc_0] a0:4 = (0x43:4 << 0x10:4)
    JMP [instr_0040a9cc_1] Jump to blk_0040a9d0
  BLK [blk_0040a9d0]
    DEF [instr_0040a9d0_0] a0:4 = (a0:4 + 0xffffb730:4)
    JMP [instr_0040a9d0_1] Jump to blk_0040a9d4

Before the problematic pass we have (expression propagation and dead variable elimination have happened);

  BLK [blk_0040a8e8]
    DEF [instr_0040a8e8_0] $U1700:1(temp) = (v0:4 != 0x0:4)
    DEF [instr_0040a8e8_4] v0:4 := Load from (sp:4 + 0x18:4)
    JMP [instr_0040a8e8_5] If $U1700:1(temp) jump to blk_0040a9cc
    JMP [instr_0040a8e8_6] Jump to blk_0040a8f0
  BLK [blk_0040a8f0]
    DEF [instr_0040a8f0_2] v1:4 = IntZExt((v0:4 < 0x101:4))
    JMP [instr_0040a8f0_3] If (v0:4 == 0x0:4) jump to blk_0040a9cc
    JMP [instr_0040a8f0_4] Jump to blk_0040a8f8
...
  BLK [blk_0040a9c4]
    DEF [instr_0040a9c4_0] ra:4 = 0x40a9cc:4
    JMP [instr_0040a9c4_1] call sub_00403f80 ret blk_0040a9cc
  BLK [blk_0040a9cc]
    JMP [instr_0040a9cc_1] Jump to blk_0040a9d0
  BLK [blk_0040a9d0]
    DEF [instr_0040a9d0_0] a0:4 = ((0x43:4 << 0x10:4) + 0xffffb730:4)
    JMP [instr_0040a9d0_1] Jump to blk_0040a9d4

Note: The expression propagation across block boundaries is sound as a0:4 = (0x43:4 << 0x10:4) holds for all edges incoming to blk_0040a9d0.

And after the problematic pass we have:

  BLK [blk_0040a8e8]
    DEF [instr_0040a8e8_0] $U1700:1(temp) = (v0:4 != 0x0:4)
    DEF [instr_0040a8e8_4] v0:4 := Load from (sp:4 + 0x18:4)
    JMP [instr_0040a8e8_5] If $U1700:1(temp) jump to blk_0040a9d0
    JMP [instr_0040a8e8_6] Jump to blk_0040a8f0
  BLK [blk_0040a8f0]
    DEF [instr_0040a8f0_2] v1:4 = IntZExt((v0:4 < 0x101:4))
    JMP [instr_0040a8f0_3] If (v0:4 == 0x0:4) jump to blk_0040a9d0
    JMP [instr_0040a8f0_4] Jump to blk_0040a8f8
...
  BLK [blk_0040a9c4]
    DEF [instr_0040a9c4_0] ra:4 = 0x40a9cc:4
    JMP [instr_0040a9c4_1] call sub_00403f80 ret blk_0040a9cc
  BLK [blk_0040a9d0]
    DEF [instr_0040a9d0_0] a0:4 = ((0x43:4 << 0x10:4) + 0xffffb730:4)
    JMP [instr_0040a9d0_1] Jump to blk_0040a9d4

As we can see, the basic block at 0040a9cc was removed and the conditional jumps at 0040a8e8_5 and 0040a8f0_3 correctly retargeted. However, the optimization failed to retarget the call at 0040a9c4_1. Thus, the CFG construction fails for the transformed program when adding the edges for the call.

@Enkelmann Two ways of addressing this come to my mind: a) prevent the removal of the BB by properly checking that it still has incoming edges (Not quite clear to me yet why exactly the current check is not catching it, have to investigate that further.) b) Make sure to retarget the jump as well. Any suggestions?

Enkelmann commented 4 months ago

I see that you took route b) in your PR, which should result in a better CFG than route a). I approve, although it makes it more difficult to be sure that we get rid of this bug for good.

ElDavoo commented 4 months ago

Thank you for fixing!