Extend fall-through conditions to cover more cases.

pgoodman commented 3 years ago

Right now we have the following relations which detect a common pattern of trailing padding/NOPs between the last jmp or ret in a function, and the beginning of the subsequent function. The relations look as follows:


; Keep track of a linear sequence of instructions that
; falls-through to the beginning of a function.
#local falls_through_to_function(EA)

; The base case is an instruction that falls-through
; into the head of a function.
falls_through_to_function(EA)
    : raw_transfer(EA, FuncEA, EDGE_FALL_THROUGH)
    , function(FuncEA).

; The inductive case is an instruction that falls-through
; to another instruction that falls-through to a function.
falls_through_to_function(EA)
    : raw_transfer(EA, ToEA, EDGE_FALL_THROUGH)
    , falls_through_to_function(ToEA).

; Oftentimes there is padding between functions. This manifests
; as one function ending in a `ret` or `jmp`, followed by some
; padding NO-OPs, followed by the head of another function. We
; don't want any of the instructions following the `ret`/`jmp`
; to be included as reachable from inside a function if they fall
; through in this way, and so here we restrict the pseudo
; fall-through.
transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    : fixed_transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    , !falls_through_to_function(ToEA).

; If a terminator instruction, e.g. `jmp` or `ret`, is immediately
; followed by a function head then we don't want to treat the
; pseudo-flow as being an inter-procedural flow.
transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    : fixed_transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    , !falls_through_to_function(ToEA)
    , !function(ToEA).

We should extend these relations as follows:

[x] If we fall through into a non-instruction, then we should not treat the transfer as a pseudo-transfer.
[x] If we fall through into a new section, then maybe we should not treat the transfer as a pseudo-transfer. Think about this a bit more.
[x] If we fall through to an error instruction, then we should not treat the transfer as a pseudo-transfer.

stevenagy commented 3 years ago

I've augmented transfer.dr to cover these:

; Omit pseudo fall-throughs that transfer to new sections.
transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    : fixed_transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    , !section_start(ToEA).

; Omit pseudo fall-throughs that transfer to new functions.
transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    : fixed_transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    , !falls_through_to_function(ToEA)
    , !function(ToEA).

; Omit pseudo fall-throughs that transfer to function padding
; (i.e., sequences of instructions directly following one 
; function that fall-through to the next function).
transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    : fixed_transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    , !falls_through_to_function(ToEA).

; Omit pseudo fall-throughs that transfer to non-instructions.
transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    : fixed_transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    , instruction(ToEA, InsnType, _).

; Omit pseudo fall-throughs that transfer to error instructions
; (e.g., x86's `hlt` and `ud2`).
transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    : fixed_transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    , instruction(ToEA, InsnType, _)
    , InsnType != INSN_HALT.

; Here we have an edge that isn't a jump or pseudo fall-through,
; so we pass it through.
transfer(FromEA, ToEA, EdgeType)
    : fixed_transfer(FromEA, ToEA, EdgeType)
    , EdgeType != EDGE_JUMP_TAKEN
    , EdgeType != EDGE_PSEUDO_FALL_THROUGH.

; Here we have an edge that is a jump, and the target of
; the jump is a function head, and so we want to change its
; interpretation to be a tail-call.
transfer(FromEA, ToEA, EDGE_TAIL_FUNCTION_CALL)
    : fixed_transfer(FromEA, ToEA, EDGE_JUMP_TAKEN)
    , function(ToEA).

; Here we have an edge that is a jump, and the target of
; the jump is not a function head, and so we keep its
; interpretation as an ordinary jump.
transfer(FromEA, ToEA, EDGE_JUMP_TAKEN)
    : fixed_transfer(FromEA, ToEA, EDGE_JUMP_TAKEN)
    , !function(ToEA).

Regarding pseudo-edges ending in section heads -- I think it's right to say that in general we should omit these. For x86 ELFs this eliminates the pseudo-edges from .init to .plt, from .plt to .text, and from .text to .fini; but I could see how we'd want to capture the edge between .text and .fini, maybe as some kind of "pseudo inter-section" transfer?

I'm not clear on what the implications are for binaries with mixed code/data sections, such as Mach-O.

pgoodman commented 3 years ago

I think you want to merge both of these into one; otherwise, the first rule lets through everything that the second rule rejects ;-)

; Omit pseudo fall-throughs that transfer to non-instructions.
transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    : fixed_transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    , instruction(ToEA, InsnType, _).

; Omit pseudo fall-throughs that transfer to error instructions
; (e.g., x86's `hlt` and `ud2`).
transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    : fixed_transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    , instruction(ToEA, InsnType, _)
    , InsnType != INSN_HALT.
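
As a rough sketch of the merge, assuming `instruction` and `INSN_HALT` behave the way they are used in the two rules above, dropping the first rule and folding its comment into the second leaves a single rule that admits only targets that decode as instructions and are not halts:

```
; Omit pseudo fall-throughs that transfer to non-instructions or
; to error instructions (e.g., x86's `hlt` and `ud2`).
transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    : fixed_transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    , instruction(ToEA, InsnType, _)
    , InsnType != INSN_HALT.
```

Rules that share a head act as alternatives, so a condition only filters an edge if no other rule for the same head lets that edge through. By the same reasoning, the `!section_start(ToEA)`, `!function(ToEA)`, and `!falls_through_to_function(ToEA)` conditions would probably need to sit in this same body as well.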

I'm not clear yet either on the .text and .fini part. My recommendation is that you document the concern in a comment on the rule of interest, and maybe mark it as a TODO. The way I write TODOs is like: TODO(snagy): ... or TODO(pag): ... so that a future reader quickly knows who wrote it and whom to ask.
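
For instance, on the section rule the comment could look something like the following. The TODO wording is only an illustration of the format, based on the concern raised above, and is not text taken from transfer.dr:

```
; Omit pseudo fall-throughs that transfer to new sections.
;
; TODO(snagy): For x86 ELFs this also drops the pseudo-edge from
;              .text to .fini. Decide whether that edge should be
;              kept as some kind of inter-section transfer.
transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    : fixed_transfer(FromEA, ToEA, EDGE_PSEUDO_FALL_THROUGH)
    , !section_start(ToEA).
```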

stevenagy commented 3 years ago

Done!

lifting-bits / dds

Extend fall-through conditions to cover more cases. #5