Vector35 / binaryninja-api

Public API, examples, documentation and issues for Binary Ninja
https://binary.ninja/
MIT License
927 stars 209 forks source link

RISC-V: Lifting bug in `JALR rd, rs1, imm` when `rd == rs1` #6003

Open jeanmicheldeva opened 3 weeks ago

jeanmicheldeva commented 3 weeks ago

See here: https://github.com/Vector35/binaryninja-api/blob/7d0b6bc3f49070a10d6257628bbdf93e52d87fed/arch/riscv/src/lib.rs#L1220-L1234

If rd == rs1, but is neither zero or ra (x0 or x1 resp.), the above code will lift the jalr rd, rs1, imm instruction as follows:

# let's say rd == rs1 == t1
# target = t1 + imm
t1 = pc + 4            # inst_len==4
jump(target) <=> jump(t1 + imm) <=> jump(pc + 4 + imm)

Whereas the intended code should be lifted as:

tmp_register = t1 + imm
t1 = pc + 4
jump(tmp_register)
jeanmicheldeva commented 2 weeks ago

I patched it as follows, and it seems to work:

let target = il.add(max_width, Register::from(rs1), imm).build();
match (rd.id(), rs1.id(), imm) {
    (0, 1, 0) => il.ret(target).append(),  // jalr zero, ra, 0
    (1, _, _) => il.call(target).append(), // indirect call
    (0, _, _) => il.jump(target).append(), // indirect jump
    (_, _, _) => {
        // indirect jump with storage of next address to non-`ra` register
        if rd.id() == rs1.id() {
            let tmp: llil::Register<Register<D>> = llil::Register::Temp(0);
            il.set_reg(max_width, tmp, target).append();
            il.set_reg(
                max_width,
                Register::from(rd),
                il.const_ptr(addr.wrapping_add(inst_len)),
            )
            .append();
            il.jump(tmp).append();
        } else {
            il.set_reg(
                max_width,
                Register::from(rd),
                il.const_ptr(addr.wrapping_add(inst_len)),
            )
            .append();
            il.jump(target).append();
        }
    }
}

However, I encountered another problem: "functions" that are called with jalr rd, rs1, imm usually end with jr rd. Obviously, the control-flow is the target "function" is unresolved, since rd does not have defined value in the context of the function.

As a matter of fact, this target "function" (ending with jr rd) is most probably just one or multiple basic blocks that are shared between multiple calling functions for space optimization.

Is there any way to tell BN to treat these target "functions" as such ? (i.e. not functions, but basic blocks to reattach to different parent functions). Doing this manually would be the equivalent of "Append function tail" feature in IDA ; that would be great is BN would handle this automatically (BN usually has no problem with shared basic blocks IIRC)

Or maybe my patch is incomplete, and this is already handled.

Anyway, thanks in advance

jeanmicheldeva commented 2 weeks ago

I found that modifying the following code solves the problem: https://github.com/Vector35/binaryninja-api/blob/7d0b6bc3f49070a10d6257628bbdf93e52d87fed/arch/riscv/src/lib.rs#L704-L715

There is a missing branch information for the case where rd != 0, so I added the following else clause

Op::Jalr(ref i) => {
    // TODO handle the calls with rs1 == 0?
    if i.rd().id() == 0 {
        let branch_type = if i.rs1().id() == 1 {
            BranchInfo::FunctionReturn
        } else {
            BranchInfo::Unresolved
        };

        res.add_branch(branch_type, None);
    } else {
        res.add_branch(BranchInfo::Unresolved, None); // This
    }
}

And now everything works as intended

xusheng6 commented 1 week ago

@jeanmicheldeva would you like to submit a PR for this?