angr / angr

A powerful and user-friendly binary analysis platform!
http://angr.io
BSD 2-Clause "Simplified" License

angr can't handle 'endbr64', leading to incorrect function boundary #4571

Open hwu71 opened 4 months ago

hwu71 commented 4 months ago

Description

While examining the recent XZ backdoor with angr, I found that angr fails to recognize the endbr64 instruction correctly, which leads to incorrect function boundaries.

For reference, IDA decodes the instruction at 0x144d0 as endbr64: [image]

In angr, however, endbr64 is not recognized. Instead, angr incorrectly splits the function at 0x4144d0 into several functions: 0x4144d0, 0x4144d3, and 0x4144d4. [image]

In a previous discussion about endbr64 in 2018 (https://github.com/angr/angr/issues/1212), it was mentioned that the problem doesn't stem from angr directly, but rather from a lack of implementation in the underlying VEX IR.

Given this context, is the incorrect function boundary identification considered a bug? And how can it be fixed without altering the VEX IR?

Steps to reproduce the bug

binary: liblzma.so.5.5.99.zip
function address: 0x4144d0
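The steps above could be scripted roughly as follows. This is a minimal sketch, not the reporter's actual script: the helper names `functions_in_range` and `report`, the `size` window, and the path passed to `report` (the extracted library) are assumptions for illustration. It runs CFGFast and lists every function start that angr recovers in a small window beginning at the reported address.

```python
def functions_in_range(function_addrs, start, size):
    """Return the sorted function start addresses in [start, start + size)."""
    return sorted(a for a in function_addrs if start <= a < start + size)


def report(binary_path, func_addr=0x4144D0, size=0x10):
    """Run CFGFast and print the function starts angr recovers near func_addr.

    binary_path is a hypothetical path to the extracted shared library.
    """
    import angr  # imported lazily so the helper above works without angr

    proj = angr.Project(binary_path, auto_load_libs=False)
    cfg = proj.analyses.CFGFast(normalize=True)

    # Iterating the function manager yields function start addresses.
    starts = functions_in_range(cfg.kb.functions, func_addr, size)
    for addr in starts:
        print(hex(addr), cfg.kb.functions[addr].name)

    # Disassemble from the expected entry point for comparison with IDA.
    proj.factory.block(func_addr).pp()
    return starts
```

If the behaviour described in this issue is present, `report(...)` would be expected to return more than one address for the window, rather than the single entry at 0x4144d0.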

Environment

All angr packages are at version 9.2.99.dev0 (the latest version at the time of writing).

Additional context

No response

ltfish commented 4 months ago

We don't need to alter VEX IR at all (we have already added support for endbr32 and endbr64). The problem is that we are not treating endbr32/endbr64 as function prologue sequences. I'll add them later.
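To illustrate what treating these instructions as prologue sequences means, here is a standalone sketch in plain Python. It is not angr's implementation: `PROLOGUE_SIGNATURES` and `find_prologue_candidates` are hypothetical names, and angr's real prologue list may be structured differently. Only the two byte encodings (`f3 0f 1e fa` for endbr64, `f3 0f 1e fb` for endbr32) are taken from the Intel CET instruction definitions.

```python
import re

# Encodings of the Intel CET indirect-branch-tracking markers.
ENDBR64 = b"\xf3\x0f\x1e\xfa"
ENDBR32 = b"\xf3\x0f\x1e\xfb"

# Hypothetical signature list for this sketch only.
PROLOGUE_SIGNATURES = [
    re.compile(re.escape(ENDBR64)),
    re.compile(re.escape(ENDBR32)),
]


def find_prologue_candidates(code, base_addr=0):
    """Return sorted addresses where a known prologue signature begins."""
    hits = set()
    for sig in PROLOGUE_SIGNATURES:
        for match in sig.finditer(code):
            hits.add(base_addr + match.start())
    return sorted(hits)


# Two padding bytes, then an endbr64-prefixed function, then an endbr32 one.
sample = b"\x90\x90" + ENDBR64 + b"\x55\x48\x89\xe5\xc3" + ENDBR32 + b"\xc3"
candidates = find_prologue_candidates(sample, base_addr=0x4144CE)
print([hex(a) for a in candidates])
```

A function-start scanner that knows these signatures can propose the address of the endbr64 itself as a candidate entry point, instead of starting a function at a later byte.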

hwu71 commented 4 months ago

Thanks a lot for your explanation!