Vector35 / binaryninja-api

Public API, examples, documentation and issues for Binary Ninja
https://binary.ninja/
MIT License

Enable Outlining of Inlined Standard Functions #3349

Open op2786 opened 2 years ago

op2786 commented 2 years ago

Compilers sometimes inline standard functions (strlen, memcpy, strcat, memset, strcmp, memcmp, etc.). I guess their code patterns could be recognized and replaced with a pseudo call to the corresponding function.

Example disassembly:

1800382a2  488dbda0120000     lea     rdi, [rbp+0x12a0 {Dst}]
1800382a9  33c0               xor     eax, eax  {0x0}
1800382ab  b90c010000         mov     ecx, 0x10c
1800382b0  f3aa               rep stosb byte [rdi]  {0x0}  {0x0}  {0x0}

Output in HLIL:

1800382a2          char (* rdi_1)[0x110] = &Dst
1800382b0          for (int64_t rcx_4 = 0x10c; rcx_4 != 0; rcx_4 = rcx_4 - 1) {
1800382b0              *rdi_1 = 0
1800382b0              rdi_1 = &(*rdi_1)[1]
1800382b0          }

This could be replaced with memset(Dst, 0, 0x10c). It may be related to #2185.
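To make the equivalence concrete, here is a minimal C sketch that mirrors the shape of the HLIL loop above and compares its effect with the proposed memset call. The names inlined_fill and check_fill_matches_memset are hypothetical and used only for illustration; this is not Binary Ninja code and says nothing about how the analysis would be implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte-store loop in the shape of the HLIL above: one zero byte is
 * written per iteration while a down-counter runs to zero, which is
 * what `rep stosb` does when AL is 0. Hypothetical helper name. */
void inlined_fill(char *dst, size_t count)
{
    char *p = dst;
    for (size_t n = count; n != 0; n = n - 1) {
        *p = 0;
        p = p + 1;
    }
}

/* Fill two identical 0x110-byte buffers, zero the first 0x10c bytes of
 * one with the loop and of the other with memset, then check that both
 * buffers match and that the trailing 4 bytes were left untouched. */
void check_fill_matches_memset(void)
{
    char a[0x110];
    char b[0x110];

    memset(a, 0x41, sizeof a);
    memset(b, 0x41, sizeof b);

    inlined_fill(a, 0x10c);
    memset(b, 0, 0x10c);

    assert(memcmp(a, b, sizeof a) == 0);
    assert(a[0x10b] == 0);
    assert(a[0x10c] == 0x41);
}
```

Calling check_fill_matches_memset() from a main function should complete without tripping an assertion, which is the property an outlining pass would rely on when rewriting the loop as a call.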

fuzyll commented 2 years ago

This is a subset of the functionality that would be required for #2185, so we're leaving this issue to track automatically resolving standard library calls that get inlined. The other issue tracks being able to make any HLIL code into an inlined function.

plafosse commented 1 year ago

We currently have partial support for this feature. It is limited to "Constant Data", i.e. cases where a string or other data is (usually) written to sequential stack locations. We recover these and display them as one of:

TODO:

Recovery of non-"Constant Data" functions:

0xdevalias commented 5 months ago

Specific strcat related issue here:

And a (potentially more complicated) issue for C++ things like std::string: