Vector35 / binaryninja-api

Public API, examples, documentation and issues for Binary Ninja
https://binary.ninja/
MIT License
907 stars 207 forks source link

ARM Thumb2 TBB Instruction Incorrect Implemenation #5555

Open Martyx00 opened 3 months ago

Martyx00 commented 3 months ago

Version and Platform (required):

Bug Description: Binary Ninja fails to recognize Jump Table with ARM Thumb2 TBB instruction. There are potentially two issues:

  1. It appears that current handling of TBB instruction is assuming a half word (2-byte) jump table which is incorrect as the table should be single byte only. Half-word size is correct for TBH instruction not for TBB though.
  2. Current implementation fails to recognize the Jump table as it continues with the disassembly where the jump table should be. I would assume that the implementation fails to add label after the TBB.

Steps To Reproduce: Please provide all steps required to reproduce the behavior: Unfortunately the screenshot I have shared is the most I can provide (cannot share the library). I am happy to test any updates though so let me know. (Or we can have a screen sharing session in private)

Expected Behavior: TBB instruction should be followed by label to stop the disassembly and allow detecting of the jump table which must be one byte aligned.

Screenshots/Video Recording: image

Binary: If applicable, please provide us with the binary to help us work with the issue faster. Here are a few options: Not possible unless we have official NDA, sorry.

Additional Information: Please add any other context about the problem here.

negasora commented 3 months ago

Afaict, we're disassembling after the tbb because of the goto 15 at LLIL index 12 What are the first couple members of data_c041c40?

Martyx00 commented 3 months ago

Well, the goto 15 is actually correct as it goes to 0x0c041c30 which is a return from function in case of the switch variable being >= 9 (the default branch). You can see the members of data_c041c40 in the disassembly part where it should be the single byte aligned jump table but instead is disassembled to instructions (the part right after tbb).

Surg-Dev commented 3 months ago

I've had similar issues with TBB (as well as TBH):

Sometimes binja will decide to find the correct, single byte jump table, other times it assembles right over it. I'm not sure what causes the difference.

Surg-Dev commented 2 months ago

Alright, I did a bit more testing...

@Martyx00 do you have that memory section where the erroneous tbb instruction is marked as writeable?

Often in firmware rev we see that a section of memory we dump is part of executable ram/flash, so we mark it as rwx

I've reloaded a fw file I was working on and marked these sections as RX only, and it seems to have fixed it. (Haven't checked in full, though)

Martyx00 commented 2 months ago

@Surg-Dev you are correct, in RX only segments this works perfectly. I would still consider that to be a bug as it should behave the same way in RWX ones.

Surg-Dev commented 2 months ago

I would tend to agree - and you can get it to work in RWX manually: If you know how large the jump table should be, then defining it as a const uint8_t array will force Binary Ninja to analyze it correctly in a RWX section.

RWX has some interesting quirks because the IL compiler no longer assumes anything about pointers to strings/function pointers, and in this case, jump tables. For example if you have a data pointer to a function pointer, in RX, BN will automatically resolve to the function pointer and include it as such in HLIL. Typing any of these pointers as const will cause "normal" RX analysis.

In theory, you could have an extremely obfuscated program which writes to that jump table, jumps to it as code, writes it back, and runs tbb as normal. Obviously, this is an extremely degenerate case that I wouldn't expect to show up in compiled code. The current behavior is extremely odd, but after testing I can see why given the current logic of how BN decompiles code and resolves variables.

So this leads to a few possible solution concepts:

I can't speak to what would be the best course of action, or if any of these ideas are even good. In my opinion it's probably better to avoid RWX data sections when possible