Vector35 / binaryninja-api

Public API, examples, documentation and issues for Binary Ninja
https://binary.ninja/
MIT License

Array access not being recognized in MIPS #5927

Open emesare opened 5 days ago

emesare commented 5 days ago

Discussed in https://github.com/Vector35/binaryninja-api/discussions/5914

Originally posted by **nwroot** September 9, 2024

In this binary, $s5 is set to a pointer and below, a function call is performed using $s5 as the address for an array of structures

```
800370a4  0480023c   lui     $v0, 0x8004
800370a8  a8855524   addiu   $s5, $v0, -0x7a58  {structure_ptr1}
...
80037124  21801502   addu    $s0, $s0, $s5
80037128  0400048e   lw      $a0, 4($s0)  {struct1::filename}
8003712c  0480053c   lui     $a1, 0x8004
80037130  909da524   addiu   $a1, $a1, -0x6270  {overlay}
80037134  c6dc000c   jal     0x80037318  {some_function}
```

LLIL

```
17 @ 800370a4  $v0 = 0x80040000
18 @ 800370a8  $s5 = $v0 - 0x7a58
...
50 @ 80037124  $s0 = $s0 + $s5
51 @ 80037128  $a0 = [$s0 + 4 {struct1::filename}].d
52 @ 8003712c  $a1 = 0x80040000
53 @ 80037130  $a1 = $a1 - 0x6270
54 @ 80037134  call(some_function)
```

However, in line 50 of the MLIL binja for some reason expresses it as a subtraction. This seems to cause ugly syntax in the HLIL ($s0 is essentially sizeof(struct1) * some_index)

```
50 @ 80037124  $s0 = $s0 - 0x7ffc7a58
51 @ 80037128  $a0 = [$s0 + 4].d
52 @ 8003712c  $a1 = 0x80040000
53 @ 80037130  $a1 = 0x80039d90
54 @ 80037134  ...registers... = call(some_function $a0, $a1, stack = &var_30)
```

Resulting HLIL

```
some_function((index * 0xc - 0x7ffc7a58)->filename, &some_memory)
```

I apologize if this is hard to read, I would rather not reveal much detail of what I'm working on. I can provide the bndb if required.
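For anyone following the arithmetic: the subtraction in MLIL line 50 is numerically the same operation as the addition in LLIL, since subtracting `0x7ffc7a58` modulo 2^32 equals adding the 32-bit value that `lui`/`addiu` build in `$s5`. The sketch below is plain Python, not Binary Ninja API code. The helper names (`sign_extend16`, `lui_addiu`, `as_add`, `as_sub`) are made up for illustration, and it assumes 32-bit wraparound and the `0xc` element stride shown in the HLIL.

```python
MASK32 = 0xFFFFFFFF  # MIPS32 registers wrap at 32 bits


def sign_extend16(imm):
    """Interpret the low 16 bits of imm as a signed value, as addiu does."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm


def lui_addiu(hi, lo):
    """Value produced by `lui reg, hi` followed by `addiu reg, reg, lo`."""
    return ((hi << 16) + sign_extend16(lo)) & MASK32


# Constants built by the two lui/addiu pairs in the disassembly.
structure_ptr1 = lui_addiu(0x8004, -0x7A58)
overlay = lui_addiu(0x8004, -0x6270)

# Two's-complement negation of the array base address.
neg = (-structure_ptr1) & MASK32

print(hex(structure_ptr1))  # 0x800385a8
print(hex(overlay))         # 0x80039d90
print(hex(neg))             # 0x7ffc7a58


def as_add(s0):
    """LLIL form: $s0 = $s0 + $s5."""
    return (s0 + structure_ptr1) & MASK32


def as_sub(s0):
    """MLIL form: $s0 = $s0 - 0x7ffc7a58."""
    return (s0 - neg) & MASK32


# Both forms agree for every element offset (index * 0xc).
print(all(as_add(i * 0xC) == as_sub(i * 0xC) for i in range(16)))  # True
```

So the value is computed correctly either way. What the subtraction form loses is the visible base address `0x800385a8`, which is what would let the expression be read as an array access.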

Binary is available for V35 with "analysis inn spot shave".

emesare commented 5 days ago

The root cause seems to be that the constant is not recognized as a pointer to a data variable, which then prevents the array access from being resolved in later ILs.

emesare commented 5 days ago

This was solved by promoting every constant in MLIL to a constant pointer if the constant is the address of a data variable. This is a wide-reaching change that is likely to cause other issues, pending further discussion.

Dev builds >=6060 currently have this new behavior; however, it is subject to change.
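As a rough illustration of the promotion rule described above, here is a toy model in plain Python. It is not Binary Ninja's implementation and uses no Binary Ninja API. `Const`, `ConstPtr`, `promote`, `promote_subtrahend`, and the `DATA_VARS` table are hypothetical names, with the two addresses taken from the annotations in the original post. The last helper, which rewrites `x - c` as `x + (-c mod 2^32)` before the lookup, is an assumption about how the subtraction form relates to the pointer; the thread does not say whether the actual change does this.

```python
from dataclasses import dataclass

MASK32 = 0xFFFFFFFF


@dataclass(frozen=True)
class Const:
    """Plain integer constant (hypothetical stand-in for an MLIL const)."""
    value: int


@dataclass(frozen=True)
class ConstPtr:
    """Constant known to point at a data variable (hypothetical stand-in)."""
    value: int


# Hypothetical table of data variable addresses, from the post's annotations.
DATA_VARS = {
    0x800385A8: "structure_ptr1",
    0x80039D90: "overlay",
}


def promote(value, data_vars):
    """Promote a constant to a pointer if it is a data variable's address."""
    value &= MASK32
    return ConstPtr(value) if value in data_vars else Const(value)


def promote_subtrahend(value, data_vars):
    """Assumption: treat `x - c` as `x + (-c mod 2**32)` before the lookup."""
    return promote(-value, data_vars)


# The folded constant from MLIL line 53 matches a data variable directly.
print(promote(0x80039D90, DATA_VARS))

# The subtrahend from MLIL line 50 does not match as written...
print(promote(0x7FFC7A58, DATA_VARS))

# ...but its additive form is the address of structure_ptr1.
print(promote_subtrahend(0x7FFC7A58, DATA_VARS))
```

In this model, a direct address lookup alone would leave `0x7ffc7a58` as a plain constant, which is one way to see why the change is wide-reaching and may have side effects.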