thumb2eb parses file as little endian for raw binary

Vector35 / binaryninja-api

Public API, examples, documentation and issues for Binary Ninja

https://binary.ninja/

MIT License


thumb2eb parses file as little endian for raw binary #615

Closed timbrom closed 7 years ago

timbrom commented 7 years ago

I'm reversing a raw binary firmware image that is big-endian Thumb. When I try to create a function at an address, Binary Ninja gives the same results whether I select thumb2 or thumb2eb as the architecture. I'm attaching a snippet here (binja_bug.txt). The first instruction should be parsed as a push {r3, r4, r5, r6, r7, r8, r9, r10, r11, lr} rather than a cmp r5, #0xe9.
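The mismatch described above can be reproduced by hand. The sketch below assumes the function starts with the standard 32-bit Thumb-2 encoding of that push, `e92d 4ff8`; the bytes are an assumption that is consistent with the reported misdecode, not a copy of the attachment, and the helper names are made up for illustration.

```python
# Assumed bytes: push.w {r3-r11, lr} is encoded as the halfwords e92d 4ff8,
# so a big-endian image would store them as e9 2d 4f f8.
data = bytes.fromhex("e92d4ff8")

# Read the first halfword with each byte order.
hw_be = int.from_bytes(data[:2], "big")     # what thumb2eb should see
hw_le = int.from_bytes(data[:2], "little")  # what a little-endian fetch sees


def is_32bit_thumb(hw):
    """True if the halfword's top five bits mark a 32-bit Thumb-2 instruction."""
    return (hw >> 11) in (0b11101, 0b11110, 0b11111)


def decode_cmp_imm8(hw):
    """Return (rn, imm8) if hw is a 16-bit 'cmp Rn, #imm8', else None."""
    if (hw >> 11) != 0b00101:
        return None
    return (hw >> 8) & 0x7, hw & 0xFF


print(hex(hw_be), is_32bit_thumb(hw_be))
print(hex(hw_le), decode_cmp_imm8(hw_le))
```

Read big-endian, the halfword is the first half of a 32-bit instruction. Read little-endian, the same two bytes form a valid 16-bit `cmp` on r5 with immediate 0xe9, which matches the wrong disassembly reported here.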

lwerdna commented 7 years ago

Thanks for the report! Fixed in a test branch; will mark resolved when merged into dev.

alexforencich commented 7 years ago

I'm seeing a similar issue with 68k, though I am looking at values for interrupt vectors set with define_user_data_var rather than at instructions.

psifertex commented 7 years ago

The branch was merged and this should be resolved on dev, so I'm marking it as closed. The 68k bug is specific to your particular architecture module. If there's anything else in the core that needs fixing, @alexforencich, please open a new ticket or ping us on Slack. Thanks.

alexforencich commented 7 years ago

Is this fixed in 1.0.702-dev? If so, I will open a new bug report because I am still seeing the wrong endianness in the linear disassembly.

ehntoo commented 7 years ago

thumb2eb support is relevant to my interests. I gave this a shot with a TMS470M (a big-endian Cortex-M3 part from TI) demo image and still seem to be getting little-endian disassembly. The demo image is here: http://processors.wiki.ti.com/images/7/74/TMS470M_Demo.zip

Here's the disassembly of the entrypoint from arm-none-eabi-objdump:

    3fec:       4812            ldr     r0, [pc, #72]   ; (4038 <STCGSTAT>)
    3fee:       6801            ldr     r1, [r0, #0]
    3ff0:       2901            cmp     r1, #1
    3ff2:       f43f aeb0       beq.w   3d56 <_Continue_after_STC>
    3ff6:       f000 f856       bl      40a6 <_stackPointer_>
    3ffa:       f7fe ff99       bl      2f30 <_init>
    3ffe:       f7fe fff0       bl      2fe2 <_deviceSettings>

Compare that with Binary Ninja 1.0.703-dev (attachment: tms470m_bn).

psifertex commented 7 years ago

@alexforencich: The fix we thought was in (@lwerdna is double checking it now) was for armeb/thumbeb specifically. Are you referring to 68k or arm?

alexforencich commented 7 years ago

I was referring to the endianness issue with 68k, for which I have opened a separate issue.

lwerdna commented 7 years ago

The generated thumb2 disassembler only had native-endian fetches. Users @Timbrom and @ehntoo may be the first to actually try a big-endian thumb2 binary. Unit tests for this case were needed and are incoming.

Commit 2478fabfc544b7d390c5de2d5beed3f5810acb4d pulls the instruction fetch responsibility out of the generated code. The generated disassembler fetches and examines the instruction many times, and I was afraid of slowing it down with endian checks. Now whoever invokes the disassembler (mainly arch_thumb2) is responsible for the fetch, and the generated code quickly accesses the new .instr_word16 and .instr_word32 members of the disassembler request struct.
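A minimal sketch of that division of labor, in Python rather than the project's own code: the caller performs one endian-aware fetch and stores the results, and the decoder only reads the stored words. Only the member names `instr_word16` and `instr_word32` come from the description above. The `DisasmRequest` class, the `fetch` helper, and the choice to pack the 32-bit word as (first halfword << 16) | second halfword are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class DisasmRequest:
    """Hypothetical stand-in for the disassembler request struct."""
    instr_word16: int = 0
    instr_word32: int = 0


def fetch(data, big_endian):
    """Caller-side fetch: read the halfwords once, honoring the byte order."""
    if len(data) < 2:
        raise ValueError("need at least one halfword")
    order = "big" if big_endian else "little"
    hw1 = int.from_bytes(data[0:2], order)
    req = DisasmRequest(instr_word16=hw1)
    if len(data) >= 4:
        hw2 = int.from_bytes(data[2:4], order)
        # Assumed packing: first halfword in the high 16 bits.
        req.instr_word32 = (hw1 << 16) | hw2
    return req


# The beq.w from the objdump listing (f43f aeb0), as stored in each byte order.
req_be = fetch(bytes.fromhex("f43faeb0"), big_endian=True)
req_le = fetch(bytes.fromhex("3ff4b0ae"), big_endian=False)
print(hex(req_be.instr_word32), hex(req_le.instr_word32))
```

With the fetch done up front, both images produce the same `instr_word16` and `instr_word32`, so the decoding logic itself needs no endian checks.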

Thanks for the continued testing and bug reports!