uxmal / reko

Reko is a binary decompiler.
https://uxmal.github.io/reko
GNU General Public License v2.0

8051 bank switch / jump table #826

Open kimstik opened 4 years ago

kimstik commented 4 years ago

Code banking: Many modern 8051 chips (CC2530, C8051F120, ...) extend the code memory address space by banking (Ref1, Ref2). How can I deal with it?

Switch jump table: Compilers usually emit a jump table for a switch statement. How can I specify the switch paradigm for the 8051?

A simple test case is attached.

uxmal commented 4 years ago

It seems some sort of support for overlays would be needed to support this. Alternatively, you could decompile the individual banks separately, providing support for missing code by specifying (function-address, function-signature) placeholders.
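To illustrate what an overlay-style view of banked code could look like, here is a minimal Python sketch that maps a (bank, 16-bit address) pair to an offset in a flat image. The bank size, the position of the banked window, and the name `linear_address` are assumptions made for illustration; real chips differ, so check the datasheet of the part in question.

```python
# Hypothetical layout, NOT taken from a specific datasheet:
# the lower half of code space is common, the upper half is a banked window.
BANK_SIZE = 0x8000      # assumed 32 KiB per bank
WINDOW_BASE = 0x8000    # assumed start of the banked window


def linear_address(bank: int, addr: int) -> int:
    """Map (bank, 16-bit code address) to an offset in a flat image."""
    if not 0 <= addr <= 0xFFFF:
        raise ValueError("addr must be a 16-bit code address")
    if bank < 0:
        raise ValueError("bank must be non-negative")
    if addr < WINDOW_BASE:
        # Common area: the same bytes are visible whatever bank is selected.
        return addr
    # Banked window: bank N starts at N * BANK_SIZE in the flat image.
    return bank * BANK_SIZE + (addr - WINDOW_BASE)
```

Under these assumptions, selecting bank 0 in the window aliases the common area, and each (function-address, function-signature) placeholder would need the bank number alongside the 16-bit address to be unambiguous.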

Regarding switch statements: I'm relatively new to the 8051 processor so I need to spend some time looking at that sample you provided. Watch this space for updates.

kimstik commented 4 years ago

Thank you for the rocket-fast response. Perhaps we should build some test cases covering the SDCC/Keil/IAR switch statement runtime.

uxmal commented 4 years ago

I've looked at the binary further. The bank-switching is accomplished by writing values to a model-specific register, then transferring control by pushing values on the stack and abusing the ret instruction to make the jump.

Reko in its current state is a little naive and assumes that no shenanigans are being done with return addresses. It's becoming clear that even in non-malware/non-obfuscated software, the trick of pushing a transfer address on the stack and jumping to it via ret is common enough that the scanner must start taking this into account. I will fold that into the current work underway in the scanner.
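The push/ret trick can be reproduced on a toy model. The following Python sketch is only an illustration: `Tiny8051` and `jump_via_ret` are made-up names, and the model covers just the stack behaviour of PUSH and RET (SP is incremented before a push; RET pops the high byte of PC first, then the low byte). The write to the bank register is left out.

```python
class Tiny8051:
    """Toy model of the 8051 stack; only PUSH and RET are modelled."""

    def __init__(self):
        self.iram = bytearray(256)  # internal RAM holds the stack
        self.sp = 0x07              # SP reset value on the 8051
        self.pc = 0

    def push(self, value):
        # PUSH increments SP first, then stores the byte.
        self.sp = (self.sp + 1) & 0xFF
        self.iram[self.sp] = value & 0xFF

    def pop(self):
        value = self.iram[self.sp]
        self.sp = (self.sp - 1) & 0xFF
        return value

    def ret(self):
        # RET pops PC high byte first, then PC low byte.
        high = self.pop()
        low = self.pop()
        self.pc = (high << 8) | low


def jump_via_ret(cpu, target):
    """Push a 16-bit target (low byte, then high byte) and 'return' to it."""
    cpu.push(target & 0xFF)
    cpu.push(target >> 8)
    cpu.ret()
    return cpu.pc
```

After `jump_via_ret(cpu, 0x8123)` the program counter holds 0x8123 and SP is back at its starting value, even though no call ever took place. That is the situation a scanner cannot see if it assumes every ret goes back to the caller.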

The new notion is that, just like the logical frame pointer is made explicit in the Reko IR, we must also reify the linkAddress. Thus, an x86 program would have the following instructions in its entry block:

    esp = fp
    Mem0[esp] = __linkAddress

and the ret instructions would be altered to:

    esp = esp + 4
    goto Mem[esp]

The analysis phase will hopefully propagate registers and discover that the last statements become

    esp = fp
    goto Mem[fp]

and that Mem[fp] contains __linkAddress

Another way of stating this is to make Reko generate code in continuation passing style (CPS) and in later stages clean up the CPS.
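A rough way to picture the idea, as a Python sketch rather than anything resembling Reko's actual IR or analysis: seed the stack with a symbolic `__linkAddress` at entry, replay the pushes and pops of a procedure, and look at what the ret would transfer to. The names `ret_target` and `is_plain_return`, and the tuple encoding of operations, are assumptions made for this illustration.

```python
LINK = "__linkAddress"  # symbolic return address stored by the entry block


def ret_target(ops):
    """Replay ("push", value) / ("pop",) operations and return the value
    that a final ret would transfer control to."""
    stack = [LINK]  # entry block: the slot at fp holds __linkAddress
    for op in ops:
        if op[0] == "push":
            stack.append(op[1])
        elif op[0] == "pop":
            stack.pop()
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return stack[-1]  # ret jumps to whatever is on top of the stack


def is_plain_return(ops):
    """True if the ret goes back to the caller, False if it is a disguised jump."""
    return ret_target(ops) == LINK
```

A balanced body such as `[("push", 1), ("pop",)]` leaves the link address on top, so the ret is an ordinary return. A body ending in `("push", 0x1234)` makes the ret a jump to 0x1234.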

As for switch statements, the one in the 8051 code is following a pattern Reko is failing to detect. I will be working on improving the switch detector during the week.

uxmal commented 4 years ago

Here is the relevant switch code:

0082 E0         movx A,@DPTR    ; fetch the index
0083 FF         mov R7,A    
0084 24 FC      add A,0xFC      ; indices >= 4 will cause a carry.
0086 40 38      jc 00C0         ; If carry, go to the default case

0088 EF         mov A,R7        ; Adjust index to offset by doubling it
0089 2F         add A,R7
008A 90 00 8E   mov DPTR,0x008E ; DPTR is now the jump table
008D 73         jmp @DPTR+A     ; indirect jump.

;; Jump table
008E 80 06      sjmp 0096
0090 80 0F      sjmp 00A1
0092 80 18      sjmp 00AC
0094 80 21      sjmp 00B7

Reko is failing because of the "clever" hack used to perform bounds checking on the switch. The add A,0xFC will set the C flag if A is >= 4. I need to add code to handle this way of doing bounds checking.
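The claim about the carry flag can be checked exhaustively over all 256 accumulator values. This is a small Python sketch; `add_sets_carry` and `case_count_from_bias` are names invented here, not part of Reko.

```python
def add_sets_carry(a, imm):
    """Carry out of an 8-bit ADD A,#imm."""
    return (a & 0xFF) + (imm & 0xFF) > 0xFF


def case_count_from_bias(imm):
    """Number of in-range switch indices implied by 'add A,#imm; jc default'.

    The carry is set exactly when A >= 0x100 - imm, so indices
    0 .. (0x100 - imm - 1) reach the jump table.
    """
    return 0x100 - (imm & 0xFF)


# Every accumulator value: carry is set if and only if A >= 4.
assert all(add_sets_carry(a, 0xFC) == (a >= 4) for a in range(256))
assert case_count_from_bias(0xFC) == 4
```

So a switch detector that sees `add A,imm` followed by `jc` can recover the table size as `0x100 - imm`, which gives the four entries of the table above.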

uxmal commented 4 years ago

Reko can now handle the code above. You mentioned earlier you were considering preparing a few more samples of switch statements. Those would be greatly appreciated -- I don't have the toolset to make them myself. If you do make such samples, consider separating code banking from switch statements at first, to simplify the work.

kimstik commented 4 years ago

Keil has a few paradigms, explained here. An example using the no-return library function ?C?CCASE at address 0x8db is attached; the call is made from 0x88d (ccase_keil_88D.zip). As far as I know, Keil exposes three library functions in total: ?C?CCASE, ?C?ICASE, ?C?LCASE. Their bodies can easily be shown with radare2:

?C?CCASE:
rasm2.exe -a 8051 -d D083D082F8E4937012740193700DA3A393F8740193F5828883E4737402936860EFA3A3A380DF

?C?ICASE:
rasm2.exe -a 8051 -d D083D082F8E4937012740193700DA3A393F8740193F5828883E473740293B5F0067403936860E9A3A3A3A380D8

?C?LCASE:
rasm2.exe -a 8051 -d D083D082E4937012740193700DA3A393F8740193F5828883E4737402936C70127403936D700C7404936E70067405936F60DDA3A3A3A3A3A380CA
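Reading the ?C?CCASE bytes above by hand, the routine pops its own return address into DPTR and treats the bytes following the call as a table. Each entry appears to be three bytes (target high, target low, case value). A target of 0x0000 ends the table and is followed by the two-byte default target. The Python sketch below parses a table under that reading. The function name `parse_ccase_table` and the sample bytes are made up for illustration and do not come from the attached binary.

```python
def parse_ccase_table(code, table_addr):
    """Parse a ?C?CCASE-style table starting at table_addr.

    Assumed layout, inferred from the disassembly, not from Keil documentation:
        (target_hi, target_lo, case_value) repeated,
        then 0x00 0x00 followed by (default_hi, default_lo).
    Returns (cases, default) where cases maps case value -> target address.
    """
    cases = {}
    p = table_addr
    while True:
        target = (code[p] << 8) | code[p + 1]
        if target == 0:
            default = (code[p + 2] << 8) | code[p + 3]
            return cases, default
        # The routine scans linearly, so the first matching entry wins.
        cases.setdefault(code[p + 2], target)
        p += 3


# Synthetic example: 16 filler bytes, then a table at address 0x10.
sample = bytes(16) + bytes.fromhex("010005" "012007" "0000" "0140")
cases, default = parse_ccase_table(sample, 0x10)
```

For the synthetic table, `cases` maps 5 to 0x0100 and 7 to 0x0120, and `default` is 0x0140. In a real binary, `table_addr` would be the address right after the call to ?C?CCASE.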

kimstik commented 4 years ago

Nice IDA "Switch Idiom" screenshots here: https://reverseengineering.stackexchange.com/questions/20112/how-can-i-make-ida-understand-this-switch-statement-with-a-signed-jump-table?rq=1

uxmal commented 4 years ago

Reko has a similar dialog already. It should appear when clicking on a warning that an indirect jump cannot be resolved. I will look at this in a few hours.

uxmal commented 4 years ago

@kimstik: Reko's i8051 implementation now understands Keil's "sparse" switch statements. I will close this issue now, as the immediate concern has been addressed. Let me know if you discover further issues.

kimstik commented 4 years ago

May we keep it open for bank switch?

uxmal commented 4 years ago

Sure. I'll look at bank switching next.

kimstik commented 4 years ago

I reduced the SDCC switch test case by removing banking. Reko does not handle the SDCC switch:

;; fn0067: 0067
;;   Called from:
;;     0093 (in fn0006)
fn0067 proc
    mov A,[0082]
    mov R7,A
    add A,FC
    jc  0088

l006E:
    mov A,R7
    add A,R7
    mov DPTR,0074
    jmp @DPTR+A
0074             80 06 80 07 80 08 80 09 02 00 66 02     ..........f.
0080 00 65 02 00 66 02 00 65                         .e..f..e       

l0088:
    ret

sdcc_switch.zip
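The undecoded bytes at 0074 look like a two-level table: four sjmp entries that each land on an ljmp. The Python sketch below decodes them by hand. The helper names `decode_jump` and `resolve_cases` are invented for illustration, and only the two opcodes present in the dump (0x80 sjmp, 0x02 ljmp) are handled.

```python
def decode_jump(code, base, addr):
    """Decode one SJMP or LJMP at addr; code[0] is the byte at address base."""
    i = addr - base
    opcode = code[i]
    if opcode == 0x80:                      # SJMP rel8
        rel = code[i + 1]
        if rel >= 0x80:                     # sign-extend the displacement
            rel -= 0x100
        return (addr + 2 + rel) & 0xFFFF    # relative to the next instruction
    if opcode == 0x02:                      # LJMP addr16
        return (code[i + 1] << 8) | code[i + 2]
    raise ValueError(f"unsupported opcode {opcode:#04x} at {addr:#06x}")


def resolve_cases(code, base, n_cases):
    """Follow each 2-byte table slot through its SJMP to the LJMP behind it."""
    targets = []
    for index in range(n_cases):
        stub = decode_jump(code, base, base + 2 * index)
        targets.append(decode_jump(code, base, stub))
    return targets


# Bytes 0074..0087 copied from the dump above.
TABLE_BASE = 0x0074
TABLE = bytes.fromhex("8006800780088009" "020066" "020065" "020066" "020065")
final_targets = resolve_cases(TABLE, TABLE_BASE, 4)
```

The four sjmp entries go to 007C, 007F, 0082 and 0085, and the ljmp instructions there lead to 0066, 0065, 0066 and 0065. The case count of 4 comes from the `add A,FC` bounds check.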

kimstik commented 4 years ago

Please find the Keil ICASE and LCASE tests attached: icase_keil.zip, lcase_keil.zip

uxmal commented 4 years ago

Update: I'm working on getting these three cases to pass. Thanks for providing the samples BTW.

kimstik commented 4 years ago

A few public 8051 binaries with bank switching (TI CC2541): https://github.com/RedBearLab/CCLoader/tree/master/Bin