Open kimstik opened 4 years ago
It seems some sort of support for overlays would be needed to handle this. Alternatively, you could decompile the individual banks separately, covering the missing code by specifying (function-address, function-signature) placeholders.
Regarding switch statements: I'm relatively new to the 8051 processor so I need to spend some time looking at that sample you provided. Watch this space for updates.
Thank you for the rocket-fast response. Perhaps we should prepare some test cases of the SDCC/Keil/IAR switch statement runtimes.
I've looked at the binary further. The bank-switching is accomplished by writing values to a model-specific register, then transferring control by pushing values on the stack and abusing the ret instruction to make the jump.
Reko in its current state is a little naive and assumes that no shenanigans are being done with return addresses. It's becoming clear that even in non-malware, non-obfuscated software, the trick of pushing a transfer address on the stack and jumping to it via ret is common enough that the scanner must start taking this into account. I will fold that into the current work underway in the scanner.
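To illustrate why this idiom defeats a scanner that trusts return addresses, here is a minimal Python sketch of the push/ret trick on an 8051-style stack. The function names and the target address 0x1234 are made up for illustration. The only architectural fact relied on is that the 8051 ret pops the high byte of the return address first and then the low byte.

```python
def push(stack, value):
    """Push one byte, as the 8051 push instruction does."""
    stack.append(value & 0xFF)


def ret(stack):
    """Model 8051 ret: pop the high byte, then the low byte, into PC."""
    high = stack.pop()
    low = stack.pop()
    return (high << 8) | low


def jump_via_ret(target):
    """Transfer control to `target` without a call ever having been made."""
    stack = []
    push(stack, target & 0xFF)         # low byte goes on first
    push(stack, (target >> 8) & 0xFF)  # high byte ends up on top
    return ret(stack)                  # "returns" to the pushed address


if __name__ == "__main__":
    print(hex(jump_via_ret(0x1234)))
```

No call precedes the ret here, so a scanner that treats every ret as the end of a procedure never sees the outgoing edge to the pushed address.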
The new notion is that, just like the logical frame pointer is made explicit in the Reko IR, we must also reify the linkAddress. Thus, an x86 program would have the following instructions in its entry block:
esp = fp
Mem0[esp] = __linkAddress
and the ret instructions would be altered to:
esp = esp + 4
goto Mem[esp]
The analysis phase will hopefully propagate registers and discover that the last statements become
esp = fp
goto Mem[fp]
and that Mem[fp] contains __linkAddress.
Another way of stating this is to make Reko generate code in continuation passing style (CPS) and in later stages clean up the CPS.
As for switch statements, the one in the 8051 code is following a pattern Reko is failing to detect. I will be working on improving the switch detector during the week.
Here is the relevant switch code:
0082 E0 movx A,@DPTR ; fetch the index
0083 FF mov R7,A
0084 24 FC add A,0xFC ; indices >= 4 will cause a carry.
0086 40 38 jc 00C0 ; If carry, go to the default case
0088 EF mov A,R7 ; Adjust index to offset by doubling it
0089 2F add A,R7
008A 90 00 8E mov DPTR,0x008E ; DPTR is now the jump table
008D 73 jmp @DPTR+A ; indirect jump.
;; Jump table
008E 80 06 sjmp 0096
0090 80 0F sjmp 00A1
0092 80 18 sjmp 00AC
0094 80 21 sjmp 00B7
Reko is failing because of the "clever" hack used to perform bounds checking on the switch. The add A,0xFC will set the C flag if A is >= 4. I need to add code to handle this way of doing bounds checking.
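The bounds check and the jump table can be checked with a short Python sketch. It models only 8-bit addition with carry and the sjmp encoding (opcode 0x80 followed by a signed displacement relative to the next instruction). The helper names are made up for illustration; the table bytes and base address are copied from the listing above.

```python
def add8(a, b):
    """8-bit add: return (result, carry) as the 8051 add instruction would."""
    total = a + b
    return total & 0xFF, total > 0xFF


def index_in_range(index):
    """add A,0xFC carries exactly when the index is >= 4."""
    _, carry = add8(index, 0xFC)
    return not carry


TABLE_BASE = 0x008E
TABLE = bytes([0x80, 0x06, 0x80, 0x0F, 0x80, 0x18, 0x80, 0x21])


def sjmp_target(addr, rel):
    """Target of a 2-byte sjmp located at addr with displacement byte rel."""
    signed = rel - 0x100 if rel & 0x80 else rel
    return (addr + 2 + signed) & 0xFFFF


def case_targets():
    """Decode every 2-byte sjmp slot in the jump table."""
    targets = []
    for offset in range(0, len(TABLE), 2):
        if TABLE[offset] != 0x80:
            raise ValueError("expected an sjmp opcode in the table")
        targets.append(sjmp_target(TABLE_BASE + offset, TABLE[offset + 1]))
    return targets


if __name__ == "__main__":
    print([i for i in range(256) if index_in_range(i)])
    print([hex(t) for t in case_targets()])
```

Doubling the index before the jmp @DPTR+A matches the 2-byte size of each sjmp slot, which is why the table can be walked in steps of two.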
Reko can now handle the code above. You mentioned earlier you were considering preparing a few more samples of switch statements. Those would be greatly appreciated -- I don't have the toolset to make them myself. If you do make such samples, consider separating code banking from switch statements at first, to simplify the work.
Keil has a few paradigms, explained here. An example using the no-return library function ?C?CCASE at address 0x8db is attached; the call is made from 0x88d: ccase_keil_88D.zip
As far as I know, Keil exposes three such library functions in total: ?C?CCASE, ?C?ICASE, ?C?LCASE. Their bodies can easily be shown with radare2:
?C?CCASE:
rasm2.exe -a 8051 -d D083D082F8E4937012740193700DA3A393F8740193F5828883E4737402936860EFA3A3A380DF
?C?ICASE:
rasm2.exe -a 8051 -d D083D082F8E4937012740193700DA3A393F8740193F5828883E473740293B5F0067403936860E9A3A3A3A380D8
?C?LCASE:
rasm2.exe -a 8051 -d D083D082E4937012740193700DA3A393F8740193F5828883E4737402936C70127403936D700C7404936E70067405936F60DDA3A3A3A3A3A380CA
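Reading the ?C?CCASE bytes above as 8051 instructions suggests the following behaviour: the routine pops its own return address into DPTR, so the case table sits directly after the call. It then walks 3-byte entries of (target high, target low, case value) until it reaches a zero target, which is followed by the 2-byte default target. This layout is inferred from the disassembly, not taken from Keil documentation, so treat the Python sketch below as an assumption. The function name and the sample table are made up for illustration.

```python
def parse_ccase_table(code, table_addr):
    """Walk a ?C?CCASE-style table; return ({case_value: target}, default)."""
    cases = {}
    pos = table_addr
    while True:
        target = (code[pos] << 8) | code[pos + 1]
        if target == 0:
            # Terminator: the next two bytes hold the default target.
            default = (code[pos + 2] << 8) | code[pos + 3]
            return cases, default
        # The runtime stops at the first match, so keep the first entry.
        cases.setdefault(code[pos + 2], target)
        pos += 3


if __name__ == "__main__":
    # Hypothetical table: 'A' -> 0x0100, 'B' -> 0x0110, default -> 0x0120.
    sample = bytes([0x01, 0x00, 0x41,
                    0x01, 0x10, 0x42,
                    0x00, 0x00, 0x01, 0x20])
    cases, default = parse_ccase_table(sample, 0)
    print({hex(k): hex(v) for k, v in cases.items()}, hex(default))
```

The ?C?ICASE and ?C?LCASE bodies appear to follow the same loop with wider case values, but that is not modelled here.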
Nice IDA "Switch Idiom" screenshots here: https://reverseengineering.stackexchange.com/questions/20112/how-can-i-make-ida-understand-this-switch-statement-with-a-signed-jump-table?rq=1
Reko has a similar dialog already. It should appear when clicking on a warning that an indirect jump cannot be resolved. I will look at this in a few hours.
@kimstik: Reko's i8051 implementation now understands Keil's "sparse" switch statements. I will close this issue now, as the immediate concern has been addressed. Let me know if you discover further issues.
May we keep it open for bank switch?
Sure. I'll look at bank switching next.
I reduced the SDCC switch test case by removing banking. Reko does not handle the SDCC switch:
;; fn0067: 0067
;; Called from:
;; 0093 (in fn0006)
fn0067 proc
mov A,[0082]
mov R7,A
add A,FC
jc 0088
l006E:
mov A,R7
add A,R7
mov DPTR,0074
jmp @DPTR+A
0074 80 06 80 07 80 08 80 09 02 00 66 02 ..........f.
0080 00 65 02 00 66 02 00 65 .e..f..e
l0088:
ret
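The undecoded bytes at 0074 look like a two-level table: four sjmp slots, each landing on an ljmp (opcode 0x02 followed by a 16-bit big-endian address). The Python sketch below follows those trampolines under that assumption. The helper name is made up; the bytes and the base address are copied from the dump above.

```python
BASE = 0x0074
# Bytes 0x0074..0x0087 from the dump above.
DATA = bytes.fromhex("8006800780088009" "020066" "020065" "020066" "020065")


def resolve(addr):
    """Follow sjmp/ljmp instructions from addr to the final target."""
    offset = addr - BASE
    opcode = DATA[offset]
    if opcode == 0x80:  # sjmp rel8, relative to the next instruction
        rel = DATA[offset + 1]
        rel = rel - 0x100 if rel & 0x80 else rel
        return resolve(addr + 2 + rel)
    if opcode == 0x02:  # ljmp addr16
        return (DATA[offset + 1] << 8) | DATA[offset + 2]
    raise ValueError(f"unexpected opcode {opcode:#04x} at {addr:#06x}")


def sdcc_case_targets(count=4):
    """Resolve each 2-byte slot of the jump table to its final target."""
    return [resolve(BASE + 2 * i) for i in range(count)]


if __name__ == "__main__":
    print([hex(t) for t in sdcc_case_targets()])
```

If this reading is right, the table is the same shape as the earlier sjmp table, with one extra hop through an ljmp for each case.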
Please find the Keil ICASE & LCASE tests attached: icase_keil.zip, lcase_keil.zip
Update: I'm working on getting these three cases to pass. Thanks for providing the samples BTW.
A few public 8051 binaries with bank switching (TI CC2541): https://github.com/RedBearLab/CCLoader/tree/master/Bin
Code banking: Many modern 8051 chips (CC2530, C8051F120, ...) extend the code memory address space by banking (Ref1, Ref2). How can I deal with it?
Switch jump table: Compilers usually emit a jump table for a switch statement. How can I specify the switch paradigm for the 8051?
Simple test code attached.