radareorg / radare2

UNIX-like reverse engineering framework and command-line toolset
https://www.radare.org/
GNU Lesser General Public License v3.0

STM8 architecture support #16498

Closed esclear closed 2 months ago

esclear commented 4 years ago

Is your feature request related to a problem? Please describe.

It would be nice if r2 supported the STM8 architecture for disassembly.

Describe the solution you'd like

Ideally, an STM8 disassembler and the corresponding analysis would be implemented as a plugin.

Describe alternatives you've considered

An alternative is to use naken_asm, but it is missing many analysis features that r2 could provide.

Additional context

The Wikipedia page provides some documentation; more information is of course available in the official STM8 programming manual.

I've seen the radare2 plugin documentation, but it isn't that extensive with regard to the interfaces to radare.

valdaarhun commented 3 years ago

Hi. This seems like a very interesting issue. I think I will have to read up on a lot of stuff to tackle this, but nevertheless, I would like to work on this issue.

trufae commented 3 years ago

Cool! All yours :) But I think it will be better to work on this new arch in extras instead of in core. It will be easier, and it can be moved into core when needed.

PaulWieland commented 2 years ago

Also looking for a good disassembler/visualizer for STM8 hex files. Subbing to this thread...
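
For anyone landing here with the same question: if "hex file" means Intel HEX (an assumption; the format is never confirmed in this thread), a flat image can be produced with a few lines of Python before handing the file to a disassembler. This is only a sketch: `ihex_to_image` is a name made up for this example, and it handles just record types 00 (data), 01 (end of file) and 04 (extended linear address).

```python
def ihex_to_image(text, fill=0xFF):
    """Flatten Intel HEX text into (base_address, image_bytes).

    Sketch only: record types 02, 03 and 05 are ignored.
    """
    memory = {}
    upper = 0  # upper 16 address bits, set by type-04 records
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if not line.startswith(":"):
            raise ValueError("not an Intel HEX record: %r" % line)
        raw = bytes.fromhex(line[1:])
        # All bytes of a record, checksum included, must sum to 0 mod 256.
        if len(raw) < 5 or sum(raw) & 0xFF != 0:
            raise ValueError("bad record or checksum: %r" % line)
        count = raw[0]
        offset = int.from_bytes(raw[1:3], "big")
        rectype = raw[3]
        data = raw[4:4 + count]
        if rectype == 0x00:      # data record
            base = (upper << 16) + offset
            for i, value in enumerate(data):
                memory[base + i] = value
        elif rectype == 0x01:    # end of file
            break
        elif rectype == 0x04:    # extended linear address
            upper = int.from_bytes(data, "big")
    if not memory:
        return 0, b""
    lo, hi = min(memory), max(memory)
    # Gaps between records are padded with `fill` (0xFF is an assumption).
    return lo, bytes(memory.get(addr, fill) for addr in range(lo, hi + 1))


if __name__ == "__main__":
    sample = ":0480000082009A035D\n:00000001FF\n"
    base, image = ihex_to_image(sample)
    print(hex(base), image.hex())
```

The returned base address says where the first byte of the image belongs in the address space, which matters when the image is later mapped for disassembly.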

trufae commented 2 years ago

I don't know which format this is. Can you provide a sample file, documentation, or an implementation to look at?

Right now you can disassemble, analyze and decompile stm8 binaries in r2 using the r2ghidra plugin:

r2 -a r2ghidra -e asm.cpu=stm8 file

PaulWieland commented 2 years ago

@trufae send me an email and I will share a hex file with you

trufae commented 2 years ago

pancake@nopcode.org

valdaarhun commented 1 year ago

Hi. I am sorry for the long period of silence. I think I bit off more than I could chew back then. But now I think I am in a position to tackle this.

Right now you can disassemble, analyze and decompile stm8 binaries in r2 using the r2ghidra plugin:

Given that r2ghidra can be used for stm8 binaries, is a separate plugin for stm8 still required?

trufae commented 1 year ago

SLEIGH is something like 100 times slower than any native r2 plugin at disassembling or analysing anything, and the quality of the results is usually not as good because the translation from SLEIGH to ESIL is poor. Also, stm8 is a third-party plugin, so it's not that well maintained. So yeah, I think it's always better to have everything well maintained in the core and not depend on other stuff unless you have no other options.

valdaarhun commented 1 year ago

I think it's always better to have everything well maintained in the core and not depend on other stuff unless you have no other options

Shall I add support for stm8 in https://github.com/radareorg/radare2/tree/master/libr/arch/p?

trufae commented 1 year ago

Yes. If you want to implement support for stm8, libr/arch is the right place.

brainstorm commented 1 year ago

I don't know which format this is. Can you provide a sample file, documentation, or an implementation to look at?

Right now you can disassemble, analyze and decompile stm8 binaries in r2 using the r2ghidra plugin:

r2 -a r2ghidra -e asm.cpu=stm8 file

I don't think Ghidra has STM8 as a built-in target right now, actually:

% r2 -a r2ghidra -e asm.cpu=stm8 flash.bin
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
SleightInit No sleigh specification for STM8:LE:64:default from STM8:LE:64:default:
 -- radare2 is WYSIWYF - what you see is what you fix
[0x00000000]>

However there are third party modules that you can add to your Ghidra extensions: https://github.com/esaulenka/ghidra_STM8

... and/or write an arch plugin for the stm8 in r2. Here's the datasheet for one member of the family and the actual CPU instructions (opcodes).

I'm also leaving here the firmware I just dumped from the controller and display boards of an exercise treadmill I found in the trash, in case you or other folks need more examples ;)

brainstorm commented 1 year ago

And if you need a text-based working disassembler today to compare while you implement support in r2, have a look at naken_asm:

$ ./naken_util -disasm -stm8 ~/dev/personal/stm8_glitch/flash.bin

naken_util - by Michael Kohn
                Joe Davisson
    Web: http://www.mikekohn.net/
  Email: mike@mikekohn.net

Version: January 29, 2023

Loaded bin /Users/rvalls/dev/personal/stm8_glitch/flash.bin from 0x0000 to 0x7fff
Type help for a list of commands.

Addr    Opcode Instruction                              Cycles
------- ------ ----------------------------------       ------
0x0000:  82 00 9a 03    int $9a03                                cycles=2
0x0004:  82 00 b5 3b    int $b53b                                cycles=2
0x0008:  82 00 b5 3b    int $b53b                                cycles=2
0x000c:  82 00 b5 3b    int $b53b                                cycles=2
0x0010:  82 00 b5 3b    int $b53b                                cycles=2
0x0014:  82 00 b5 3b    int $b53b                                cycles=2
0x0018:  82 00 b5 3b    int $b53b                                cycles=2
0x001c:  82 00 b5 3b    int $b53b                                cycles=2
0x0020:  82 00 b5 3b    int $b53b                                cycles=2
0x0024:  82 00 b5 3b    int $b53b                                cycles=2
0x0028:  82 00 b5 3b    int $b53b                                cycles=2
0x002c:  82 00 b5 3b    int $b53b                                cycles=2
0x0030:  82 00 b5 3b    int $b53b                                cycles=2
0x0034:  82 00 98 1e    int $981e                                cycles=2
0x0038:  82 00 b5 3b    int $b53b                                cycles=2
0x003c:  82 00 b5 3b    int $b53b                                cycles=2
0x0040:  82 00 b5 3b    int $b53b                                cycles=2
0x0044:  82 00 b5 3b    int $b53b                                cycles=2
0x0048:  82 00 b5 3b    int $b53b                                cycles=2
0x004c:  82 00 b5 3b    int $b53b                                cycles=2
0x0050:  82 00 b5 3b    int $b53b                                cycles=2
0x0054:  82 00 b5 3b    int $b53b                                cycles=2
0x0058:  82 00 8e a1    int $8ea1                                cycles=2
0x005c:  82 00 8e e3    int $8ee3                                cycles=2
0x0060:  82 00 b5 3b    int $b53b                                cycles=2
0x0064:  82 00 97 30    int $9730                                cycles=2
0x0068:  82 00 b5 3b    int $b53b                                cycles=2
0x006c:  82 00 b5 3b    int $b53b                                cycles=2
0x0070:  82 00 b5 3b    int $b53b                                cycles=2
0x0074:  82 00 b5 3b    int $b53b                                cycles=2
0x0078:  82 00 b5 3b    int $b53b                                cycles=2
0x007c:  82 00 b5 3b    int $b53b                                cycles=2
0x0080:  10 11          sub A, ($11,SP)                          cycles=1
0x0082:  12 eb          sbc A, ($eb,SP)                          cycles=1
0x0084:  28 b3          jrnv $39  (offset=-77)                   cycles=1-2
0x0086:  ba 78          or A, $78                                cycles=1
0x0088:  da db a8       or A, ($dba8,X)                          cycles=1
0x008b:  fb             add A, (X)                               cycles=1
0x008c:  fa             or A, (X)                                cycles=1
0x008d:  0a 1e          dec ($1e,SP)                             cycles=1
0x008f:  14 1e          and A, ($1e,SP)                          cycles=1
0x0091:  28 1e          jrnv $b1  (offset=30)                    cycles=1-2
0x0093:  14 14          and A, ($14,SP)                          cycles=1
0x0095:  14 0a          and A, ($0a,SP)                          cycles=1
0x0097:  0a 1e          dec ($1e,SP)                             cycles=1
0x0099:  28 28          jrnv $c3  (offset=40)                    cycles=1-2
0x009b:  32 32 32       pop $3232                                cycles=1
0x009e:  3c 3c          inc $3c                                  cycles=1
0x00a0:  0a 0a          dec ($0a,SP)                             cycles=1
0x00a2:  1e 32          ldw X, ($32,SP)                          cycles=2
0x00a4:  3c 32          inc $32                                  cycles=1
0x00a6:  28 32          jrnv $da  (offset=50)                    cycles=1-2
0x00a8:  28 1e          jrnv $c8  (offset=30)                    cycles=1-2
0x00aa:  0a 0a          dec ($0a,SP)                             cycles=1
0x00ac:  14 1e          and A, ($1e,SP)                          cycles=1
0x00ae:  28 32          jrnv $e2  (offset=50)                    cycles=1-2
0x00b0:  32 28 1e       pop $281e                                cycles=1
0x00b3:  1e 0a          ldw X, ($0a,SP)                          cycles=2
0x00b5:  0a 1e          dec ($1e,SP)                             cycles=1
0x00b7:  28 32          jrnv $eb  (offset=50)                    cycles=1-2
0x00b9:  28 1e          jrnv $d9  (offset=30)                    cycles=1-2
0x00bb:  14 0a          and A, ($0a,SP)                          cycles=1
0x00bd:  0a 0a          dec ($0a,SP)                             cycles=1
0x00bf:  0a 14          dec ($14,SP)                             cycles=1
0x00c1:  1e 28          ldw X, ($28,SP)                          cycles=2
0x00c3:  32 3c 32       pop $3c32                                cycles=1
0x00c6:  28 14          jrnv $dc  (offset=20)                    cycles=1-2
0x00c8:  0a 0a          dec ($0a,SP)                             cycles=1
0x00ca:  1e 32          ldw X, ($32,SP)                          cycles=2
0x00cc:  1e 28          ldw X, ($28,SP)                          cycles=2
0x00ce:  28 1e          jrnv $ee  (offset=30)                    cycles=1-2
0x00d0:  1e 14          ldw X, ($14,SP)                          cycles=2
0x00d2:  0a 0a          dec ($0a,SP)                             cycles=1
0x00d4:  14 28          and A, ($28,SP)                          cycles=1
0x00d6:  28 14          jrnv $ec  (offset=20)                    cycles=1-2
0x00d8:  14 28          and A, ($28,SP)                          cycles=1
0x00da:  28 14          jrnv $f0  (offset=20)                    cycles=1-2
0x00dc:  0a 0a          dec ($0a,SP)                             cycles=1
0x00de:  1e 28          ldw X, ($28,SP)                          cycles=2
0x00e0:  32 32 32       pop $3232                                cycles=1
0x00e3:  32 28 1e       pop $281e                                cycles=1
0x00e6:  0a 0a          dec ($0a,SP)                             cycles=1
0x00e8:  1e 14          ldw X, ($14,SP)                          cycles=2
0x00ea:  1e 28          ldw X, ($28,SP)                          cycles=2
0x00ec:  1e 14          ldw X, ($14,SP)                          cycles=2
0x00ee:  14 14          and A, ($14,SP)                          cycles=1
0x00f0:  0a 0a          dec ($0a,SP)                             cycles=1
(...)
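
Reading the first 0x80 bytes of the listing above: each 4-byte entry is the opcode 0x82 followed by a 24-bit big-endian address, which is how naken_util prints them (`82 00 9a 03` as `int $9a03`). A small Python sketch, not part of r2 or naken_asm, can collect the handler addresses. `FLASH_BASE = 0x8000` is an assumption (STM8 program flash normally starts there, while naken_util loaded this dump at 0x0000), and `parse_vectors` / `handler_summary` are hypothetical names.

```python
FLASH_BASE = 0x8000   # assumption: address where the dump belongs in memory
INT_OPCODE = 0x82     # first byte of every vector entry in the listing
VECTOR_COUNT = 32     # 0x80 bytes / 4 bytes per entry


def parse_vectors(image, count=VECTOR_COUNT):
    """Return the 24-bit targets of the leading vector entries.

    Parsing stops at the first entry that does not start with 0x82.
    """
    vectors = []
    for slot in range(count):
        entry = image[4 * slot:4 * slot + 4]
        if len(entry) < 4 or entry[0] != INT_OPCODE:
            break
        vectors.append(int.from_bytes(entry[1:4], "big"))
    return vectors


def handler_summary(vectors, base=FLASH_BASE):
    """Group slots by target: {target: (file_offset, [slot, ...])}."""
    slots_by_target = {}
    for slot, target in enumerate(vectors):
        slots_by_target.setdefault(target, []).append(slot)
    return {
        target: (target - base, slots)
        for target, slots in slots_by_target.items()
    }


if __name__ == "__main__":
    # First three entries copied from the listing above.
    head = bytes.fromhex("82009a03" "8200b53b" "8200b53b")
    for target, (offset, slots) in handler_summary(parse_vectors(head)).items():
        print("handler 0x%06x  file offset 0x%04x  slots %s" % (target, offset, slots))
```

Under the assumed base, slot 0 (0x9a03) would sit at file offset 0x1a03, and the many slots that share 0xb53b look like a common default handler.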

trufae commented 1 year ago

It's normal that r2ghidra doesn't catch the stm8 plugin, because STM8:LE:64:default: is not a valid id:

I just fixed that and pushed (it requires an r2 update to support 24-bit registers).

Porting that stm8 disassembler to r2 shouldn't take more than 15 minutes; I will do that later.

What I found out after those fixes is:

Actually, stm8 is a very simple architecture and it should be easy to add full support in r2. I plan to sync the Ghidra decompiler with the latest from the NSA before r2-5.9, but I don't have enough hands to handle that yet. So I'll ping you when the stm8 support is pushed in r2 (hopefully today).

trufae commented 1 year ago

Also, this code looks more up to date and easier to contribute to and integrate with r2: https://github.com/volbus/gmtdisas

trufae commented 1 year ago

Another source of inspiration: https://github.com/derbroti/Stm8Ida

Any volunteers to extend Capstone with support for STM8? That would probably be the best place to benefit everyone in the RE scene.

trufae commented 1 year ago

Just fixed some bugs in r2ghidra and it's now usable for stm8

Screenshot 2023-05-02 at 17 53 46

brainstorm commented 10 months ago

Cannot repro your screenshot above :/

Would adding this 24 bit ghidra sleigh PR for stm8 help with the 24 bit errors at least?

$ r2 -a r2ghidra -e asm.cpu=stm8 flash.bin 
 -- Add comments using the ';' key in visual mode or the 'CC' command from the radare2 shell
[0x00000000]> aaaa
INFO: Analyze all flags starting with sym. and entry0 (aa)
INFO: Analyze imports (af@@@i)
WARN: set your favourite calling convention in `e anal.cc=?`
INFO: Analyze symbols (af@@@s)
INFO: Recovering variables
INFO: Analyze all functions arguments/locals (afva@@@F)
INFO: Analyze function calls (aac)
INFO: find and analyze function preludes (aap)
INFO: Analyze len bytes of instructions for references (aar)
INFO: Finding and parsing C++ vtables (avrr)
INFO: Analyzing methods
INFO: Finding xrefs in noncode section (e anal.in=io.maps.x)
INFO: Emulate functions to find computed references (aaef)
WARN: Bit size 24 not supported
WARN: No SN reg alias for 'r2ghidra'
WARN: Bit size 24 not supported
(...)
WARN: Bit size 24 not supported
INFO: Recovering local variables (afva)
INFO: Type matching analysis for all functions (aaft)
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
(...)
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
INFO: Propagate noreturn information (aanr)
INFO: Scanning for strings constructed in code (/azs)
INFO: Enable anal.types.constraint for experimental type propagation
[0x00000000]> s 0x2bd
[0x000002bd]> pdg
WARN: Ghidra Decompiler Error: No function at this offset
[0x000002bd]> aaaa
INFO: Analyze all flags starting with sym. and entry0 (aa)
INFO: Analyze imports (af@@@i)
INFO: Analyze symbols (af@@@s)
INFO: Recovering variables
INFO: Analyze all functions arguments/locals (afva@@@F)
INFO: Analyze function calls (aac)
INFO: find and analyze function preludes (aap)
INFO: Analyze len bytes of instructions for references (aar)
INFO: Finding and parsing C++ vtables (avrr)
INFO: Analyzing methods
INFO: Finding xrefs in noncode section (e anal.in=io.maps.x)
INFO: Emulate functions to find computed references (aaef)
WARN: Bit size 24 not supported
WARN: No SN reg alias for 'r2ghidra'
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
(...)
WARN: Bit size 24 not supported
INFO: Recovering local variables (afva)
INFO: Type matching analysis for all functions (aaft)
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
WARN: Bit size 24 not supported
INFO: Propagate noreturn information (aanr)
INFO: Scanning for strings constructed in code (/azs)
INFO: Enable anal.types.constraint for experimental type propagation
[0x000002bd]> pdg
Do you want to print 30577 lines? (y/N)

brainstorm commented 10 months ago

Memory map for the control firmware file.

Screenshot from 2023-12-08 23-20-45

Repro scripts are in https://github.com/brainstorm/threadmill-re/commit/181d19fd708610dfcdb49aa573725fb5b4c8773b. If I defined the above memory map in r2, would the -a r2ghidra pseudo-arch pick it up for better analysis? In other words, are the r2 memory map/region commands picked up by r2ghidra?

It works quite well in Ghidra: after defining the memory map, a ton of functions make a lot more sense, as expected.

trufae commented 10 months ago

Why are you running aaaa? I just did af;pdg. I also just fixed the stupid 24-bit warning message in master, btw.

trufae commented 10 months ago

just recompiled latest r2 and latest r2ghidra and tested the same commands you did and it works well

brainstorm commented 10 months ago

just recompiled latest r2 and latest r2ghidra and tested the same commands you did and it works well

I recompiled both r2 and r2ghidra and I'm getting the following output, so it's not quite what you got in https://github.com/radareorg/radare2/issues/16498#issuecomment-1531707160:

threadmill-re$ ./r2/anal.sh 
ERROR: Parse error @ line 170 (Invalid register type)
ERROR: Parse error @ line 170 (Invalid register type)
WARN: Cannot derive CC from reg profile
WARN: Missing calling conventions for 'r2ghidra' 64. Deriving it from the regprofile
ERROR: Parse error @ line 170 (Invalid register type)
ERROR: Parse error @ line 170 (Invalid register type)
WARN: set your favourite calling convention in `e anal.cc=?`
Do you want to print 30577 lines? (y/N)

I'll investigate a bit about the calling convention for this stm8 code...

EDIT: Adding an arbitrary e anal.cc=ms generates the same WARN/ERROR messages above, so I guess that calling convention setting does not affect/work for r2ghidra?

trufae commented 10 months ago

I don't know what the script is doing, but I see several things going wrong before it reaches the calling convention issue.

brainstorm commented 9 months ago

I don't know what the script is doing (...)

Not much:

#!/bin/sh
r2 -a r2ghidra -n -i r2/anal.r2 control/flash.bin

Then anal.r2 has:

e asm.cpu=stm8
e anal.strings=1
e anal.hasnext=true
e emu.str=true
e anal.cc=ms
s 0x2bd
af;pdg

And here's what you were asking for; indeed, there's no reg profile for stm8:

[0x000002bd]> drp
ERROR: No register profile defined. Try 'dr.'
[0x000002bd]> dr.
[0x000002bd]> 
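
The empty `drp` output means the analysis has no description of the CPU registers. As a rough sketch of the information such a profile has to carry (this is plain Python, not radare2's register profile syntax), the STM8 programming model can be written down as a table of names and widths. The widths follow the STM8 programming manual as far as I know; `STM8_REGISTERS`, `truncate` and `split_index` are names invented for this example. The 24-bit PC is the only width that is not a power of two, which may be what the "Bit size 24 not supported" warnings earlier in the thread refer to.

```python
# Register name -> width in bits (STM8 programming model, to my knowledge).
STM8_REGISTERS = {
    "a": 8,    # accumulator
    "x": 16,   # index register X (XH:XL)
    "y": 16,   # index register Y (YH:YL)
    "sp": 16,  # stack pointer
    "pc": 24,  # program counter
    "cc": 8,   # condition code register
}


def truncate(reg, value):
    """Wrap `value` to the width of register `reg`."""
    bits = STM8_REGISTERS[reg]
    return value & ((1 << bits) - 1)


def split_index(value):
    """Split a 16-bit index register value into (high, low) bytes."""
    value = truncate("x", value)
    return value >> 8, value & 0xFF


if __name__ == "__main__":
    # A 24-bit PC keeps three bytes; anything above bit 23 is dropped.
    print(hex(truncate("pc", 0x01009A03)))
    print(split_index(0xDBA8))
```

Any real profile would also need to say which register plays the PC and SP roles, which this table only implies through the names.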

rpv-tomsk commented 5 months ago

Just fixed some bugs in r2ghidra and it's now usable for stm8

Screenshot 2023-05-02 at 17 53 46

I'm interested in stm8 decompilation and I just took a look at this interesting screenshot.

I see no lines matching the instructions at 2ca and 2d5-2e0. Am I missing something, or is the decompiled code wrong?

trufae commented 5 months ago

You are correct. This decompilation looks nice but it's wrong. r2ghidra is far from perfect, not only because of bugs in Ghidra, but also because the analysis differs between r2 and r2ghidra.

If you want something more reliable but less readable, I would go for r2dec (pdd) or pdc.

I am working on a new decompiler, but it won't be a thing until next year. I don't think r2dec supports stm8, but it should be easy to extend. And pdc is completely arch independent.

trufae commented 5 months ago

@rpv-tomsk https://github.com/radareorg/radare2/pull/22887 native support for stm8 is now ready to be merged

trufae commented 2 months ago

Well, that was merged already, so closing.