pfalcon / ScratchABit

Easily retargetable and hackable interactive disassembler with IDAPython-compatible plugin API
GNU General Public License v3.0
393 stars 47 forks source link

Improved Thumb mode support #27

Open thesourcerer8 opened 6 years ago

thesourcerer8 commented 6 years ago

Hi tried ScratchABit now again with my test files: http://www2.futureware.at/~philipp/ssd/analyse/EXT0CB6Q.dec.P21.frmw which contain both ARM and Thumb code, and it starts with ARM code at 0x0 and has a Thumb function that starts at 0x574. According to a comment on the Kosagi forum, ScratchABit should support both ARM and thumb now, but I couldn't get it working. This is my .def file:

cpu any_capstone # does not work

cpu arm_32_arm_capstone # works, only ARM

cpu arm_32_capstone # works, only ARM

cpu arm_32_thumb_capstone # only THUMB

cpu arm_thumb # only THUMB

show bytes 4 area .bin 0x00(0x10000) rwx load EXT0CB6Q.dec.P21.frmw 0x0 [entrypoints] start = 0x000

Is there a way to define where ARM and where Thumb code is in interactive mode?

pfalcon commented 6 years ago

As a quick response - thanks for giving it a try and providing a reference binary, we can go from there, let me have a look at it when I have a chance.

According to a comment on the Kosagi forum, ScratchABit should support both ARM and thumb now

Yeah, but that feature is less a week old, I actually would like to do a more testing for it before making a formal release, so things may not work too well, and thanks for helping to test it again.

Anyway, how ARM/Thumb interoperation works, is that if ARM code uses a suitable call instruction (blx) with an odd address (least bit is 1), it switches to Thumb mode. That's how ScratchABit detects ARM vs Thumb code either - by watching immediate call addresses during recursive descent. Of course, indirect call destination aren't known, so all code can't be recovered automatically.

Is there a way to define where ARM and where Thumb code is in interactive mode?

Ack, that's one of the feature which is missing yet - defining Thumb code (ARM is defined by default with "Make code (c)" command. One reason it's not implemented yet is concerns of what UI to use for it. I can quickly think of 2 options:

  1. On pressing "c", pop up a dialog. This will quickly get very boring.
  2. Leave "c" for defining code with primary ISA (e.g. ARM), and use "C" (i.e. Shift+c) for defining alternative ISA (e.g. Thumb).

Let me know how these sound and if you have better ideas.

(One thing which bothers me is what to do when we face an arch with 3 or more alternative ISAs. But I guess, it makes sense to stay realistic and follow YAGNI until it hits ;-). (When it'll require noticeable redesign anyway.))

pfalcon commented 6 years ago

Btw, did you try just to see ARM vs Thumb detection in action with ./ScratchABit.py example-arm_32_thumb1.elf ? That will start showing Thumb main() function, but the binary also contains mix of ARM and Thumb2 in startup code.

(Please pull, there was a bug when starting up when initial function was Thumb.)

thesourcerer8 commented 6 years ago

"c" + "C" sounds good to me the way you described it. Another idea I have is to press "C" several times to switch between various alternative ISAs, when there are more alternatives. I do have trace-files from the actual executions of the code on a SoC, and in those trace files I have always collected ARM/Thumb mode for every instruction, so I can feed that information into the disassemblers and decompilers. I already did that for radare2, you can see the ahb 16 and ahb 32 statements there: http://www2.futureware.at/~philipp/ssd/analyse/EXT0CB6Q.dec.r2 ahb 32 @0x0000053c ahb 16 @0x00000574 For IDA, I can generate .idc scripts: http://www2.futureware.at/~philipp/ssd/disasm.html Could I inject those information in a .def file, or with a Python plugin perhaps?

thesourcerer8 commented 6 years ago

And by the way, thanks a lot for all your work on ScratchABit and ScratchABlock!

pfalcon commented 6 years ago

This is my .def file:

So, even trying it as is, I see that few Thumb functions were detected, e.g. fun_0000f9fe . Btw, I found it using "Memory map" feature (Shift+i), scrolling around. Yeah, memory map feature should be improved, at the very least it should show current address, yet better a scrollbar.

pfalcon commented 6 years ago

Another idea I have is to press "C" several times to switch between various alternative ISAs, when there are more alternatives.

Well, that's not how it works. Pressing "c" marke that address as a code and starts analysis from it. Just imagine, that you can load 1MB of code, with start routine quickly ending with indirect jump. Then you found another code address, and - wonder - all the rest of code is direct jumps/call. Then, pressing "c" will lead to recursive descent analysis of this 1MB. There're bunch of side effects of that, like symbols get defines, functions, etc. You can't really "reanalyze" it in different ISA, as all previous artifacts will stay there.

The closest thing to that would be implementing undo feature, so after you mispressed "c", you can undo and press "C". This reminds me that I wanted to implement undo, yeah ;-). In the meantime, how it works is that you'd need to manually "u"ndefine all mis-disassmbled instructions one by one, and press a correct key. Realistically, that shouldn't be many, but undo would be better of course.

pfalcon commented 6 years ago

or with a Python plugin perhaps?

You definitely can write a Python plugin/script for SABit's native API, and run it with --script switch. The only caveat is that this API isn't stable.

Here's example: https://github.com/pfalcon/xtensa-subjects/blob/master/2.0.0-p20160809/start.sh https://github.com/pfalcon/xtensa-subjects/blob/master/2.0.0-p20160809/proj_init.py

I already did that for radare2, you can see the ahb 16 and ahb 32 statements there:

You could easily convert that to suitable calls for SABit, yeah.

For IDA, I can generate .idc scripts:

And that's the winner. That's how it was intended to be. IDAPython API (which includes (well, included IDC) funcs) was supposed to be a "stable" way to deal with, however unbeautiful that API might be. You just use native Python syntax instead of IDC. Again, here's example: https://github.com/pfalcon/xtensa-subjects/blob/master/2.0.0-p20160809/esp8266_sdk_2_0_0_p20160809_map_script.py

Well, going that way, you'll find that a lot of idc funcs are not implemented. While I'm working on "C" command implementation, feel free to beat me on adding few idc.py funcs ;-).

pfalcon commented 6 years ago

And by the way, thanks a lot for all your work on ScratchABit and ScratchABlock!

Thanks. And I appreciate attention and really need help with catching goofs which may preclude other people from using/enjoying it, and to move it forwards. So, issue reports, criticism, patches, spreading the word is appreciated ;-).

thesourcerer8 commented 6 years ago

I real and properly working undo function would be the greatest feature possible, I think. I guess that it's a bit difficult to pull it off in a somewhat memory efficient way due to all the side-effects every action can trigger. Ok, I added a generator for the idc Python scripts: http://www2.futureware.at/%7Ephilipp/ssd/disasm.html now how do I run it? Analysis -> Run plugin crashes when I enter init.py there.

thesourcerer8 commented 6 years ago

From my experience, when you define something in the wrong instruction set, you usually don't get very far, so it seldomly found other functions that way. But yes, a real undo function would be great for that.

pfalcon commented 6 years ago

This is my .def file:

So, even trying it as is,

Oh, this writing between the lines... Of course I didn't try your .def as is, I fixed it up first. So, the CPU plugin which should be used is:

cpu arm_32_capstone

All other ARM CPU plugins should be considered a debugging aids (well, we'll handle these matters in https://github.com/pfalcon/ScratchABit/issues/29).

Another issue is that your file is 128K, while you have in your .def:

area .bin 0x00(0x10000) rwx

That's 64K, i.e. half-file is lost. This also leads to exception on trying to save database (Shift+s) after initial loading.

Summing up, the correct .def to use is:

cpu arm_32_capstone
show bytes 4
area .bin 0x00(0x20000) rwx
load EXT0CB6Q.dec.P21.frmw 0x0

[entrypoints]
start = 0x000

Let's stick with this one for further testing to be on the same line, or if you fixed the issues above and have further updates on your side, please post the latest version.

pfalcon commented 6 years ago

Ok, "C" command added in b4a390e2003142ada44d94c5cbc31c74b80cbd17, let me know how it goes.

pfalcon commented 6 years ago

Ok, I added a generator for the idc Python scripts: http://www2.futureware.at/%7Ephilipp/ssd/disasm.html now how do I run it? Analysis -> Run plugin crashes when I enter init.py there.

The easiest way to run a script on SABit startup is using --script command line switch, as mentioned above. Afterwards, UI open, you can review results, and save it as usual. Or you can couple --script with --save, to save DB after script application, w/o even opening UI.

thesourcerer8 commented 6 years ago

"C" works fine on undefined bytes, but it does not work on ARM code. Also it does not work on errorneous code "UNEXPECTED value", I always have to undefine all that stuff first. It would be great to be able to mark a whole section of code and be able to undefine or "C" or "c" all that code at once, undefine always only works a single value.

pfalcon commented 6 years ago

I always have to undefine all that stuff first

Yes, that's by design, doing it otherwise would lead to (more) mistakes.

undefine always only works a single value.

Also as expected, working otherwise would again likely lead to more troubles.

"UNEXPECTED value"

These aren't expected to be there. I'm working thru stabilizing the workings of your patch, and afterwards, ways to reproduce such cases would need to be reported to be investigated/resolved.

As was discussed earlier, some (arguably, pretty limited) drawbacks of the handling above will be addresses by implementing undo.

pfalcon commented 6 years ago

I have 2 news: good and bad.

Good: with your latest init.py (size: 1244310, md5sum: 157a4c80860932f3d5f1691d46baa654), from-scratch analysis result can be saved and disasm-written without exceptions.

Bad: this latest init.py still has issues, e.g.:

SetRegEx(0x0000b5e8,"T",0,2) #0x0000b5e8->0x0000bcf8 ARM Supervisor 0x0000b5e8  0xfa0001c2      BLX 0x0000bcf8

SetRegEx(0x0000BCF8,"T",0,2)
MakeFunction(0x0000BCF8,4294967295)
MakeNameEx(0x0000BCF8,"fcn_0000bcf8",1)
SetRegEx(0x0000bcfa,"T",1,2)

It's implausible to have an ARM instruction at 0x0000BCF8, and 2 bytes later, at 0x0000bcfa, to have a Thumb instruction.

And as the first line shows, at 0x0000b5e8, the CPU was in ARM state, and executed BLX 0x0000bcf8 instruction. Per our discussion in the other ticket, it means that the code at 0x0000bcf8 is executed in Thumb mode.

thesourcerer8 commented 6 years ago

Ok, now I hopefully implemented the BLX handling correctly. Please try again.

thesourcerer8 commented 6 years ago

As a user I woud like to have a fast possibility to try various decoding options at any given data address. I would like to try decoding it as ARM, as Thumb, as DWord, as String, ... and easily switch between those options. When I try to decode an address with one of those options, and then directly afterwards at the same address with another of those options, ScratchABit should automatically undo the previous option and try the new option instead, and automatically undefine it in between when necessary. If it was longer before, ScratchABit should perhaps ask me, whether I really want to undefine this, or perhaps also whether I want to undefine it until the end of consecutive similarly defined addresses. That's the usability that I would like to have.

pfalcon commented 6 years ago

Ok, now I hopefully implemented the BLX handling correctly. Please try again.

Better, much better. Still some fine details to take care of.

SetRegEx(0x00015C04,"T",0,2) #0x00015a96->0x00015c04 Thumb Supervisor 0x00015a96 0x4788 <-->BLX<--->r1 r1=0x00015C05

As mentioned elsewhere, BLX has 2 forms: direct-addressing (i.e. immediate offset encoded in instruction) and indirect-addressing (address in a reg).

  1. direct-addressing always flips ARM<->Thumb
  2. But indirect addressing does NOT do unconditional flip. Instead, it's governed by LSB of the reg.

On this case, the instruction is BLX r1, the indirect version. Your trace shows that r1 is 0x00015C05. Consequently, the instruction at 0x00015C04 is a Thumb one.

There're definitely more cases like that - I'm diffing my grep -v filtered version against your latest one, and while there're mostly improvements, there're also some regressions.

thesourcerer8 commented 6 years ago

Ok, perhaps I got it right now, feel free to try again. And thanks a lot of the explanations!

pfalcon commented 6 years ago

and easily switch between those options

You seem to miss that "easy switch" also means "easily switch by mistake". Let me again point that the act of disassembly is a complex act with side effects. Consider that you pressed "c" and 1MB of code got disassembled, 1000 labels was created, etc. 1MB is helluva of code, so next month(s) you spent renaming those labels to make sense, etc. Then you by mistake pressed "C" there. What happens? All the previous labels don't make sense for new code, so either one month of your work goes to /dev/null if tool automatically deletes them, or you're left with irrelevant labels pointing into the middle of new instructions.

2nd point is establishing baseline for what's "easy" and what's "hard". "Easily do disassembly work" is the reason I embarked on writing ScratchABit. Before that, the choice I had is to typing "TLA 0xabcdef12" and pressing Enter in Radare (and you've got to know all those TLAs!). I thought that life's too short for that, and decided that there should be direct-manipulation alternative (additionally written in interpreted VHLL). But that doesn't mean that I'm, exaggerating, writing a tool where a single (mis)press may get your HDD formatted. You must accept that while many commands are a single press with immediately visible result, some require 2 keypresses, and some even legwork of user cursor keys multiple times (or multiple "u" presses).

All that however explains why ScratchABit UX was designed like that, and likely will stay that way. But the reason I wanted it to be written in "interpreted VHLL" was to allow easily hacking on it by as many users as possible and trying wildest ideas. Feel free to hack your way of working in, feel free to try it on different projects, feel free to share results and patches. I can't promise that the core operations will be changes (but happy to hear experience and learn from it), but the aim is to allow users to implement any functionality to plugins, redefine any keys, etc.

pfalcon commented 6 years ago

Ok, perhaps I got it right now, feel free to try again.

Getting there. Reality just gives examples of what I wrote above ;-) :

MakeFunction(0x0000BC72,4294967295)
MakeNameEx(0x0000BC72,"fcn_0000bc72",1)
SetRegEx(0x0000bc70,"T",0,2) #0x0000bc70->0x0000bc74 ARM Supervisor 0x0000bc70  0xe59f2064      LDR r2, [r15, #0x64] r2:0x00000001=>0x7F

SetRegEx(0x0000bc74,"T",0,2) #0x0000bc74->0x0000bc78 ARM Supervisor 0x0000bc74  0xe3510000      CMP r1, #0x0 r1=0x000000FF

As you can see, you make a label/function at 0x0000BC72 which points inside an ARM instruction at 0x0000BC70, ScratchABit doesn't like that:

@@ -40105,8 +39934,11 @@
 0000bc70              ; xref: c 0x9e1e (fcn_00009df0+0x2e)
 0000bc70 fun_0000bc70:
 0000bc70     ldr      r2, [pc, #0x64]
-0000bc70 ; End of function 'fcn_0000bc72' (as detected)
-0000bc74     cmp      r1, 0x0
+0000bc74 ; UNEXPECTED value: 00 flags: 03
+0000bc75 ; UNEXPECTED value: 00 flags: 02
+0000bc75 ; End of function 'fcn_0000bc72' (as detected)
+0000bc76 ; UNEXPECTED value: 51 flags: 02
+0000bc77 ; UNEXPECTED value: e3 flags: 02
 0000bc78     movne    r3, 0x0
 0000bc7c     add      r0, r0, r2
 0000bc80     ldr      r2, [pc, #0x34]
SetRegEx(0x0000FFE6,"T",0,2) #0x0000387a->0x0000ffe4 Thumb Supervisor 0x0000387a  0xf00cebb4    BLX     0x0000ffe6

MakeFunction(0x0000FFE6,4294967295)
MakeNameEx(0x0000FFE6,"fcn_0000ffe6",1)

Can't have ARM instruction at address not divisible by 4.

thesourcerer8 commented 6 years ago

But this would be a case of "direct-addressing always flips ARM<->Thumb". It is jumping away from Thumb mode, uses BLX to a direct address, ... but since it's an odd address, it must be Thumb code. Why is it using BLX instead of BL here? Whatever, I added another filter to change undivisible addresses to Thumb mode, even if they are direct addressing.

pfalcon commented 6 years ago

It is jumping away from Thumb mode, uses BLX to a direct address, ... but since it's an odd address, it must be Thumb code.

I assume you means 2nd case, with 0x0000FFE6. So, at 0x387a, there's a Thumb instruction, which Capstone decodes as "blx fcn_0000ffe4". So, instruction at 0xffe4 must be an ARM instruction.

Neither 0xffe6 nor 0xffe4 is odd address (0xffe5 would be). Actually, as I mentioned in the other ticket, this version of instruction doesn't even store lowest bits of offset: DDI 0406C.c, page 348:

I1 = NOT(J1 EOR S); I2 = NOT(J2 EOR S); imm32 = SignExtend(S:I1:I2:imm10H:imm10L:’00’, 32);

I.e. immediate offset value stored in the instruction is just extended with 2 zero bits on the left (aka shifted by 2 left/multiplied by 4).

"Odd addresses" come into play only if you deal with indirect instructions, with address in a register.

thesourcerer8 commented 6 years ago

It turned out that the root cause for these problems is a bug in the disassembler of OpenOCD, so unfortunately the disassembled instructions in my logfiles are wrong: https://sourceforge.net/p/openocd/tickets/175/ I am trying to filter those wrong instructions out now, please try again.

thesourcerer8 commented 6 years ago

And a patch for OpenOCD is also on the way: http://openocd.zylin.com/#/c/4382/1

thesourcerer8 commented 6 years ago

Hmm, actually I could fix up the addresses myself too. Ok, please try again.

pfalcon commented 6 years ago

Ok, please try again.

Great, I don't see any artifacts caused by init.py. We can take the disassembly produced as a baseline, and now start to look how it can be improved on ScratchABit side. I can see at least a couple of issues, will dig into them as time permits. @thesourcerer8, Feel free to report any issues/possible improvements too, perhaps as a separate tickets, as this is a closed one anyway.

pfalcon commented 6 years ago

this is a closed one anyway.

D'oh, I mixed up with something else ;-).

pfalcon commented 6 years ago

Anyway, continuing here re: issues with init.py, this time, more "serious", like possible problems with your tracing (or maybe openocd issues again):

SetRegEx(0x000146ba,"T",1,2) #0x000146ba->0x0001e5e0 Thumb Supervisor 0x000146ba  0x4788        BLX     r1 r1=0x0001E5E1.

SetRegEx(0x0001E5E0,"T",1,2) #0x000146ba->0x0001e5e0 Thumb Supervisor 0x000146ba  0x4788        BLX     r1 r1=0x0001E5E1.

Treating stuff at 0x0001E5E0 as Thumb instruction doesn't make much sense though, following rendition is much better:

0001e5d4     ldr      pc, [pc, #-4]
0001e5d8     dd      0x8000a4c1
0001e5dc     ldr      pc, [pc, #-4]
0001e5e0 fcn_0001E5E1:
0001e5e0     dd      0x8000a395
0001e5e4     ldr      pc, [pc, #-4]
0001e5e8     dd      0x8000a7bd

As you can see, that part of code are thunks to perform long jumps to the part of code at 0x80000000.

pfalcon commented 6 years ago

Another issue was that there was a bunch of UNEXPECTED flags=0x20, i.e. Thumb instructions marked, but not traced. Well, there's little choice apparently to make such instructions "entrypoints", like you did in your original patch. I feel it a little hacky, but as long as it's conditional and there's a way not to do it, I guess it's fine for now: a12404fa43bb982296372f17a02489616d647d96 .

Witt that applied, the only (relevant) UNEXPECTED left is which is related to 0x0001e5e0 address above.