NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

Feature needed to label memory mapped MCU registers #145

Closed rogerclarkmelbourne closed 5 years ago

rogerclarkmelbourne commented 5 years ago

When analyzing an embedded application binary for a microcontroller (MCU), Ghidra does not seem to have a way to label the hardware registers.

For example, on the NXP MK22 most of the hardware registers are mapped to addresses 0x40000000 to 0x40090000, in groups, so not the entire range is registers.

But the application code runs from address 0x00000000, and the addresses of the hardware registers are far outside the application binary (program) memory, and hence do not seem to be handled by Ghidra.

The best solution would probably be a method to assign meaningful names to these unhandled memory addresses.

One workaround, which I have not tried yet, would be to make a binhex file with multiple memory segments in it, so that the hardware addresses appear as program memory. The analyzer may then realize that these addresses are referenced from the application, and hence are probably data rather than program instructions.

But since most MCUs have multiple blocks of registers, making the binhex file would take some time.

xyzz commented 5 years ago

You can add a new memory block manually after the file is loaded from Tools->Memory Map. You can mark it as volatile since it's hardware registers, and then retype and label the registers at these addresses.

emteere commented 5 years ago

You can also add the definitions in the .pspec file for the processor if it is a general standard for the processor. If it is a common variant, you can create an additional variant for the processor using the .ldefs and .pspec files.

Another possibility if there are many variants, say for the MSP430, is to use the header files from the development environment to parse definitions for the variants. Many times the special registers will be #defines. When the CParser parses files, all defines that are values will be turned into Enums. From within the data type manager, you can apply an enum as a label on an address that represents the value of the enum.
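To get a feel for which header defines count as "values", here is a minimal Python sketch that collects simple numeric `#define`s from header text. It is not Ghidra's CParser and does not reproduce its behaviour. The function name `numeric_defines` and the sample header are hypothetical, and macros with arguments, strings, casts or expressions are simply skipped.

```python
import re

# Matches "#define NAME 0x1234", "#define NAME (1234u)" and similar simple forms.
# Octal literals, casts, expressions and function-like macros are deliberately not matched.
DEFINE_RE = re.compile(
    r'^\s*#\s*define\s+(\w+)\s+\(?\s*(0[xX][0-9a-fA-F]+|0|[1-9]\d*)[uUlL]*\s*\)?\s*(?://.*|/\*.*)?$'
)

# Illustrative header text only; not copied from any vendor SDK.
SAMPLE_HEADER = """
#define SIM_SCGC5 0x40048038u
#define PORTA_PCR0 (0x40049000) /* pin control */
#define VERSION "1.0"
#define MAX(a,b) ((a)>(b)?(a):(b))
"""

def numeric_defines(header_text):
    """Return {name: value} for every define whose body is a plain number."""
    result = {}
    for line in header_text.splitlines():
        match = DEFINE_RE.match(line)
        if match:
            # Base 0 lets int() accept both decimal and 0x-prefixed hex.
            result[match.group(1)] = int(match.group(2), 0)
    return result

if __name__ == "__main__":
    for name, value in numeric_defines(SAMPLE_HEADER).items():
        print("%s = 0x%08X" % (name, value))
```

Real SDK headers often define registers through structs or pointer casts rather than bare numbers, so running the headers through the C preprocessor first may still be necessary.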

rogerclarkmelbourne commented 5 years ago

@xyzz

Which Tools menu do you mean?

There isn't a Memory Map option on any of them.

Which window or screen do you mean? Does anything specific have to be selected for this option to appear?

@emteere When I imported the binary application file, there was only the option to select "Language". Is that what you mean?

If so, I think they should change the label for this, because it's not really the language, it's actually the CPU/MCU type.

xyzz commented 5 years ago

Sorry, I meant Window->Memory Map in CodeBrowser.

rogerclarkmelbourne commented 5 years ago

@xyzz Thanks. I see that now, and I changed the memory permissions so that it's not writable.

BTW. There seems to be a bug with the memory map names, because if I change the name of the existing map, which was auto created when I imported the application binary, it does not get updated in the Program Tree window.

There is also another bug, when I select Edit Label, the existing auto generated label is not shown.

I guess I should raise those 2 as separate issues (bugs).

Also, is manually updating these labels the only way to do it using this technique?

I have created a CSV with register addresses and labels, by exporting and processing the reference manual PDF for the processor.

I hoped to be able to run a search and replace on one of the data files using this CSV.

rogerclarkmelbourne commented 5 years ago

@emteere

BTW. I will also investigate your solution. Thanks...

rogerclarkmelbourne commented 5 years ago

@emteere

I will have a go at making a new processor variant PSPEC file etc.

I have the SDK, but I'd need to somehow run it through the C preprocessor in order to create a meaningful list of addresses, because like all SDKs it's a hierarchy.

Looking through the pspec files in different folders, the most useful reference seems to be the AVR8, specifically the PSPEC file, which defines a lot of symbols, e.g.

    <symbol name="PINF" address="mem:0x20"/>
    <symbol name="PINE" address="mem:0x21"/>
    <symbol name="DDRE" address="mem:0x22"/>
    <symbol name="PORTE" address="mem:0x23"/>

I could generate the same thing for the MK22 using the CSV file I already created from the reference manual.

Then make a new project and try importing using my PSPEC file, and see what happens ;-)
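A minimal Python sketch of that CSV-to-PSPEC step follows. The CSV layout (hex address in the first column, register name in the second) and the function name `csv_to_symbols` are assumptions for illustration. The `<symbol name=... address=.../>` form is copied from the AVR8 lines quoted above, and the address space name defaults to "ram" here only as a guess for ARM.

```python
import csv
import io
from xml.sax.saxutils import quoteattr

def csv_to_symbols(csv_text, space="ram"):
    """Turn 'address,name' CSV rows into pspec <symbol> lines.

    Assumes column 0 is a hex address (with or without a 0x prefix) and
    column 1 is the register name; blank or short rows are skipped.
    """
    lines = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2 or not row[0].strip():
            continue
        address = int(row[0].strip(), 16)
        # Spaces are replaced so the name stays a single token.
        name = row[1].strip().replace(" ", "_")
        lines.append('<symbol name=%s address=%s/>' % (
            quoteattr(name),
            quoteattr("%s:0x%08X" % (space, address)),
        ))
    return "\n".join(lines)

if __name__ == "__main__":
    sample = "0x40009000,DMA_TCD0_SADDR\n0x40009004,DMA_TCD0_SOFF\n"
    print(csv_to_symbols(sample))
```

The printed lines can then be pasted into the section of the PSPEC file that holds the other `<symbol>` entries.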

rogerclarkmelbourne commented 5 years ago

BTW. This issue looks very similar, but is for external shared libs

https://github.com/NationalSecurityAgency/ghidra/issues/148

rogerclarkmelbourne commented 5 years ago

I ended up building a new PSPEC file as suggested by @emteere, and also updated the ARM.ldefs file to reference the new file.

It's probably a bit of a hack, as I had to put the registers into the RAM section because I could not get my new "regs" section to import :-(

But at least the code is now annotated with the register names.

saruman9 commented 5 years ago

@rogerclarkmelbourne could you share the PSPEC and the updated ARM.ldefs? Have you tried using CParser?

rogerclarkmelbourne commented 5 years ago

I can share it, but it's only for the NXP K22 processor.

The register addresses would not be correct for any other ARM processor.

saruman9 commented 5 years ago

It will be useful for others who reverse NXP K22 firmware, but you can create a pull request later, when Ghidra is open sourced.

saruman9 commented 5 years ago

> There seems to be a bug with the memory map names, because if I change the name of the existing map, which was auto created when I imported the application binary, it does not get updated in the Program Tree window

I don't think that is a bug. Names in the Program Tree should be independent of the Memory Map names, IMHO.

> There is also another bug, when I select Edit Label, the existing auto generated label is not shown.

Could you explain what you mean, please? The existing auto generated label is shown for me.

[Screenshot: 2019-03-11_115418_700170462]

rogerclarkmelbourne commented 5 years ago

OK. I will do a PR when it's open sourced.

The PSPEC file works fine, but I think it could be improved: I could perhaps have made a separate memory segment for the registers and marked it as volatile. However, when I tried to do that I could not get it to work, so I changed it back to using the "ram:" segment.

Also, for most embedded ARM Cortex MCUs, like the NXP K22 or the STM32 (or probably most microcontrollers), the application is normally stored in read-only "Flash" memory, but the default ARM Cortex PSPEC file presumes that the program is running from RAM.
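For anyone retrying the separate-segment idea, this is a hedged Python sketch that prints the two PSPEC fragments that might be involved. The element and attribute names (`<memory_block name start_address length initialized>` inside `<default_memory_blocks>`, and `<range space first last>` inside `<volatile>`) are an assumption based on what the AVR8 PSPEC appears to use, and this has not been verified against the MK22 setup discussed here. The block name "peripherals", the helper names and the end address 0x4007FFFF are illustrative.

```python
from xml.sax.saxutils import quoteattr

def memory_block_xml(name, start, end, space="ram"):
    """Build a <memory_block> line covering start..end inclusive.

    Assumed to belong inside a <default_memory_blocks> element.
    """
    length = end - start + 1
    return ('<memory_block name=%s start_address="%s:0x%08X" '
            'length="0x%X" initialized="false"/>'
            % (quoteattr(name), space, start, length))

def volatile_range_xml(start, end, space="ram"):
    """Build a <range> line, assumed to belong inside a <volatile> element."""
    return '<range space="%s" first="0x%08X" last="0x%08X"/>' % (space, start, end)

if __name__ == "__main__":
    # Illustrative peripheral window; take the real limits from the reference manual.
    print(memory_block_xml("peripherals", 0x40000000, 0x4007FFFF))
    print(volatile_range_xml(0x40000000, 0x4007FFFF))
```

Generating the ranges from one place keeps the block length and the volatile range consistent with each other, which is easy to get wrong when typing hex by hand.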

rogerclarkmelbourne commented 5 years ago

In the disassembly, if I right click on a label to a memory address, i.e. a variable, the edit box does not get populated with the current default name.

[Screenshot]

In the screengrab, I had not assigned a memory block for the variables, but I've now done that, and it doesn't make any difference.

If I edit the label, either where it's referenced or after double clicking to go to the memory address in RAM, the edit dialog does not get prefilled with the existing name.

BTW. I'm running Windows 7 Pro.

rogerclarkmelbourne commented 5 years ago

BTW.

@saruman9

Can you tell me where I can find documentation on the PSPEC file format?

Currently I've made the MK22 PSPEC file by looking at various other processors, and although the hardware register labels are displaying OK, the RAM memory block is not getting picked up from the PSPEC file.

I think perhaps I have a duplicate name.


    <context_set space="ram" first="0x0000" last="0x7FFFF">
      <set name="TMode" val="1" description="0 for ARM 32-bit, 1 for THUMB 16-bit"/>
      <set name="LRset" val="0" description="0 lr reg not set, 1 for LR set, affects BX as a call"/>
    </context_set>
    <tracked_set space="ram" first="0x1fff0000" last="0x2000ffff">
      <set name="spsr" val="0"/>
    </tracked_set>
    <tracked_set space="ram" first="0x4000000" last="0x4007fffe">
      <set name="spsr" val="0"/>
    </tracked_set>  

  </context_data>

I assumed that "ram" is a predefined name, because things didn't work when I tried changing it to "FlashMemory". But I have seen other processors' PSPEC files where the structure was different from the ARM Cortex one, and "RAM" was defined instead of "ram".

Anyway, the bulk of the work was generating the data for the 1300+ hardware registers, e.g.

    <symbol address="ram:0x40009000" name="MK22_TCD_Source_Address_(DMA_TCD0_SADDR)"/>
    <symbol address="ram:0x40009004" name="MK22_TCD_Signed_Source_Address_Offset_(DMA_TCD0_SOFF)"/>
    <symbol address="ram:0x40009006" name="MK22_TCD_Transfer_Attributes_(DMA_TCD0_ATTR)"/>
    <symbol address="ram:0x40009008" name="MK22_TCD_Minor_Byte_Count_(Minor_Loop_Mapping_Disabled)_(DMA_TCD0_NBYTES_MLNO)"/>
    <symbol address="ram:0x40009008" name="MK22_TCD_Signed_Minor_Loop_Offset_(Minor_Loop_Mapping_Enabled_and_Offset_Disabled)_(DMA_TCD0_NBYTES_MLOFFNO)"/>
    <symbol address="ram:0x40009008" name="MK22_TCD_Signed_Minor_Loop_Offset_(Minor_Loop_Mapping_and_Offset_Enabled)_(DMA_TCD0_NBYTES_MLOFFYES)"/>
    <symbol address="ram:0x4000900C" name="MK22_TCD_Last_Source_Address_Adjustment_(DMA_TCD0_SLAST)"/>
    <symbol address="ram:0x40009010" name="MK22_TCD_Destination_Address_(DMA_TCD0_DADDR)"/>
    <symbol address="ram:0x40009014" name="MK22_TCD_Signed_Destination_Address_Offset_(DMA_TCD0_DOFF)"/>

saruman9 commented 5 years ago

Unfortunately, I don't know where we can find detailed documentation about PSPEC file format. I read about Sleigh here:

rogerclarkmelbourne commented 5 years ago

OK. I did take a quick look at the docs folder but I didn't see anything obvious, so I'll grep the docs folders and see if I can find anything.

goosenphil commented 5 years ago

@ryanmkurtz Why was this issue closed?

rogerclarkmelbourne commented 5 years ago

The solution is to create a custom PSPEC file and to add a reference to it into the .ldefs file.

e.g. here is my NXP_K22.pspec and my amended ARM .ldefs file, where I appended the reference to my new pspec file.

Note: I don't know if what I've done is 100% correct, but it definitely works.

NXP_Mk22.pspec.zip
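For readers without the attachment, the .ldefs change can be sketched in Python as "copy an existing language entry and point it at the new pspec". This is not the attached file. The attribute names (`variant`, `processorspec`, `id`) and the sample entry are an assumption about how ARM.ldefs is laid out, and the variant name "MK22" and file name "NXP_MK22.pspec" are placeholders; check everything against the ARM.ldefs in your own install.

```python
import copy
import xml.etree.ElementTree as ET

# Cut-down stand-in for ARM.ldefs; the real file has more entries and attributes.
SAMPLE_LDEFS = """<language_definitions>
  <language processor="ARM" endian="little" size="32" variant="Cortex"
            version="1.0" slafile="ARM7_le.sla" processorspec="ARMCortex.pspec"
            id="ARM:LE:32:Cortex">
    <description>ARM Cortex / Thumb little endian</description>
    <compiler name="default" spec="ARM.cspec" id="default"/>
  </language>
</language_definitions>"""

def add_variant(ldefs_xml, base_id, new_variant, new_pspec, description):
    """Append a copy of the language entry base_id that uses new_pspec."""
    root = ET.fromstring(ldefs_xml)
    base = next(lang for lang in root.findall("language")
                if lang.get("id") == base_id)
    entry = copy.deepcopy(base)
    entry.set("variant", new_variant)
    entry.set("processorspec", new_pspec)
    # Keep the processor:endian:size prefix and swap only the variant part.
    entry.set("id", base_id.rsplit(":", 1)[0] + ":" + new_variant)
    entry.find("description").text = description
    root.append(entry)
    return ET.tostring(root, encoding="unicode")

if __name__ == "__main__":
    print(add_variant(SAMPLE_LDEFS, "ARM:LE:32:Cortex", "MK22",
                      "NXP_MK22.pspec",
                      "ARM Cortex with NXP MK22 register labels"))
```

Copying the stock Cortex entry rather than editing it keeps the original language selectable at import time.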

fanoush commented 5 years ago

@rogerclarkmelbourne do you also have an nrf52 pspec by any chance? I will look into the F07 smart bracelet, which has an SDK11 based bootloader, so a pspec would be handy. If not, then I guess I should roll up my sleeves :-) BTW, things have moved on a bit since you looked into fitness trackers a few years ago; now there are a couple of verified nrf52 models with unsigned firmware that can be updated without taking them apart. The F07 is another one.

rogerclarkmelbourne commented 5 years ago

@fanoush

No. I only made one for the NXP K22 because I needed to analyse some M22 firmware.

The process I used was to output the whole K22 data sheet as plain text, then recursively use the search results in Notepad++ to gradually refine a text file that just had the information on the registers, before finally using several macros I recorded in Notepad++ to get the data into the format that Ghidra needed.

(When I say recursive I mean cut and paste the search results panel into another empty doc, and then search again to filter more and more each time)

fanoush commented 5 years ago

@rogerclarkmelbourne thanks, that's quite an impressive usage of Notepad++ :-) Never mind. BTW, Nordic has an nrf52.svd file as part of the SDK, which is CMSIS-SVD XML, so hopefully that one contains the same stuff and can be used for doing something similar via Python or PowerShell. Also, Google gives some SVD parsers on GitHub.
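A rough Python sketch of the SVD route, assuming the usual CMSIS-SVD layout of `<peripheral>` elements with a `<baseAddress>` and nested `<register>` elements with an `<addressOffset>`. The sample XML is a hand-written stand-in, not taken from nrf52.svd, and the function name `svd_registers` is hypothetical. Peripherals declared with `derivedFrom`, register arrays (`dim`) and clusters are not handled.

```python
import xml.etree.ElementTree as ET

# Hand-written stand-in for a vendor SVD file; values are illustrative.
SAMPLE_SVD = """<device>
  <name>EXAMPLE</name>
  <peripherals>
    <peripheral>
      <name>GPIO</name>
      <baseAddress>0x50000000</baseAddress>
      <registers>
        <register><name>OUT</name><addressOffset>0x504</addressOffset></register>
        <register><name>IN</name><addressOffset>0x510</addressOffset></register>
      </registers>
    </peripheral>
  </peripherals>
</device>"""

def svd_registers(svd_xml):
    """Return [(absolute_address, 'PERIPHERAL_REGISTER'), ...] from SVD text."""
    root = ET.fromstring(svd_xml)
    found = []
    for periph in root.iter("peripheral"):
        base_text = periph.findtext("baseAddress")
        if base_text is None:
            continue
        base = int(base_text, 0)
        periph_name = periph.findtext("name")
        # Only registers listed directly under <registers> are collected.
        for reg in periph.findall("./registers/register"):
            offset = int(reg.findtext("addressOffset"), 0)
            found.append((base + offset,
                          "%s_%s" % (periph_name, reg.findtext("name"))))
    return found

if __name__ == "__main__":
    for address, name in svd_registers(SAMPLE_SVD):
        print('<symbol name="%s" address="ram:0x%08X"/>' % (name, address))
```

The printed lines use the same `<symbol>` form as the PSPEC entries earlier in the thread, so the output could be pasted into a custom PSPEC file in the same way.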

rogerclarkmelbourne commented 5 years ago

@fanoush

No worries.

I seem to spend a lot of time processing stuff in NotePad++ ;-)