NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

Need help writing script 9s12 #1570

Closed badassloumd closed 2 years ago

badassloumd commented 4 years ago

OK, I'm getting to know Ghidra (old IDA user). I have an issue with the CPU I am working with (9s12), where a lot of the references to RAM addresses were disassembled pointing to the wrong region of memory. The CPU has a paged window from 0x8000 -> 0xB000 that code can run in, and that code maps to physical addresses anywhere from FF:8000 -> FF:B000 down to E1:8000 -> E1:B000. The mapped-in page is set by the PPAGE register, but all RAM accesses (regardless of PPAGE) are always at 0x2000 -> 0x4000. Ghidra made the assumption (probably something I did wrong) that all RAM accesses have the page prepended to the address. So, for example, if a branch is made to FE:8000 and at that location there is a LDAA 0x2000, it creates that load reference as FE2000, which is wrong.

for example:

784470 7d 29 86 STY offset DAT_0fe986
should really be: 784470 7d 29 86 STY DAT_002986

I can do this manually by hitting the R key and replacing the FE with the value 2, but I really want to write a script for that.

I want to write a script to examine all of the references and fix them up. Can someone get me started in the right direction?
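For a fix-up script like the one described above, the first thing needed is the real 16-bit target of each instruction. The following is only a sketch in plain Python, not Ghidra API code. It assumes the layout visible in the STY listing: one opcode byte followed by a big-endian 16-bit address. The name `extended_operand` is hypothetical.

```python
def extended_operand(insn_bytes):
    """Return the 16-bit address encoded in a 3-byte extended-mode instruction.

    Assumption: one opcode byte followed by a big-endian 16-bit address,
    as in the "7d 29 86" STY example from the listing.
    """
    if len(insn_bytes) != 3:
        raise ValueError("expected opcode + 2 address bytes")
    return (insn_bytes[1] << 8) | insn_bytes[2]


# Bytes from the STY example: 7d 29 86
raw = bytearray([0x7D, 0x29, 0x86])
print("DAT_%06x" % extended_operand(raw))  # prints DAT_002986
```

Inside Ghidra the bytes would come from the instruction itself; how the reference is then replaced is left open here.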

ghost commented 4 years ago

Walk through the Ghidra classes and API sometime when you can; it helps: https://ghidra.re/courses/GhidraClass/ https://ghidra.re/ghidra_docs/api/index.html

Iterating "references to" an address and renaming:

```python
# Ghidra Python (Jython) script; target_address is a placeholder Address
for r in getReferencesTo(target_address):
    # check reference types in your case
    if r.getReferenceType().isData():
        # creating a "new" label at an address
        createLabel(r.getFromAddress(), "DAT_xxxxx", True)
```

Commenting at an address (with `listing = currentProgram.getListing()`): `codeUnit = listing.getCodeUnitAt(target_address)`, then `codeUnit.setComment(codeUnit.PRE_COMMENT, comment)`.

The code above is not really "accurate"; you can look up locations and imports via the Ghidra API.

My point is more that the scripting API is much like IDA's. You should have no problem getting used to it.

One way to do what you are asking, if you feel like it, is to regex out the DAT_XX prefix and keep the remaining "correct" part of the address, then create a new label that combines the right base with that remaining part.
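The regex idea can be sketched in plain Python as below. This is only an illustration under assumptions taken from the single example in the question (`DAT_0fe986` should become `DAT_002986`): labels have six hex digits, the first three are the wrong base, and the last three are correct. The names `LABEL_RE` and `rebase_label` are hypothetical, and the new base is left to the caller because one example cannot show whether `002` is right for every reference.

```python
import re

# Assumption: "DAT_" + 3 hex digits of (wrong) base + 3 hex digits to keep.
LABEL_RE = re.compile(r"^(DAT_)([0-9a-fA-F]{3})([0-9a-fA-F]{3})$")


def rebase_label(label, new_base="002"):
    """Swap the base digits of a DAT_ label, keeping the trailing digits.

    Labels that do not match the assumed pattern are returned unchanged.
    """
    m = LABEL_RE.match(label)
    if m is None:
        return label
    prefix, _old_base, tail = m.groups()
    return prefix + new_base + tail


print(rebase_label("DAT_0fe986"))  # prints DAT_002986
```

The resulting name could then be passed to `createLabel` in a Ghidra script. Note that renaming a label alone does not change where the reference points.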

GhidorahRex commented 4 years ago

Looking at the instruction that you sent, Ghidra certainly THINKS it should be reading address 0xfe986 based on the value of the RPAGE register, which is specific to the HCS12X rather than the HCS12. There are additional issues (such as the use of GPAGE throughout) that are causing problems as well.

emteere commented 4 years ago

As @GhidorahRex mentioned, there should be both an HCS12 and an HCS12X variant, to support the PPAGE-only mapping of the HCS12. The page at 0x8000-0xbfff would most likely need to be modeled as an overlay, or it could be mapped into higher fake memory. I'm not sure which is easier.

I thought I had a hack for you: set GPAGE=0 and then set UseGPAGE=1. However, the SLEIGH sets UseGPAGE during decoding.

The HCS12 also might use INITRM, INITRG, or INITEE, which remap the RAM, REG, and EEPROM. These are similar to RPAGE and EPAGE, but not quite as complicated. It might be that most code doesn't use these and they don't need to be modeled, or they could be initialized to zero and would then be constant.

We'll look into simplifying and correcting the memory accesses. For the HCS12 they should be much simpler and 16-bit, although they could still remap to registers.

RyanHope commented 11 months ago

Is there a solution to this? I am having the same issue on an S912X MCU. I have all these jumps to locations 0x39xxxx, and setting the PPAGE register does nothing to change this.

          7e1007 06 92 3b        JMP        offset LAB_3f923b

GhidorahRex commented 8 months ago

@RyanHope: I found the issue here. The address calculation for the PPAGE address offset is missing the fixed bit 22. It should have been included.

The fix is to modify line 339 of HCS_HC12.sinc (line 336 for the EPAGE register): `addr = addr | ((zext(page) << shift) | offset);`

I'll try to get the fix in soon.
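To illustrate what the missing fixed bit changes, here is a rough plain-Python model of the PPAGE calculation. It is not the SLEIGH code. The shift of 14 and the 0x3FFF window mask are assumptions (16 KB pages), as are the names `ppage_global`, `PPAGE_SHIFT` and `WINDOW_MASK`. The sample values (page E1, listing address 784470) come from the first post.

```python
PPAGE_FIXED_BIT = 1 << 22  # the "fixed bit 22" mentioned above
PPAGE_SHIFT = 14           # assumption: 16 KB pages
WINDOW_MASK = 0x3FFF       # assumption: offset of the address within the window


def ppage_global(page, local, fixed_bit=PPAGE_FIXED_BIT):
    """Combine a PPAGE value and a 16-bit window address into one flat address."""
    offset = local & WINDOW_MASK
    return fixed_bit | ((page & 0xFF) << PPAGE_SHIFT) | offset


print(hex(ppage_global(0xE1, 0x8470)))               # 0x784470, with bit 22
print(hex(ppage_global(0xE1, 0x8470, fixed_bit=0)))  # 0x384470, bit 22 missing
```

Under these assumptions, the address with bit 22 set matches the 784470 seen in the original listing, while dropping the bit lands in a lower region.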

badassloumd commented 8 months ago

Thank you, that would be awesome!