NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

Implementing indirect instruction execution in processor plugin #6611

Closed (shuffle2 closed this 1 month ago)

shuffle2 commented 1 month ago

The Andes V3 instruction set has an extension named EX9.IT. The extension adds a special register, ITB (Instruction_Table Base), which points to a memory location that is treated as a 512-entry array of 32-bit values. The EX9.IT instruction takes an immediate value that is used as an index into this table. The 32 bits at that index are fetched and a single instruction is executed (if the 32-bit data begins with a 16-bit instruction, only that one instruction is executed). The feature allows a 32-bit instruction to be executed from a 16-bit one (EX9.IT is itself a 16-bit insn): the most frequently used 32-bit instructions can be extracted from the main executable and placed in Instruction_Table, which reduces overall binary size.
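
To make the lookup concrete, here is a minimal Python model of it. This is only a sketch, not Ghidra or SLEIGH code. The function names are hypothetical, and the rule used to tell a 16-bit entry from a 32-bit one (top bit of the first byte set) is an assumption about the Andes encoding that should be checked against the ISA manual.

```python
ENTRY_SIZE = 4        # each Instruction_Table entry is 32 bits
TABLE_ENTRIES = 512   # table size as described above


def ex9_entry_address(itb: int, index: int) -> int:
    """Address of Instruction_Table[index] for a given ITB value."""
    if not 0 <= index < TABLE_ENTRIES:
        raise ValueError(f"index {index:#x} is outside the 512-entry table")
    return itb + index * ENTRY_SIZE


def fetch_ex9_entry(memory: bytes, mem_base: int, itb: int, index: int) -> bytes:
    """Return the bytes of the single instruction EX9.IT would execute.

    `memory` is a flat image whose first byte lives at address `mem_base`
    (both are hypothetical parameters for this sketch).
    """
    offset = ex9_entry_address(itb, index) - mem_base
    if offset < 0 or offset + ENTRY_SIZE > len(memory):
        raise ValueError("table entry lies outside the supplied memory image")
    entry = memory[offset:offset + ENTRY_SIZE]
    # ASSUMPTION: a set top bit in the first byte marks a 16-bit instruction,
    # in which case only those two bytes are executed.
    if entry[0] & 0x80:
        return entry[:2]
    return entry
```

With ITB=0x4040 and an immediate of 0x10, `ex9_entry_address` gives 0x4040 + 0x10 * 4 = 0x4080.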

Usage looks something like:

# set ITB=0x4040
MOVI      a0, 0x4040
MTUSR     a0, ITB
...
# execute 1 instruction from Instruction_Table[0x10] inline
EX9.IT    0x10
...

Teaching a disassembler about this raises 2 annoyances:

  1. The ITB register value must be tracked.
  2. When EX9.IT is disassembled, the parser must be redirected to decode the value fetched via ITB, while still treating the resulting instruction as "inline".

1 seems to be a common problem, and can be handled in Ghidra much like tracking other global pointers. 2 seems trickier; I'm not sure how to tackle it in SLEIGH. What is a way to handle this?
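
For reference, the behaviour I'm after can be sketched as a toy Python pass over a textual listing like the one above. This is not Ghidra or SLEIGH code, and `resolve_ex9_targets` is a hypothetical name. It assumes ITB is only ever loaded from a register that was set by an immediate MOVI, and it ignores every other instruction that might overwrite a register.

```python
def resolve_ex9_targets(listing):
    """Track ITB through a listing and return the table address used by each EX9.IT.

    An entry is None when ITB was not known at that point.
    """
    regs = {}     # constant values of general registers, where known
    itb = None    # last known ITB value
    targets = []
    for line in listing:
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        mnemonic, _, rest = line.partition(" ")
        operands = [op.strip() for op in rest.split(",")] if rest.strip() else []
        if mnemonic == "MOVI" and len(operands) == 2:
            regs[operands[0]] = int(operands[1], 0)
        elif mnemonic == "MTUSR" and len(operands) == 2 and operands[1] == "ITB":
            itb = regs.get(operands[0])        # None if the source is unknown
        elif mnemonic == "EX9.IT" and len(operands) == 1:
            index = int(operands[0], 0)
            # Each table entry is 32 bits wide, so the index scales by 4.
            targets.append(None if itb is None else itb + 4 * index)
    return targets


listing = [
    "# set ITB=0x4040",
    "MOVI      a0, 0x4040",
    "MTUSR     a0, ITB",
    "...",
    "EX9.IT    0x10",
]
targets = resolve_ex9_targets(listing)
```

The open question is the step after this: decoding the bytes at the resolved address as the instruction at the EX9.IT site.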