NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

Wrong call/jump address calculation in x86 real mode #4074

Open kevinferrare opened 2 years ago

kevinferrare commented 2 years ago

Describe the bug
Even when the memory map is set up with the correct segments, Ghidra seems to miscalculate the target address of some calls and jumps. Specifically, for near calls and jumps, the target is supposed to be calculated like this (pseudocode):

       target absolute address = instruction segment * 0x10 + (ushort)(next instruction IP + offset encoded in instruction)

However, it seems that Ghidra does this instead:

       target absolute address = (instruction segment, ignoring the memory map) * 0x10 + (ushort)(next instruction IP + offset encoded in instruction)

See screenshot for a concrete case

To Reproduce
Steps to reproduce the behavior:

  1. Download the file https://drive.google.com/file/d/1AJQHVa0VPgDKEloV7pZNSmIBO7vZdgk_/view?usp=sharing and import it into Ghidra
  2. Go to address 1ED0
  3. Look at instruction at 01ed:0003 and see that the call targets the wrong address

Expected behavior
The offset encoded in the instruction at address 01ed:0003 is 0xE4A7. The segment is 1ED (I defined it in the memory map).

Correct target for call: 1ED:(ushort)(6 + 0xE4A7) = 1ED:E4AD => 1037D
Ghidra calculated target: 037D (probably because it doesn't take the memory map segment into account)
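
The arithmetic above can be checked with a short sketch. The function name `near_target` and its parameters are illustrative only and are not part of Ghidra; the next-instruction IP of 6 is taken from the calculation above.

```python
def near_target(segment, next_ip, disp16):
    """Target of a real-mode near call/jump.

    The offset wraps at 16 bits *before* the segment base is added,
    so the target always stays inside the instruction's own segment.
    Returns (offset within segment, linear address).
    """
    offset = (next_ip + disp16) & 0xFFFF   # (ushort)(next IP + encoded offset)
    linear = (segment << 4) + offset       # segment * 0x10 + offset
    return offset, linear


# Values from this report: segment 1ED, next IP 6, encoded offset 0xE4A7
offset, linear = near_target(0x01ED, 0x0006, 0xE4A7)
print("%04X:%04X => %X" % (0x01ED, offset, linear))  # 01ED:E4AD => 1037D
```

Note that 037D equals the low 16 bits of 1037D, so the reported target is also consistent with the result being truncated to 16 bits after the segment base has been added. This is only an observation about the numbers, not a confirmed diagnosis.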

Screenshots
Instruction: [image]

Memory map: [image]


JorisVanEijden commented 3 months ago

Such a shame, this really prevents Ghidra from being useful for analyzing old DOS games :(

Another example:

       51fd:03d5 8b  d1           MOV        DX,CX
       51fd:03d7 83  e2  0f       AND        DX,0xf
       51fd:03da 8b  da           MOV        BX,DX
       51fd:03dc 83  fb  0f       CMP        BX,0xf
       51fd:03df 77  62           JA         LAB_51fd_0443
       51fd:03e1 d1  e3           SHL        BX,0x1
       51fd:03e3 2e  ff  a7       JMP        word ptr CS:[BX + 0x46c ]
                 6c  04

It's just a simple switch with the jump-table at 51fd:046c

       51fd:046c 48  04           addr       LAB_51fd_0448
       51fd:046e 1c  04           addr       LAB_51fd_041c
       51fd:0470 3a  04           addr       LAB_51fd_043a
...
       51fd:0488 43  04           addr       LAB_51fd_0443
       51fd:048a e8  03           addr       LAB_51fd_03e8

But Ghidra is convinced that the JMP always goes to 5c16:0606, making a mess of the analysis and decompilation.
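
For reference, the expected resolution of such a table can be sketched as follows, assuming each entry is a little-endian 16-bit offset interpreted relative to CS (as the `JMP word ptr CS:[BX + 0x46c]` form implies). The helper `jump_table_targets` is a hypothetical name, not a Ghidra API, and only the first three entries from the listing are used; the elided ones are left out.

```python
import struct


def jump_table_targets(cs, table_bytes):
    """Decode a table of little-endian 16-bit near offsets.

    Each entry is paired with the code segment the JMP executes in.
    Returns a list of (segment, offset, linear address) tuples.
    """
    count = len(table_bytes) // 2
    offsets = struct.unpack("<%dH" % count, table_bytes[:count * 2])
    return [(cs, off, (cs << 4) + off) for off in offsets]


# First three entries of the table at 51fd:046c, bytes as shown in the listing
raw = bytes([0x48, 0x04, 0x1C, 0x04, 0x3A, 0x04])
for seg, off, linear in jump_table_targets(0x51FD, raw):
    print("%04x:%04x (linear %05x)" % (seg, off, linear))
```

Every decoded target stays in segment 51fd, which is why a single fixed destination of 5c16:0606 cannot be right for this switch.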

kevinferrare commented 3 months ago

As a workaround, there is the spice86 Ghidra plugin, which imports runtime data and, among other fixes, overwrites the references in Ghidra: https://github.com/OpenRakis/spice86-ghidra-plugin