NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

Breakpoints are incorrectly placed for i386 and x86_64 targets in qemu+gdb debugger #6662

Open electricworry opened 5 days ago

electricworry commented 5 days ago

**Describe the bug**
When using the qemu+gdb debugger profile for i386 and x86_64 targets, breakpoints are not placed correctly. For example, in a dynamically linked hello-world binary that I just built for i386, attempting to place a breakpoint on main places one at 0x119d, but the actual address of main is 0x4000119d.

The python plugin to gdb is correctly querying the sections, and what it receives is:

(gdb) maintenance info sections -all-objects
Exec file: `/home/electricworry/projects/debugger/i386', file type elf32-i386.
 [0]      0x0194->0x01a7 at 0x00000194: .interp ALLOC LOAD READONLY DATA HAS_CONTENTS
 [1]      0x01a8->0x01cc at 0x000001a8: .note.gnu.build-id ALLOC LOAD READONLY DATA HAS_CONTENTS
 [2]      0x01cc->0x01ec at 0x000001cc: .note.ABI-tag ALLOC LOAD READONLY DATA HAS_CONTENTS
 [3]      0x01ec->0x020c at 0x000001ec: .gnu.hash ALLOC LOAD READONLY DATA HAS_CONTENTS
 [4]      0x020c->0x028c at 0x0000020c: .dynsym ALLOC LOAD READONLY DATA HAS_CONTENTS
 [5]      0x028c->0x0332 at 0x0000028c: .dynstr ALLOC LOAD READONLY DATA HAS_CONTENTS
 [6]      0x0332->0x0342 at 0x00000332: .gnu.version ALLOC LOAD READONLY DATA HAS_CONTENTS
 [7]      0x0344->0x0384 at 0x00000344: .gnu.version_r ALLOC LOAD READONLY DATA HAS_CONTENTS
 [8]      0x0384->0x03c4 at 0x00000384: .rel.dyn ALLOC LOAD READONLY DATA HAS_CONTENTS
 [9]      0x03c4->0x03d4 at 0x000003c4: .rel.plt ALLOC LOAD READONLY DATA HAS_CONTENTS
 [10]     0x1000->0x1024 at 0x00001000: .init ALLOC LOAD READONLY CODE HAS_CONTENTS
 [11]     0x1030->0x1060 at 0x00001030: .plt ALLOC LOAD READONLY CODE HAS_CONTENTS
 [12]     0x1060->0x1068 at 0x00001060: .plt.got ALLOC LOAD READONLY CODE HAS_CONTENTS
 [13]     0x40001070->0x400011dd at 0x00001070: .text ALLOC LOAD READONLY CODE HAS_CONTENTS
 [14]     0x11e0->0x11f8 at 0x000011e0: .fini ALLOC LOAD READONLY CODE HAS_CONTENTS
 [15]     0x2000->0x2016 at 0x00002000: .rodata ALLOC LOAD READONLY DATA HAS_CONTENTS
 [16]     0x2018->0x204c at 0x00002018: .eh_frame_hdr ALLOC LOAD READONLY DATA HAS_CONTENTS
 [17]     0x204c->0x20fc at 0x0000204c: .eh_frame ALLOC LOAD READONLY DATA HAS_CONTENTS
 [18]     0x3ed8->0x3edc at 0x00002ed8: .init_array ALLOC LOAD DATA HAS_CONTENTS
 [19]     0x3edc->0x3ee0 at 0x00002edc: .fini_array ALLOC LOAD DATA HAS_CONTENTS
 [20]     0x3ee0->0x3fd8 at 0x00002ee0: .dynamic ALLOC LOAD DATA HAS_CONTENTS
 [21]     0x3fd8->0x4000 at 0x00002fd8: .got ALLOC LOAD DATA HAS_CONTENTS
 [22]     0x40004000->0x40004008 at 0x00003000: .data ALLOC LOAD DATA HAS_CONTENTS
 [23]     0x40004008->0x4000400c at 0x00003008: .bss ALLOC
 [24]     0x0000->0x002b at 0x00003008: .comment READONLY HAS_CONTENTS

This information gets as far as the "Map sections" view in Modules: image

However, breakpoints are not being placed correctly: image
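For anyone who wants to check the numbers themselves, here is a minimal sketch of parsing the `maintenance info sections` listing above into per-section address ranges. It is not the code the Ghidra gdb plugin uses; `SECTION_RE`, `parse_sections`, and `SAMPLE` are hypothetical names, and the sample is a four-line excerpt of the listing.

```python
import re

# Matches lines such as:
#  [13]     0x40001070->0x400011dd at 0x00001070: .text ALLOC LOAD ...
SECTION_RE = re.compile(
    r"^\s*\[\d+\]\s+0x([0-9a-f]+)->0x([0-9a-f]+) at 0x([0-9a-f]+): (\S+)"
)


def parse_sections(text):
    """Return {section name: (start, end, file_offset)} from gdb's listing."""
    sections = {}
    for line in text.splitlines():
        m = SECTION_RE.match(line)
        if m:
            start, end, file_off, name = m.groups()
            sections[name] = (int(start, 16), int(end, 16), int(file_off, 16))
    return sections


# Excerpt of the listing from this report. Inside gdb, the full text could be
# captured with gdb.execute("maintenance info sections -all-objects",
# to_string=True); that call is not made here so the sketch runs standalone.
SAMPLE = """\
 [12]     0x1060->0x1068 at 0x00001060: .plt.got ALLOC LOAD READONLY CODE HAS_CONTENTS
 [13]     0x40001070->0x400011dd at 0x00001070: .text ALLOC LOAD READONLY CODE HAS_CONTENTS
 [14]     0x11e0->0x11f8 at 0x000011e0: .fini ALLOC LOAD READONLY CODE HAS_CONTENTS
 [22]     0x40004000->0x40004008 at 0x00003000: .data ALLOC LOAD DATA HAS_CONTENTS
"""

sections = parse_sections(SAMPLE)
for name, (start, end, file_off) in sections.items():
    print(f"{name:10s} {start:#010x}-{end:#010x} (file offset {file_off:#x})")
```

In this excerpt only .text and .data are reported at 0x4000xxxx; the neighbouring sections keep their low addresses, which is what makes a single module-wide offset unusable here.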

**To Reproduce**
Steps to reproduce the behavior:

1. Make a dynamic hello world:

    gcc -m32 -x c -o i386 - <<EOF
    #include <stdio.h>

    int main (int argc, char *argv[]) {
        printf("hello, world!\n");
        return 0;
    }
    EOF

2. Import i386 into a Ghidra project
3. Open i386 in the Debugger tool
4. Parameters:
    * Image: path to i386
    * Arguments: NONE
    * QEMU command: qemu-i386
    * QEMU Port: 12345
    * Extra qemu arguments: NONE
    * gdb command: gdb-multiarch
5. Wait for module mapping to complete
6. In the Modules window, right-click on i386 and select 'Refresh all modules and all sections'
7. Again, right-click on i386 and select 'Map sections'. Confirm that the .text section has a dynamic base much higher than the other sections, e.g. 40001070
8. Place the cursor on main and press 'k' to create a new breakpoint
9. Observe in the Breakpoints window that the breakpoint is not placed in the .text section. Furthermore, in the gdb terminal, confirm that the breakpoint doesn't match the location of main:

    (gdb) info break
    Num     Type           Disp Enb Address    What
    1       catchpoint     keep y              syscalls "break, brk, mmap, munmap, mprotect, msync, mlock, munlock, mlockall, munlockall, mremap, mmap2, mincore, madvise, remap_file_pages, mbind, get_mempolicy, set_mempolicy, migrate_pages, move_pages"
            silent
            hooks-ghidra event-memory
            cont
    2       breakpoint     keep y   0x0000119d
    (gdb) print main
    $1 = {<text variable, no debug info>} 0x4000119d



**Expected behavior**
I would hope that the translation from binary offset to process memory location would take into account the placement of .text and place the breakpoint at the correct address.
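To make the expected translation concrete, here is a minimal sketch of a per-section static-to-dynamic lookup. It assumes the static addresses are the unrelocated ELF addresses (0x1070 for .text) and takes the dynamic addresses from the gdb listing above; Ghidra's own static addresses include an image base, so the real numbers would differ. `SectionMapping`, `static_to_dynamic`, and `MAPPINGS` are hypothetical names, not Ghidra API.

```python
from typing import NamedTuple, Optional, Sequence


class SectionMapping(NamedTuple):
    name: str
    static_start: int   # address in the unrelocated binary (assumed)
    dynamic_start: int  # address reported by gdb for the live process
    length: int


def static_to_dynamic(addr: int, mappings: Sequence[SectionMapping]) -> Optional[int]:
    """Translate a static address using the section that contains it."""
    for m in mappings:
        if m.static_start <= addr < m.static_start + m.length:
            return m.dynamic_start + (addr - m.static_start)
    return None  # address is not covered by any known section


# Three neighbouring sections from the listing in this report.
MAPPINGS = [
    SectionMapping(".plt.got", 0x1060, 0x1060, 0x1068 - 0x1060),
    SectionMapping(".text", 0x1070, 0x40001070, 0x400011DD - 0x40001070),
    SectionMapping(".fini", 0x11E0, 0x11E0, 0x11F8 - 0x11E0),
]

main_dynamic = static_to_dynamic(0x119D, MAPPINGS)
print(hex(main_dynamic))  # 0x4000119d, matching `print main` in gdb
```

Because each section carries its own offset, main lands at 0x4000119d while an address in .fini stays where it is.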

**Environment (please complete the following information):**
 - OS: Ubuntu 22.04.3
 - Java Version: OpenJDK 21.0.3
 - Ghidra Version: 11.1.1
 - Ghidra Origin: official GitHub distro

electricworry commented 5 days ago

I started to look at the Ghidra code to see if I could fix this myself, but I'm not knowledgeable enough with the project. If someone wants to point me to where in the java files the breakpoint address translation is done I would be happy to have a stab at it.

d-millar commented 5 days ago

@electricworry Taking a peek now, but I think the issue is in the Module mapping logic, rather than the breakpoint logic. All of my dynamic memory is greyed out, i.e. is stale, even on refresh, suggesting the mapping logic failed.

d-millar commented 5 days ago

@electricworry OK, I have an interim workaround, but something is definitely broken here - appreciate the heads up. The workaround: in the Debugger Tool, open Window->Debugger->Static Mappings. Repeat all of the steps you detailed above through "(6) Refresh all modules and all sections". Step 6 will add a mapping from 00000000 to 00011090 (or wherever your static .text block is based). In StaticMappings, delete this block. Now, in Modules, right-click on the .text section and choose "Map Section to i386:.text". This should result in a new StaticMapping from a high-valued Dynamic Address to the .text base address. Hopefully, with this in place, hitting 'k' in the Static Listing should do the right thing.

Still working on identifying the root error - will keep you posted.

electricworry commented 5 days ago

I can confirm that if I delete the static mapping entry that is created by default, and then:

Then the breakpoint is created at the correct address.

d-millar commented 5 days ago

Cool. As I see it, there are two issues here - one we probably cannot correct and one we should. The first issue is that the dynamic range from 0x1b4-high_rebased_address is getting mapped (somewhat arbitrarily) to the static range starting at 0x11090. Our current logic assumes the low addresses from each range will be matching sections, which is not the case here. We might be able to match by section name, but I think we'd still be guessing in many cases. Not sure we can come up with a sane fix for this.

The second issue is more insidious and ought to be fixed - namely, at some point, that erroneous mapping gets immortalized and nothing can be done to fix the error. I.e. at some point, neither "Map Sections" nor "Map Section to" will solve the problem. We need to do more digging on this.
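As an illustration of the first issue, here is a minimal sketch of what a single module-wide offset does to main. It uses the numbers from the original report rather than the 0x1b4 and 0x11090 values mentioned above, and it assumes the module's lowest static and dynamic addresses are both 0x194 (the start of .interp). `map_by_module` is a hypothetical helper, not Ghidra's mapping code.

```python
def map_by_module(static_min: int, dynamic_min: int, static_addr: int) -> int:
    """Apply one offset, derived from the lowest addresses, to the whole module."""
    return static_addr + (dynamic_min - static_min)


STATIC_MIN = 0x194        # assumed lowest static address (.interp)
DYNAMIC_MIN = 0x194       # lowest allocated section start reported by gdb
MAIN_STATIC = 0x119D      # main in the unrelocated binary
MAIN_ACTUAL = 0x4000119D  # `print main` in gdb

guess = map_by_module(STATIC_MIN, DYNAMIC_MIN, MAIN_STATIC)
print(hex(guess), guess == MAIN_ACTUAL)  # 0x119d False
```

The low sections line up, so the module-wide offset is zero and main is placed at 0x119d, which is the wrong breakpoint address seen in the report.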

electricworry commented 5 days ago

Thanks for finding the workaround.

Regarding your dilemma, it feels like Ghidra is doing the efficient thing that works in most cases (mapping the whole binary to the low address) but that it's perhaps a bit naive as indicated by the counter-example provided.

I don't know what the most elegant solution would be, but it seems to me that the information is there - i.e. a few clicks generates the correct Static Mappings entries from QEMU-GDB - so could we have that happen programmatically in place of the current logic? (Or perhaps it could be a boolean setting the user could turn on?)

To be clear, in performing the workaround I did not need to make any decisions about what sections to map. I simply ran "Refresh all modules and all sections" followed by "Map Sections" (i.e. not "Map Sections to i386") and then hit OK, agreeing to all twenty-something mappings. Which seems to indicate a dumb programmatic solution is possible.

d-millar commented 5 days ago

Agreed in principle. That said, retrieving sections is by far the most likely operation to fail in gdb. Many gdb stubs do not support it at all, so we would need to craft a solution that did not generally rely on that information.

nsadeveloper789 commented 5 days ago

I think this is the first I've seen a non-linear mapping of sections within the same module, i.e., one static mapping cannot properly encompass all sections within. We do have a solution for that, albeit much less exercised. In the Modules panel, there is a drop-down toward the right in its toolbar called "Auto-Map". There are five options in there: 1) By Module (default), 2) By Section, 3) By Region, 4) Identically, 5) Don't --- not necessarily in that order. As @d-millar has pointed out though, "Map By Section" cannot work until section information is available, so you will probably have to query those in the Model tree, manually, in order to activate the auto-mapping.

You might try "By Region". I don't recall whether the heuristic there assumes regions have a constant image offset per module.... I think it does :/

That auto-map option is saved to the tool, so if you're commonly working with qemu-i386, setting that option to "By Section" will probably (probably) work well. If you're switching among different targets with different mapping configs, you might just set it to "Don't" and perform the mapping manually. At least that'll save you from having to delete an erroneous mapping entry. There are ways to script this, too, allowing you to bind a key to perform a target-specific mapping.

All that said, the manual mapping actions should have overwritten the erroneous one. That is a bug.

electricworry commented 4 days ago

Thank you for your help. I gave all of the Auto-Map options a try, but unfortunately none of them add any improvement that I can see; they all create the problematic static mapping. Even the Do Not Auto-Map option still has the static mapping created.

nsadeveloper789 commented 3 days ago

Is this after re-launching with each auto-map option? The complete steps:

  1. Kill the target, if you already have one running.
  2. Change the Auto-Map option to By Sections.
  3. Re-launch your target.
  4. Probably, use the Model tree to load the section info.
  5. Check the static mappings and verify breakpoint placement.

Changing the auto-map option on an existing target would still require you to delete the problematic mapping (because of the bug), and auto-map won't get triggered until Ghidra thinks the memory map has changed (e.g., on loading a module).

electricworry commented 3 days ago

Correct, that's after re-launching with each auto-map option.

Does step 4 mean in the Module window, right-click and select "Refresh all modules and all sections"? If so my steps are:

  1. Set "Auto-Map by Section" (for example)
  2. Close everything.
  3. Start ghidra, and open 'i386' in the debugger. At this point there is no static mapping and no breakpoint, and setting from (1) is still in effect.
  4. Launch the target/debugger
  5. "Waiting for module mapping" is displayed in debug console for about 5 seconds. Then one static mapping appears.
  6. "Refresh all modules and all sections"
  7. Still one static mapping. Creating a breakpoint is on wrong address.

At step 5 if I cancel the action "Waiting for module mapping" in the debug console, then the static mapping will not be created. However, then performing step 6 still does not create any mappings; instead I need to perform the "Map sections" action manually.

Since changing Auto-Map to any setting always has the same outcome (it creates the problematic static mapping), it seems like it's hardcoded (at least in the qemu+gdb case) to always perform Auto-Map by Module.

nsadeveloper789 commented 2 days ago

Ah! I know what's going on. There's logic in the launcher: if no mapping to the current program database appears after a timeout, it will attempt a Map by Module, regardless of the UI setting. I'll put an internal ticket in. The bigger concern is still that the manual map actions don't remove the bogus entry, but this is a nuisance, too.
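A rough sketch of the difference, purely as illustration: the first function models the fallback as described above, and the second models a fallback that defers to the Auto-Map setting. None of this is Ghidra code (the launchers are not written like this); `AutoMap`, `fallback_as_described`, and `fallback_honoring_setting` are hypothetical names, and the option labels are the ones listed earlier in this thread.

```python
from enum import Enum
from typing import Optional


class AutoMap(Enum):
    BY_MODULE = "By Module"
    BY_SECTION = "By Section"
    BY_REGION = "By Region"
    IDENTICALLY = "Identically"
    DONT = "Don't"


def fallback_as_described(mapped: bool, setting: AutoMap) -> Optional[AutoMap]:
    """After the timeout: always fall back to By Module, ignoring the setting."""
    if mapped:
        return None  # a mapping already exists, nothing to do
    return AutoMap.BY_MODULE


def fallback_honoring_setting(mapped: bool, setting: AutoMap) -> Optional[AutoMap]:
    """After the timeout: use whatever the user selected, including "Don't"."""
    if mapped or setting is AutoMap.DONT:
        return None
    return setting


for setting in AutoMap:
    print(
        setting.value,
        fallback_as_described(False, setting),
        fallback_honoring_setting(False, setting),
    )
```

With the first function every setting ends in a By Module mapping, which matches what was reported: the same static mapping appears no matter which Auto-Map option is selected.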

nsadeveloper789 commented 2 days ago

Thanks for pointing it out. You were right about the hardcoding. I think I'll have the launchers all adhere to the Auto-Map setting.