NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

DWARF Analyzer sometimes (not always) fails when a Mach-O binary has embedded DWARF sections #6643

Open vital-information-resource-under-siege opened 1 week ago

vital-information-resource-under-siege commented 1 week ago

Describe the bug For static analysis, I used objcopy to copy the __DWARF sections from the dSYM file into the Mach-O binary so that Ghidra can work with the executable, because Ghidra has no way to import the DWARF from an external file. This worked well for some binaries but fails for others. Steps to reproduce the behavior:

  1. Use llvm-objdump and llvm-objcopy to copy some sections from the dSYM file into the Mach-O binary
  2. Open the latest version of Ghidra and run auto-analysis
  3. For some binaries, an error pops up; go to the user log to learn more about it
  4. (DWARFProgram) Failed to read DIE at offset 0xb in compunit 0 (at 0x0), skipping remainder of compilation unit. java.io.IOException: Unknown DWARFForm 639 (0x27f)
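
For reference, DWARF stores attribute form codes in .debug_abbrev as unsigned LEB128 values, and the form codes defined by the DWARF 5 standard are small single-byte values, so a form of 639 (0x27f) is not a standard one. The sketch below shows how such a value decodes from a two-byte ULEB128 sequence. The helper name uleb128_decode is hypothetical, and the bytes 0xff 0x04 are simply the ULEB128 encoding of 639, not bytes taken from the failing file.

```shell
#!/bin/sh
# Hypothetical helper: decode one unsigned LEB128 value from a list of
# byte arguments (e.g. 0xff 0x04) and print it in decimal.
uleb128_decode() {
    local result=0 bits=0 byte
    for byte in "$@"; do
        # The low 7 bits of each byte carry payload, least significant group first.
        result=$(( result | ( (byte & 0x7f) << bits ) ))
        bits=$(( bits + 7 ))
        # A clear high bit marks the last byte of the value.
        if [ $(( byte & 0x80 )) -eq 0 ]; then
            break
        fi
    done
    printf '%d\n' "$result"
}

uleb128_decode 0xff 0x04   # prints 639, i.e. 0x27f
```

A value like this at a position where a form code is expected may indicate that the reader is not looking at the abbreviation data it expects, but that is only a guess without the sample.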

Expected behavior A normal analysis with DWARF information being imported

Environment (please complete the following information):

Additional context Compiler : Apple clang version 15.0.0 (clang-1500.3.9.4)

ryanmkurtz commented 1 week ago

Can you please attach one of your failing samples?

vital-information-resource-under-siege commented 1 week ago

Sorry, I can't. It is a proprietary binary that was given to me for security testing. Is there a way to share it without leaking details of the file?

dev747368 commented 1 week ago

I used objcopy to copy the __DWARF sections from the dSYM file into the Mach-O binary so that Ghidra can work with the executable, because Ghidra has no way to import the DWARF from an external file. This worked well for some binaries but fails for others.

I need to clear up a few details with you.

Typically, Ghidra can handle external DWARF .dSYM files, and if the .dSYM folder is in the correct location, it should 'just work'. Can you confirm that the binary you imported into Ghidra had a co-located .dSYM folder?

See https://github.com/NationalSecurityAgency/ghidra/blob/53313af55ae2d64f4385b2e856d8ca3e3bc7622b/Ghidra/Features/Base/src/main/java/ghidra/app/util/bin/format/dwarf/sectionprovider/DSymSectionProvider.java#L43-L44

Also, I would be interested in the objcopy commands you used to port the DWARF sections from the .dSYM file to the main binary. Could you copy/paste the actual cmd lines you used (minus the real filename if you need to protect that detail)?

vital-information-resource-under-siege commented 1 week ago

Actually, I don't know the specific details, but DSymSectionProvider.java sits under the DWARF format package, so it basically doesn't search for Mach-O binaries. I believe it only searches the directory when a DWARF section or a DWARF file is used for analysis. I have my Mach-O binary in place and already tried copying the dSYM into the same directory, but the analyzers didn't even show DWARF.

The commands were

llvm-objdump -h --macho ./dsym_file_name # list the sections in the dSYM file first
llvm-objdump --macho --section=__DWARF,__debug_section_names_here ./dsym_file_name > inter_1
xxd -r ./inter_1 > inter_2
dd if=inter_2 bs=1 skip=(the start of section from listing section output) of=inter3 # the section separated from the binary
llvm-objcopy-14 --macho --add-section=__DWARF,__debug_section_name_here=the_location_of_inter3 ./executable # inject the section

dev747368 commented 4 days ago

DSymSectionProvider.java sits under the DWARF format package, so it basically doesn't search for Mach-O binaries.

I'm not sure I'm parsing your reply correctly, but I can assure you that DSymSectionProvider is there to handle macOS Mach-O files.

In general, many executable formats can contain DWARF info. If your .dSYM info wasn't being found, I think figuring out what is wrong with how the files are set up in your environment is the best option, rather than relying on what seems like a painful method of copying the data from the .dSYM into the main binary.

The first thing to check is the value of the executable path as it was stored in your Ghidra binary. You can see this value by right-clicking the file in your Ghidra project window, choosing the "About Program" menu item, and looking for the "Executable Location" value. This is the location where Ghidra will look for the matching .dSYM/Contents/Resources/DWARF/ folder.
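
To check this outside of Ghidra, something like the following sketch can help. It assumes the usual dsymutil layout, where the DWARF file sits at <executable>.dSYM/Contents/Resources/DWARF/<executable name> next to the executable. The helper names expected_dsym_path and check_dsym are hypothetical, and the exact lookup Ghidra performs may differ in detail.

```shell
#!/bin/sh
# Hypothetical helper: print where a co-located dSYM's DWARF file would be
# expected for a given executable path (assumed dsymutil-style layout).
expected_dsym_path() {
    local exe="$1" dir name
    dir="$(dirname -- "$exe")"
    name="$(basename -- "$exe")"
    printf '%s/%s.dSYM/Contents/Resources/DWARF/%s\n' "$dir" "$name" "$name"
}

# Hypothetical helper: report whether that DWARF file exists on disk.
check_dsym() {
    local path
    path="$(expected_dsym_path "$1")"
    if [ -f "$path" ]; then
        echo "found: $path"
    else
        echo "missing: $path"
    fi
}
```

Call check_dsym with the path shown as "Executable Location", for example check_dsym "/path/to/executable". If it reports the file as missing, the binary was probably imported from a different location than the one holding the .dSYM folder.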