Open Ruturaj4 opened 4 years ago
Was the DWARF analyzer enabled during analysis, and were there any messages in the log about issues with DWARF?
Thanks for your comment. I can't see any errors in the GUI. The only messages I can see are these (in the CLI):
INFO Read DWARF debug string table, 0 bytes. (DWARFProgram)
INFO DWARF import - total elapsed: 25ms (DWARFImportSummary)
INFO DWARF data type import - elapsed: 9ms (DWARFImportSummary)
INFO DWARF func & symbol import - elapsed: 16ms (DWARFImportSummary)
INFO DWARF types imported: 2 (DWARFImportSummary)
INFO DWARF function signatures added: 1 (DWARFImportSummary)
DWARF local variable info is problematic for Ghidra, and it's kind of hit-and-miss whether we can use it.
However, it does appear you are getting some info, i.e. data types and function signatures.
One of the options of the DWARF analyzer is to mark up the imported items with the DWARF DIE record number. If you turn that on, you should get a better indication of which items were successfully pulled into Ghidra.
Thanks for your reply.
No, I don't think I am getting any information from DWARF. I checked the output using a stripped binary, and Ghidra gives me exactly the same output. I kept the .symtab section and stripped everything else, which is why you may see function signature information.
Could you please tell me which option that is? I turned everything on, but it had no effect on the binary.
"Output DWARF DIE info". You should see the DIE info tagged on the data type's comment field and a pre-comment on functions if the DWARF analyzer created that entry.
Could you post the binary, or at least the entire contents of the readelf output?
Thanks so much. But it doesn't show up in my case. I checked IDA Pro and it detects the information correctly.
Please check the attached binaries. These binaries are for a different program, though. Benchmark - sard88 - 283 test.
I attached three binaries (in compressed form). Note that these binaries contain a buffer overflow.
obo_bad.o - binary compiled with the -g flag (gcc -g)
obo_bad_debin.o - binary is stripped (keeping the .symtab section, i.e. strip -g ./bin) and then the .debug section is recovered using debin
obo_bad_stripped - all information stripped (strip ./bin)
If you compare obo_bad_stripped and obo_bad_debin, you can't see much difference in the output. You can observe that the variable names are not being detected correctly.
Thanks for the quick turnaround.
So, I am getting DIE info tagged on functions for your obo_bad_debin, and some data types.
Like you, I am not getting local variables, and it comes down to the way the location of the local variable was encoded in the dwarf location expression attached to each variable definition.
Background info for those who don't know: DWARF defines a small embedded stack-based expression language. For each thing that has a location, the DWARF spec allows that location to be defined using instructions in that expression language. Ghidra can evaluate only a subset of that expression language because some of the operations use live values from CPU registers.
In some cases we can map those register-referencing operations to Ghidra native definitions (i.e. if the register was the stack register, and it was a simple offset from the register), and in some cases we can't.
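To illustrate why register-relative operations are the sticking point, here is a minimal sketch of a stack-based evaluator for three DWARF operations. It is not Ghidra's implementation; the `evaluate` function and the `(name, operand)` tuple form are hypothetical, chosen only for this example. The evaluator gives up (returns `None`) when an operation needs a register value that a static tool does not have.

```python
from typing import Optional


def evaluate(ops, registers: Optional[dict] = None) -> Optional[int]:
    """Evaluate already-decoded DWARF ops given as (name, operand) pairs.

    `registers` maps DWARF register numbers to live values. A static
    analyzer has no such values, so register-based ops yield None here.
    (Hypothetical helper for illustration only.)
    """
    stack = []
    for name, operand in ops:
        if name == "DW_OP_addr":
            # Push a constant address.
            stack.append(operand)
        elif name == "DW_OP_plus_uconst":
            # Pop the top entry, add an unsigned constant, push the result.
            stack.append(stack.pop() + operand)
        elif name.startswith("DW_OP_breg"):
            # Push (value of register N) + signed offset.
            reg = int(name[len("DW_OP_breg"):])
            if registers is None or reg not in registers:
                return None  # needs a live register value
            stack.append(registers[reg] + operand)
        else:
            raise ValueError(f"operation not handled in this sketch: {name}")
    return stack[-1]


# A purely static expression can be evaluated without any registers.
static_addr = evaluate([("DW_OP_addr", 0x1000), ("DW_OP_plus_uconst", 8)])

# A register-relative expression cannot, unless a register value is supplied.
unresolved = evaluate([("DW_OP_breg6", -24)])
resolved = evaluate([("DW_OP_breg6", -24)], registers={6: 0x7FFC0000})
```

A static tool can still handle the `None` case when it can translate "register plus constant offset" into its own notion of a stack variable, which is the mapping described above.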
In your case, debin is preferring to use operations that are relative to rbp:
<2><df>: Abbrev Number: 4 (DW_TAG_variable)
<e0> DW_AT_name : s
<e2> DW_AT_location : 2 byte block: 76 68 (DW_OP_breg6 (rbp): -24)
<e5> DW_AT_type : <0x2e>
Which is pretty close to being the stack register, but we're not currently handling it.
Here is a normal gcc-generated local variable, which uses fbreg:
<2><6a>: Abbrev Number: 4 (DW_TAG_variable)
<6b> DW_AT_name : (indirect string, offset: 0x81): init_value
<6f> DW_AT_decl_file : 1
<70> DW_AT_decl_line : 3
<71> DW_AT_type : <0xa3>
<75> DW_AT_location : 2 byte block: 91 68 (DW_OP_fbreg: -24)
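For anyone who wants to check the two byte blocks by hand, the sketch below decodes a single-operation location block of the kind readelf prints. The opcode values (0x70 to 0x8f for DW_OP_breg0 to DW_OP_breg31, 0x91 for DW_OP_fbreg) and the signed LEB128 operand encoding come from the DWARF spec; the function names are hypothetical, and only these two operation families are handled.

```python
DW_OP_BREG0 = 0x70   # DW_OP_breg0 .. DW_OP_breg31 occupy 0x70 .. 0x8f
DW_OP_BREG31 = 0x8F
DW_OP_FBREG = 0x91


def read_sleb128(data: bytes, pos: int):
    """Decode a signed LEB128 value starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not (byte & 0x80):          # high bit clear: last byte
            if byte & 0x40:            # sign bit of the final group
                result -= 1 << shift
            return result, pos


def describe_location(block: bytes) -> str:
    """Describe a location block that holds exactly one breg/fbreg operation."""
    opcode = block[0]
    if DW_OP_BREG0 <= opcode <= DW_OP_BREG31:
        name = f"DW_OP_breg{opcode - DW_OP_BREG0}"
    elif opcode == DW_OP_FBREG:
        name = "DW_OP_fbreg"
    else:
        raise ValueError(f"opcode 0x{opcode:02x} not handled in this sketch")
    offset, end = read_sleb128(block, 1)
    if end != len(block):
        raise ValueError("more than one operation in block")
    return f"{name}: {offset}"


print(describe_location(bytes([0x76, 0x68])))  # the debin variable "s"
print(describe_location(bytes([0x91, 0x68])))  # the gcc variable "init_value"
```

Both variables carry the same offset byte (0x68, i.e. -24); the only difference is whether the offset is relative to register 6 (rbp on x86-64) or to the function's frame base.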
I'm going to rename this issue to "Support DWARF location expressions using BP" and mark it as an enhancement. Hopefully you are ok with that.
Great! Thanks so much for quickly addressing this. I kept this issue open for now, you may close it as per your procedure.
I am using the debin project to recover symbols in stripped binaries. This project uses a machine learning approach to recover variables, types and variable names from stripped binaries. It also rebuilds the stripped .debug section so that reverse engineering frameworks can use this information to improve their analysis.
However, I observed that even though debin successfully rebuilds some of the symbols, Ghidra ignores these symbols during analysis. Is there a particular reason for that? And is there any way to force Ghidra to use such symbols (in the GUI as well as in the CLI)?
For example, I have the following code (ref: sard 89 benchmark - 000/000/151):
readelf -wi output on the debin binary (symbols are generated by debin):
Ghidra GUI:
Thanks in advance.
Debin paper ref: https://dl.acm.org/doi/pdf/10.1145/3360572