NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0
Decompiler: display strings when pointing to somewhere in a C string #1502

Closed (fridtjof closed 4 years ago)

fridtjof commented 4 years ago

Is your feature request related to a problem? Please describe.

I am reversing a binary that has a lot of trace/debug logging with file names, function names, and line numbers in place. It is not pleasant to work with, because each file name is passed as a pointer into the tail of its full path, like this:

fprintf(stderr,"[D]: [%d-%d-%d %d:%d:%d] %lu %s %s %d\t",(date params),pVar3,0x851895,"destroy_window",0x4e);

Notice the address 0x851895 before "destroy_window". If you double-click it, you are taken to address 0x851870, where a string containing the full path is located:

"/hdd/hdd3/xxxx/work/xx/MusicApp/core/window.c"
                                      ^ 0x851895 points here
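The relationship between the two addresses can be checked with a few lines of arithmetic. This is only a sketch using the values quoted in this report; it assumes the string is stored as one byte per character, which the report does not state explicitly.

```python
# Values quoted in the report; one byte per character is an assumption.
STRING_START = 0x851870
PATH = "/hdd/hdd3/xxxx/work/xx/MusicApp/core/window.c"
POINTER = 0x851895  # the constant passed to fprintf

# Distance from the start of the defined string to the pointed-at byte.
offset = POINTER - STRING_START

# What a C "%s" conversion would print when handed that pointer.
suffix = PATH[offset:]

print(hex(offset), suffix)
```

The offset works out to 0x25, which is where the file name begins after the last path separator.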

Describe the solution you'd like

Ideally, the decompiler would display the string starting wherever the pointer points, if a string is defined around that location, like this:

fprintf(stderr,"[D]: [%d-%d-%d %d:%d:%d] %lu %s %s %d\t",(date params),pVar3,"window.c","destroy_window",0x4e);

Describe alternatives you've considered

Alternatively, the string could be displayed inline as an offset from the containing string's start, like this:

fprintf(stderr,"[D]: [%d-%d-%d %d:%d:%d] %lu %s %s %d\t",(date params),pVar3,s_hdd_hdd3_etc_0x851870 + 0x25,"destroy_window",0x4e);

astrelsky commented 4 years ago

The string needs to be set as a constant. Right-click the string at that address, go to Settings, and you can change it there.

fridtjof commented 4 years ago

That works only when the address points to the start of the string, not into the middle of it. I know about having to make strings constant sometimes, but that's not the issue here, I'm afraid.

(Nonetheless, I tried it just to be sure, and it doesn't work.)

astrelsky commented 4 years ago

Oh, my apologies. I misunderstood.

dev747368 commented 4 years ago

A confounding factor is that the called function is variadic (var-args). If it had explicit parameters, with a char * in that position, the off-cut string display would do what you wanted.

astrelsky commented 4 years ago

@fridtjof try creating a function signature override for the call with the exact arguments being passed.

I have a Ghidra script that automates this for printf. I could modify it to handle fprintf if necessary: https://github.com/astrelsky/ghidra_scripts
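To illustrate what "the exact arguments being passed" means for a printf-style call, the sketch below derives a parameter list from a format string and builds the text of an explicit signature. It is not the linked script and it does not use the Ghidra API. The helper names `format_arg_types` and `override_signature` are hypothetical, and only a small subset of conversions is handled (no `*` widths, no `h`/`hh`/`z` lengths, no floating point).

```python
import re

# "%%" or a conversion: optional flags, width, precision, l/ll length, specifier.
_SPEC = re.compile(r"%(?:%|[-+ #0]*\d*(?:\.\d+)?(ll|l)?([diuxXcsp]))")

# Integer type per length modifier (hypothetical, deliberately small subset).
_INT = {"": "int", "l": "long", "ll": "long long"}


def format_arg_types(fmt):
    """Return the C types of the variadic arguments implied by fmt."""
    types = []
    for match in _SPEC.finditer(fmt):
        length, conv = match.group(1) or "", match.group(2)
        if conv is None:
            continue  # "%%" is a literal percent sign and consumes no argument
        if conv == "s":
            types.append("char *")
        elif conv == "p":
            types.append("void *")
        elif conv == "c":
            types.append("int")
        elif conv in "di":
            types.append(_INT[length])
        else:  # u, x, X
            types.append("unsigned " + _INT[length])
    return types


def override_signature(name, fixed_params, fmt, ret="int"):
    """Build the text of an explicit (non-variadic) signature for one call."""
    params = list(fixed_params) + format_arg_types(fmt)
    return f"{ret} {name}({', '.join(params)})"


FMT = "[D]: [%d-%d-%d %d:%d:%d] %lu %s %s %d\t"
print(override_signature("fprintf", ["FILE *", "char *"], FMT))
```

With an explicit `char *` in the position of the file name argument, the decompiler has a typed parameter to work with instead of an untyped variadic slot. Applying the resulting signature to the call site in Ghidra is a separate step that this sketch leaves out.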

fridtjof commented 4 years ago

Ah, nice! That worked.

I do wish Ghidra were a bit more "batteries included" in that regard. It does not try to be as "smart" as IDA, which I guess allows it to be more general.

@astrelsky I took a look at your script, and it looks really awesome, thanks! I think making it generic, so that it handles a given list of functions, is the way to go. Even better, have that list stored in the project, so the user can "mark" all printf-style functions in a binary ;)

I guess this feature request is now basically "integrate what this script does into Ghidra" :D

astrelsky commented 4 years ago

I can do that. It'll need to make a few assumptions though.

I'm not sure that integrating something that generates a ton of function signature overrides, or something as narrow as targeting specific function calls, is a good idea, at least not at this time. I don't know what impact having all those overrides has on Ghidra. Another reason is the warning displayed when creating an override: you have to be completely sure the override is correct, or you can end up hiding information from the user or hindering type propagation.

fridtjof commented 4 years ago

That might be less of an issue if it is disabled by default, and has to be knowingly turned on by the user. I'd suggest making it show up in the "Analysis Options" dialog, with an appropriate description:

[Screenshot: analysis_options]

caheckman commented 4 years ago

I am adding some logic to let the decompiler infer that a constant is a pointer if it points into the middle of a known string. This should automatically address the original situation without having to lay down overrides.
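The idea can be pictured as a containment lookup over the strings already defined in the program. The sketch below is not the decompiler's implementation, only an illustration under assumptions: the table `DEFINED_STRINGS` and the helper `string_at` are hypothetical, the address given for "destroy_window" is made up, and each character is assumed to occupy one byte.

```python
from bisect import bisect_right

# Hypothetical table of defined strings, sorted by start address.
# Only the second entry comes from the report; the first address is invented.
DEFINED_STRINGS = [
    (0x851800, "destroy_window"),
    (0x851870, "/hdd/hdd3/xxxx/work/xx/MusicApp/core/window.c"),
]
_STARTS = [start for start, _ in DEFINED_STRINGS]


def string_at(constant):
    """Return the text a constant points at if it lands inside a defined string."""
    # Find the last string whose start address is <= constant.
    index = bisect_right(_STARTS, constant) - 1
    if index < 0:
        return None  # below every defined string
    start, text = DEFINED_STRINGS[index]
    offset = constant - start
    if offset >= len(text):
        return None  # past the last character, so not inside this string
    return text[offset:]


for value in (0x851895, 0x851870, 0x4E):
    print(hex(value), string_at(value))
```

A constant such as 0x4e falls outside every defined string and stays a plain integer, while 0x851895 resolves to the tail of the path string.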

caheckman commented 4 years ago

This change is in the master branch now.