NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0
51.47k stars 5.86k forks source link

SPARC: Functions returning structs are not handled well #6882

Open jrmuizel opened 2 months ago

jrmuizel commented 2 months ago

The following code:

struct F {
        int x;
        int y;
};

struct F f() {
        struct F p;
        return p;
}
int r();

int p() {
        return f().x + r();
}

Compiles with clang-11 -integrated-as -target sparc-unknown-linux-gnu -c test.c to:

        .text
        .file   "test.c"
        .globl  f
        .p2align        2
        .type   f,@function
f:
        save %sp, -96, %sp
        ld [%fp+64], %i0
        jmp %i7+12
        restore
.Lfunc_end0:
        .size   f, .Lfunc_end0-f

        .globl  p
        .p2align        2
        .type   p,@function
p:
        save %sp, -104, %sp
        add %fp, -8, %i0
        call f
        st %i0, [%sp+64]
        unimp 8
        ld [%fp+-8], %i0
        call r
        st %i0, [%fp+-12]
        ld [%fp+-12], %i0
        ret
        restore %i0, %o0, %o0
.Lfunc_end1:
        .size   p, .Lfunc_end1-p

        .ident  "clang version 11.0.0 (https://github.com/llvm/llvm-project 60a25202a7dd1e00067fcfce512086ebf3788537)"
        .section        ".note.GNU-stack"
        .addrsig
        .addrsig_sym f
        .addrsig_sym r

and attached as a binary here: test.zip

https://www.gaisler.com/doc/sparc-abi.pdf has the following to say about functions returning structs: "Whenever a calling function expects a structure, union, or quad-precision return value from the function being called, the compiler generates an unimp (unimplemented) instruction immediately following the delay instruction of the call. The unimp instruction’s immediate field holds the low-order 12 bits of the expected return value’s size (higher bits are masked if the object is larger than 4095 bytes). When prepar- ing to return its value, the called function checks for the presence of the unimp instruction, and it checks that the low-order 12 bits agree with the low-order 12 bits of the size it plans to copy. If all tests pass, the function copies the value and returns to %i7+12, skipping the call instruction, the delay instruction, and the unimp instruction."

Ghidra is confused by both the unimp 8 (it thinks control flow terminates) and jmp %i7+12 (it thinks this is an indirect call of sorts)

mumbel commented 1 month ago

I have been hitting this too (thanks for the abi pdf!) and had no clue what those were and did notice random looking access after the return, I had just been converting the illtrap to nops :)

emteere commented 1 month ago

There are a few sparc changes coming out soon. The tagging of the "jmpl %i7+12" default as a return can be added.

You can override the FallThru address of the call to f() to the ld instruction address below the unimpl. This will fix the decompiler/analysis flow. Detecting this case can be automated once the hidden return is working correctly.

There are still some issues with handling the hidden return. The caller is passing the hidden parameter in %i0, but the callee seems to be reaching back into the callers stack to access the hidden pointer param return the value. So the hidden return isn't accessed from the same place.

mumbel commented 1 month ago

@emteere just saw your commits. do you know if anyone has taken a look at/reviewed https://github.com/NationalSecurityAgency/ghidra/pull/6346

jrmuizel commented 1 month ago

@emteere I just tried out 11.2 which I believe has your recent SPARC changes and this doesn't seem to be fixed yet. Is that expected? What still needs to be done?

emteere commented 1 month ago

The return with "jmpl i7+0xc" is now a return. The handling of the structure hidden return is handled once you apply the types.

The insertion of the size of the structure after the Call is not fixed yet.
You still have to override the fallthrough. In your example, if you override the fallthru for the call @1018 to 10024, the example will work correctly once you apply the structure to the return value of the function.

So 2 out of 3 fixed in 11.2.

A context register could be used to denote that a call has occurred in the instruction preceding the illtrap instruction. This could work out ok, but it assumes that the illtrap isn't really supposed to be a trap after a call. However, the call could be a non-returning function, such as exit, and the compiler inserted an illtrap after the call just in case, or it could be simply data after the call.

Automatically overriding the fallthru of the CALL to go around the illtrap instruction could be done in an analyzer as well. It would have to be done carefully. For each function that returns abnormally +8, or +12, all calls would be overriden to fallthru.

I did originally attempt to add a callfix that would inject a branch after the call to +4, but the injection code won't allow branches in an injection if there is code after the call site.