aldelaro5 / ghidra-gekko-broadway-lang

Ghidra language definition for the Gekko and Broadway CPU variant used in the Nintendo GameCube and Nintendo Wii respectively
Other
127 stars 16 forks source link

Incorrect handling of 8-byte parameters to functions #8

Open adituv opened 4 years ago

adituv commented 4 years ago

8-byte function parameters can be passed split into two registers, rather than on the stack as this currently does. However, it seems only "odd-even pairs" can do this - a value can be split across e.g. r3 and r4, or r5 and r6, but not r4 and r5. (For examples, see the OSSetAlarm and InsertAlarm functions)

I tried to implement this change in the cspec file, similar to the way the 8-byte output is defined, but was unsuccessful - for example, a function with parameters (u32, u64, u32) will have the u32 in r3, then the u64 in r3 and r4.

This may be a Ghidra limitation - the cspec documentation states:

The first <pentry> from the resource list that fits the datatype and hasn't been fully used by previous datatypes is assigned to that datatype.

As r4 should be left unused in this case, that seems like it may be impossible to fix without a change to Ghidra.

adituv commented 4 years ago

About the register allocation for 8-byte parameters, I have tracked it down in documentation.

The PPC EABI document does not directly specify it, but leaves it unchanged from the System V ABI. The algorithm for parameter allocation can be found on page 3-19 of the System V document, which confirms that the starting register for a long long (64-bit value) if supported should be an odd-numbered register.

aboood40091 commented 3 years ago

Any update on this?

DarkKirb commented 3 years ago

While it would be possible to handle, it would harm the general case where ghidra assumes the calling convention of a function automatically. if that isn't a problem for you you can add the following to ppc_gekko_broadway.cspec 's \ tag

EDIT: the following also does not work in most cases because ghidra allocates overlapping registers

<pentry minsize="5" maxsize="8">
  <addr space="join" piece1="r3" piece2="r4" />
</pentry>
<pentry minsize="5" maxsize="8">
  <addr space="join" piece1="r5" piece2="r6" />
</pentry>
<pentry minsize="5" maxsize="8">
  <addr space="join" piece1="r7" piece2="r8" />
</pentry>
<pentry minsize="5" maxsize="8">
  <addr space="join" piece1="r9" piece2="r10" />
</pentry>

Basically what happens if you add it is that in 2+ argument functions ghidra sees an argument passed as r3 and r4 and thinks it's 3 arguments, 2 32 bit arguments and a 64 bit argument

DarkKirb commented 3 years ago

Blocked on NationalSecurityAgency/ghidra#2762

muff1n1634 commented 1 year ago

I came across this issue myself, and after a bit of looking around on the Ghidra issues list, I found this comment on https://github.com/NationalSecurityAgency/ghidra/issues/3440 (emphasis mine):

Smaller pentry specs which mention a register must come before the larger ones that mention the same register. Is this a standard calling convention? It is not as symmetric as I would expect.

So I tried interleaving the pentry tags:

<pentry minsize="1" maxsize="4">
    <register name="r3"/>
  </pentry>
  <pentry minsize="1" maxsize="4">
    <register name="r4"/>
  </pentry>
  <pentry minsize="5" maxsize="8">
    <addr space="join" piece1="r3" piece2="r4"/>
  </pentry>
  <pentry minsize="1" maxsize="4">
    <register name="r5"/>
  </pentry>
  <pentry minsize="1" maxsize="4">
    <register name="r6"/>
  </pentry>
  <pentry minsize="5" maxsize="8">
    <addr space="join" piece1="r5" piece2="r6"/>
  </pentry>
  <pentry minsize="1" maxsize="4">
    <register name="r7"/>
  </pentry>
  <pentry minsize="1" maxsize="4">
    <register name="r8"/>
  </pentry>
  <pentry minsize="5" maxsize="8">
    <addr space="join" piece1="r7" piece2="r8"/>
  </pentry>
  <pentry minsize="1" maxsize="4">
    <register name="r9"/>
  </pentry>
  <pentry minsize="1" maxsize="4">
    <register name="r10"/>
  </pentry>
  <pentry minsize="5" maxsize="8">
    <addr space="join" piece1="r9" piece2="r10"/>
  </pentry>
  <pentry minsize="1" maxsize="500" align="4">
    <addr offset="8" space="stack"/>
  </pentry>

It seems to work a little better than the normal __stdcall prototype on some examples I tried.

// So-called "normal" case: long long int arg lines up with odd-even allocation
/* expected allocation:
 *           int  arg1 - r3:4
 *          void *arg2 - r4:4
 * long long int  arg3 - r5:4, r6:4
 *     short int  arg4 - r7:2
 *
 * __stdcall allocation:
 *           int  arg1 - r3:4
 *          void *arg2 - r4:4
 * long long int  arg3 - Stack[0x8]:8 // should not be in stack
 *     short int  arg4 - r5:2
 *
 * patched allocation:
 *           int  arg1 - r3:4
 *          void *arg2 - r4:4
 * long long int  arg4 - r5:4, r6:4
 *     short int  arg4 - r7:2
 */
void f(int arg1, void *arg2, long long arg3, short arg4);

// Problem case: long long int arg does NOT line up with odd-even allocation
/* expected allocation:
 *           int  arg1 - r3:4
 *                    // note: r4 goes unused
 * long long int  arg2 - r5:4, r6:4 // note: same regs as above
 *          void *arg3 - r7:4
 *     short int  arg4 - r8:2
 *
 * __stdcall allocation:
 *           int  arg1 - r3:4
 * long long int  arg2 - Stack[0x8]:8 // should not be in stack
 *          void *arg3 - r4:4
 *     short int  arg4 - r5:2
 *
 * patched allocation:
 *           int  arg1 - r3:4
 * long long int  arg2 - r5:4, r6:4
 *          void *arg3 - r4:4 // should not be in r4
 *     short int  arg4 - r7:2
 */
void g(int arg1, long long arg2, void *arg3, short arg4);

It's still not perfect, but it saves me having to change u64 args from the stack. Just moves the work to changing the unused registers in functions like g(), which is a bit faster.

As it is right now (testing this on 10.3.3), I don't think this is something that can be completely resolved within .cspec files - both allocation strategies mention using the first register that is available, whereas this involves intentionally skipping some. I was thinking maybe a new allocation strategy is needed, for which I would make a new issue upstream.


I've put my small test case in a little bundle.

stdcall_oe_test.zip (no containing folder)

To test the __stdcall_oe.xml prototype (oe = odd-even pairs), load stdcall_oe_test.o and go to Edit -> Options for <program name> -> Specification Extensions -> Import...


e: i now realize that development on this repo has moved to another one but the point still stands