New allocation strategy that intentionally skips registers

muff1n1634 commented 12 months ago

Is your feature request related to a problem? Please describe.

In my time using Ghidra to disassemble PowerPC (specifically, ppc32-eabi), I have come across functions whose prototypes contain long long arguments. Normally, Ghidra allocates these on the stack, because they are larger than the natural word size of the processor. However, according to the ABI (the SysV ABI, PowerPC Processor Supplement[^abi], p. 3-20), LONG_LONG arguments must go in odd-even register pairs, skipping a register where necessary to align (unless doing so would go past r10, in which case the argument goes on the stack).

[^abi]: The actual PowerPC EABI 32-bit Implementation (p. 5-6) mostly follows the SysV PowerPC ABI. Importantly, it does not change parameter passing rules for long long values.

Neither of the current allocation strategies does this; both use the first register that is available and large enough.
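As a rough illustration of the pairing rule quoted above, here is a minimal Python sketch. The function name `place_long_long` and the convention of passing in the number of the next free register are hypothetical, chosen only for this example. It models nothing beyond the rule as stated: pairs start on an odd register, and a pair that would run past r10 goes to the stack instead.

```python
LAST_GPR = 10  # r10 is the last argument register mentioned in the rule above


def place_long_long(next_gpr):
    """Return the register pair for an 8-byte argument, or None for the stack.

    next_gpr is the number of the next unused argument register (3 for r3).
    Hypothetical helper for illustration only.
    """
    # Pairs must start on an odd register: r3, r5, r7 or r9.
    start = next_gpr if next_gpr % 2 == 1 else next_gpr + 1
    if start + 1 > LAST_GPR:
        return None  # the pair would go past r10, so the argument goes to the stack
    return (f"r{start}", f"r{start + 1}")


print(place_long_long(3))   # ('r3', 'r4')
print(place_long_long(4))   # ('r5', 'r6'), r4 is skipped
print(place_long_long(10))  # None
```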

Describe the solution you'd like

I would like a new allocation strategy that intentionally skips registers to be able to accurately represent the parameter passing rules of this ABI.

For example, given the following pentries:

  <pentry minsize="1" maxsize="4"> <!-- #1 -->
    <register name="r3"/>
  </pentry>
  <pentry minsize="1" maxsize="4"> <!-- #2 -->
    <register name="r4"/>
  </pentry>
  <pentry minsize="5" maxsize="8"> <!-- #3 -->
    <addr space="join" piece1="r3" piece2="r4"/>
  </pentry>
  <pentry minsize="1" maxsize="4"> <!-- #4 -->
    <register name="r5"/>
  </pentry>
  <pentry minsize="1" maxsize="4"> <!-- #5 -->
    <register name="r6"/>
  </pentry>
  <pentry minsize="5" maxsize="8"> <!-- #6 -->
    <addr space="join" piece1="r5" piece2="r6"/>
  </pentry>
  <pentry minsize="1" maxsize="4"> <!-- #7 -->
    <register name="r7"/>
  </pentry>
  <pentry minsize="1" maxsize="4"> <!-- #8 -->
    <register name="r8"/>
  </pentry>
  <pentry minsize="5" maxsize="8"> <!-- #9 -->
    <addr space="join" piece1="r7" piece2="r8"/>
  </pentry>
  <!-- etc... -->

and the C function prototype

void g(int arg1, long long arg2, void *arg3, short arg4);

/* PowerPC 32-bit big-endian
 * sizeof(int) == 4
 * sizeof(long long) == 8
 * sizeof(void *) == 4
 * sizeof(short) == 2
 */

this new strategy would behave as follows.

Start from the beginning of the list. Find next pentry 1. pentry 1 allows for 4 byte wide parameters, and r3 has not been used, so pentry 1 is valid. Assign arg1 to r3.
Start from pentry 1 (not from the beginning). Find next pentry 2. pentry 2 does not allow 8 byte wide parameters, so pentry 2 is not valid.
Find next pentry 3. pentry 3 allows for 8 byte wide parameters, but r3 has already been used, so pentry 3 is not valid.
Find next pentry 4. pentry 4 does not allow 8 byte wide parameters, so pentry 4 is not valid.
Find next pentry 5. pentry 5 does not allow 8 byte wide parameters, so pentry 5 is not valid.
Find next pentry 6. pentry 6 allows for 8 byte wide parameters, and neither r5 nor r6 has been used, so pentry 6 is valid. Assign arg2 to r5 and r6.
Start from pentry 6 (not from the beginning). Find next pentry 7. pentry 7 allows for 4 byte wide parameters, and r7 has not been used, so pentry 7 is valid. Assign arg3 to r7.
Start from pentry 7 (not from the beginning). Find next pentry 8. pentry 8 allows for 2 byte wide parameters, and r8 has not been used, so pentry 8 is valid. Assign arg4 to r8.
All parameters have been allocated.

In short, this is what the standard strategy does, except that instead of starting each search at the beginning of the list, it starts at the pentry it just allocated.

This would follow the ABI's rules about skipping registers that aren't aligned to the type's natural alignment.
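The walk-through above can be sketched in a few lines of Python. This is only a model of the idea, not Ghidra's implementation: `PEntry`, `build_pentries`, and `allocate` are hypothetical names, the pentry list is assumed to continue the same pattern up to r10, and falling back to `"stack"` when no pentry matches is an assumption. The `resume=False` mode is a simplified stand-in for rescanning from the start of the list on every parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PEntry:
    minsize: int
    maxsize: int
    regs: tuple  # registers this entry occupies (two for a join)


def build_pentries(first=3, last=10):
    """Rebuild the list from the example: rN, rN+1, then join(rN, rN+1)."""
    entries = []
    for lo in range(first, last, 2):
        a, b = f"r{lo}", f"r{lo + 1}"
        entries.append(PEntry(1, 4, (a,)))
        entries.append(PEntry(1, 4, (b,)))
        entries.append(PEntry(5, 8, (a, b)))
    return entries


def allocate(sizes, pentries, resume=True):
    """Map each parameter size to the registers of the pentry chosen for it.

    resume=True  -> proposed strategy: continue after the last pentry used.
    resume=False -> rescan from the start of the list for every parameter.
    """
    used = set()   # registers already handed out
    start = 0      # index of the first pentry to consider when resuming
    result = []
    for size in sizes:
        chosen = None
        for i in range(start if resume else 0, len(pentries)):
            p = pentries[i]
            if p.minsize <= size <= p.maxsize and used.isdisjoint(p.regs):
                chosen = i
                break
        if chosen is None:
            result.append("stack")  # assumption: no valid pentry means stack
            continue
        used.update(pentries[chosen].regs)
        start = chosen + 1
        result.append("+".join(pentries[chosen].regs))
    return result


pentries = build_pentries()
# void g(int arg1, long long arg2, void *arg3, short arg4)
print(allocate([4, 8, 4, 2], pentries))                # ['r3', 'r5+r6', 'r7', 'r8']
print(allocate([4, 8, 4, 2], pentries, resume=False))  # ['r3', 'r5+r6', 'r4', 'r7']
```

With the same pentries, rescanning from the start backfills r4 with arg3, whereas resuming from the last allocated pentry leaves r4 skipped, matching the steps listed above.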

Describe alternatives you've considered

The obvious alternative is to use Custom Storage to manually correct any incorrect allocation, but for more than a few functions, this gets tedious.

Another alternative I considered was editing and importing a .cspec prototype that adds the pentries seen above and then using Custom Storage from there. However, it would be an imported Specification Extension, so I'm not sure how compatible it would be with future versions of Ghidra.

Additional context

During my research into this topic, I found out that there are other ABIs with this behavior of skipping registers: notably, the RISC-V ABI (p. 90) and the ARM ABI (p. 21). Therefore, this new allocation strategy would not be limited in usefulness to just PowerPC code; people disassembling RISC-V or ARM code could make more accurate prototypes with this new allocation strategy as well.

I put a small test case in a little bundle. Originally this was to show an updated prototype that could handle long long parameters more accurately than the normal PowerPC __stdcall prototype by storing them in aligned registers (for a different issue in a different repo), but I believe the premise of this feature request is similar enough that I've also put it here.

stdcall_oe_test.zip

__stdcall_oe.xml - The aforementioned second alternative I considered, where I added pentries for the long long parameters to be able to be stored in registers.
stdcall_oe_test.c - source file with void f(int arg1, void *arg2, long long arg3, short arg4) and void g(int arg1, long long arg2, void *arg3, short arg4) to show the differences in register allocation for different placements of the long long parameter.
stdcall_oe_test.o - compiled with Metrowerks CW 4.3 with the PowerPC EABI[^object].

To test the __stdcall_oe.xml prototype (oe = odd-even pairs), load stdcall_oe_test.o into Ghidra and go to Edit -> Options for <program name> -> Specification Extensions -> Import...

[^object]: However, I've also tested this by compiling with Clang and --target=arm-none-eabi, in which case g() becomes the normal case, and f() becomes the problem case, as the ARM ABI uses even-odd alignment, starting at r0.

emteere commented 11 months ago

There is a change in the works for this type of allocation scheme as well as other types of allocation schemes. I can't give you a time frame for when it will be finished, as another developer is working on it.

As you mention, it can be done manually, but is tedious and would be much better done automatically.

muff1n1634 commented 11 months ago

That's good to hear 🙂
