NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

Mac PPC function pointer calls are decompiled awkwardly #4852

Open Schala opened 1 year ago

Schala commented 1 year ago

Describe the bug
When decompiling a Macintosh PowerPC binary that calls through function pointers, the result at the call site seems bugged. Deleting the function while leaving the subroutine label is a partial workaround, since I can at least see the arguments passed, but Ghidra recreates the broken function on every load.

For example, the Metrowerks CodeWarrior 8.0 standard C library defines the __exit function in Metrowerks CodeWarrior 8.0/MSL/MSL_C/MSL_Common/Src/abort_exit.c:

void _MSL_CDECL (* __console_exit)(void) = 0;
long    __atexit_curr_func = 0;
void (*__atexit_funcs[max_funcs])(void);

void _MSL_CDECL __exit(int status)
{
    // ... snip ...
#if !(__dest_os == __win32_os)
    __begin_critical_region(atexit_funcs_access);

    while (__atexit_curr_func > 0)
        (*__atexit_funcs[--__atexit_curr_func])();

    __end_critical_region(atexit_funcs_access);
#endif

// ... snip ...

    if (__console_exit)
    {
        (*__console_exit)();
        __console_exit = 0;
    }
       ExitToShell();
// ... snip ...
}

Here is the same __exit function decompiled from a binary, using the default analysis options plus scalar operand references. As you can see, the calls through the __console_exit and __atexit_funcs[--__atexit_curr_func] function pointers are decompiled as calls to FUN_1002077c, which in turn contains an unresolved indirect jump through the r12 register:

void .debug::.__exit(int status)

{
  while (0 < ___atexit_curr_func) {
    ___atexit_curr_func = ___atexit_curr_func + -1;
    FUN_1002077c();
  }
  if (__console_exit != (func *)0x0) {
    FUN_1002077c();
    __console_exit = (func *)0x0;
  }
  .glue::ExitToShell();
  return;
}

void FUN_1002077c(void)

{
  code **in_r12;

                    /* WARNING: Could not recover jumptable at 0x1002078c. Too many branches */
                    /* WARNING: Treating indirect jump as call */
  (**in_r12)();
  return;
}

To Reproduce
Steps to reproduce the behavior:

  1. Import a Mac OS X Carbon binary, preferably a Mac OS 9/X hybrid binary that you know calls function pointers in its code
  2. Analyse using the default options and scalar operand references
  3. Each function that calls through a function pointer should show a call to a helper similar to FUN_1002077c from the example above, with the jumptable warning and an indirect call through the r12 register.

Expected behavior
As with Win32/64 binaries, for instance, the decompiled code should contain a direct call through the pointer, similar to (*PTR_DAT_12345678)(foo, bar).
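
To make the expectation concrete, here is a hand-written sketch of roughly what the decompiled __exit might look like if the pointer calls were shown in place. It is not actual Ghidra output. The names atexit_funcs, atexit_curr_func, console_exit, exit_sketch, exit_to_shell_stub, sample_handler and the MAX_FUNCS value are stand-ins I made up so that the snippet compiles on its own.

```c
#include <stddef.h>

#define MAX_FUNCS 64 /* assumed size; the real max_funcs value is not shown */

typedef void (*exit_func)(void);

/* Stand-ins for the MSL globals from abort_exit.c. */
static exit_func atexit_funcs[MAX_FUNCS];
static long atexit_curr_func = 0;
static exit_func console_exit = NULL;

/* Counters so the behaviour can be observed without a real ExitToShell. */
static int shell_exits = 0;
static int handlers_run = 0;

static void exit_to_shell_stub(void) { shell_exits++; }
static void sample_handler(void) { handlers_run++; }

/* The shape hoped for: each indirect call appears at its call site
   instead of being routed through a FUN_1002077c-style helper. */
static void exit_sketch(void)
{
    while (0 < atexit_curr_func) {
        atexit_curr_func = atexit_curr_func + -1;
        (*atexit_funcs[atexit_curr_func])();
    }
    if (console_exit != NULL) {
        (*console_exit)();
        console_exit = NULL;
    }
    exit_to_shell_stub();
}
```

Registering sample_handler in atexit_funcs and console_exit and then calling exit_sketch() runs each handler once, in reverse registration order, before the shell-exit stub.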


beshelto commented 1 year ago

This seems like it's due to the way classic Mac OS on PowerPC uses transition vectors (see https://vintageapple.org/inside_r/pdf/PPC_System_Software_1994.pdf, page 1-27, for more details). It would be nice if Ghidra handled this more cleanly -- for example, in a library, automatically naming each decompiled function after the tvect entry that leads to it.
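
To illustrate, here is a minimal C model of a call through a transition vector. It assumes the usual two-word layout (code address first, then the TOC value for r2) and assumes that FUN_1002077c is pointer-glue code that receives the vector's address in r12. The TVector type, call_via_tvector, sample_callee and the globals are hypothetical names for this sketch, not anything from Ghidra or the MSL sources.

```c
#include <stddef.h>

/* Hypothetical model of a transition vector: the code address comes
   first, followed by the TOC value the callee expects in r2. */
typedef struct TVector {
    void (*code)(void);
    void *toc;
} TVector;

/* Stand-ins for machine state; these names are illustrative only. */
static void *current_toc;        /* plays the role of r2 */
static void *toc_seen_by_callee; /* recorded by the sample routine */
static int calls_seen;

static void sample_callee(void)
{
    toc_seen_by_callee = current_toc;
    calls_seen++;
}

/* Roughly what a glue routine does with the pointer in r12:
   load the code address and TOC from the vector, then branch. */
static void call_via_tvector(const TVector *r12)
{
    void *saved_toc = current_toc; /* caller's r2 */
    current_toc = r12->toc;
    r12->code();                   /* corresponds to (**in_r12)() */
    current_toc = saved_toc;       /* caller's r2 is restored afterwards */
}
```

Seen this way, the function pointer stored in __console_exit would be the address of such a vector rather than of the code itself, which would explain why the decompiler sees a double dereference of r12 in the helper instead of a plain call at the original site.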