Open BlackMagicCoding opened 7 months ago
Heyo @ryanmkurtz and others, just wanted to poke a little, asking if this is something you still might consider, or if you think it is very unlikely. I am absolutely not in a hurry at all, just wanted to reel in some form of update on it ^^ Best regards, BMC
I directed the ticket to the right team member.
Ahoi!
Context
I have been reversing a fairly complex program for a few months now, which includes library functions for interfacing with the Lua scripting language. So far I have named and typed a bunch of key Lua functions that are referenced a lot, so some parameters of other functions calling them should be inferable as well. This is not the case in my repository, though. It seems that the parameter types have somehow been hard set to `int` instead of `undefined4`, which would automatically be displayed as a more specific type in the decompilation view if it can be inferred via function calls inside etc. The hard-set types might be caused by me running the `Auto Analyze` action with the `Decompiler Parameter ID` option once - oops... could be wrong though. My goal is now to fix those parameter types (and some names) in bulk via script, when I am certain that they can be inferred correctly.
Approach & Example
Here is a function I have not yet addressed manually, `FUN_0048c5b0`, which calls 2 Lua functions: `lua_pushnumber` and `lua_pushnil`. As you can see, the decompilation shows `param_1` cast as `lua_State *`, since both take that as an argument. This way I can infer that `param_1` is of type `lua_State *` and is named `L` (that's just how things are in those Lua functions). For completeness' sake and context, here are the function signatures for both:
void __cdecl lua_pushnumber(lua_State *L,lua_Number n)
void __cdecl lua_pushnil(lua_State *L)
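To make the inference step concrete, here is a minimal sketch of the text search in plain Python, runnable outside Ghidra. It assumes the decompiled C is already available as a string (in a Ghidra script that text would typically come from `getDecompiledFunction().getC()`). The names `CAST_RE` and `find_lua_state_casts` and the sample function body are made up for illustration and are not real decompiler output.

```python
import re

# Matches a cast such as "(lua_State *)param_1" and captures the identifier.
# The trailing lookahead rejects identifiers followed by "(", so a cast of a
# call result like "(lua_State *)FUN_00401000(...)" is not reported.
CAST_RE = re.compile(r"\(\s*lua_State\s*\*\s*\)\s*([A-Za-z_]\w*)\b(?!\s*\()")


def find_lua_state_casts(c_code):
    """Return distinct variable names cast to lua_State *, in order of first use."""
    seen = []
    for match in CAST_RE.finditer(c_code):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


# Hypothetical decompiler output, loosely modelled on the example above.
sample = """
void FUN_0048c5b0(int param_1,int param_2)
{
  if (param_2 == 0) {
    lua_pushnil((lua_State *)param_1);
  }
  else {
    lua_pushnumber((lua_State *)param_1,(lua_Number)param_2);
  }
  return;
}
"""

print(find_lua_state_casts(sample))  # ['param_1']
```

A plain text match like this cannot tell a parameter from a local variable, so any hit would still need to be checked against the function's actual parameter list before changing a type.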
I already whipped up a very simplistic Ghidra script to gather and print info. This script basically runs over all 29564 functions, gets their decompiled C code, does a string search for `(lua_State *)` and gets the trailing variable name (if it really is a variable cast). Here is the code for that script (please excuse the very crude code, it's just a work in progress lol):
Improvements
As you might guess, this is rather slooow. It seems that it does a fresh decompile of each and every function when calling `getDecompiledFunction()`, instead of loading something from a cache, even when running the same script again and again without altering anything. My goal is to adjust the script later to not only print when finding those casts, but also alter the function parameter via script, and then run it multiple times afterwards, since those alterations will lead to `lua_State *` casts popping up at new places, because the altered functions now require them. If I am not completely mistaken, IDA saves its decompilation output persistently in its database/repository and loads it when looking at the same function again. Such a thing would be a huge boost in performance when either dealing with chonky functions or doing bulk edits like me.
Implementation
I am absolutely aware that this is the exact opposite of a trivial matter, and that there certainly will be some further questions popping up when starting to implement. Here is what came to my mind when thinking about the implementation:
ghidra.program.model.listing.Function
4599
2424
1871
5730
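To illustrate the caching idea in the abstract, here is a minimal Python sketch of a decompile cache that only recomputes when a function's "fingerprint" changes. Everything in it is hypothetical: `DecompileCache`, `fake_decompile` and the choice of fingerprint are stand-ins for illustration, not Ghidra API. What a real fingerprint would have to cover (signature, body bytes, callee signatures, data types, ...) is exactly the open question.

```python
class DecompileCache:
    """Memoize an expensive decompile call, keyed by function and fingerprint."""

    def __init__(self, decompile):
        self._decompile = decompile  # expensive callable: key -> C text
        self._entries = {}           # key -> (fingerprint, C text)
        self.misses = 0              # how often the expensive call really ran

    def get(self, key, fingerprint):
        cached = self._entries.get(key)
        if cached is not None and cached[0] == fingerprint:
            return cached[1]         # unchanged since last time: reuse
        self.misses += 1
        text = self._decompile(key)
        self._entries[key] = (fingerprint, text)
        return text


# Stand-in for the real decompiler; the signature doubles as the fingerprint.
signatures = {"FUN_0048c5b0": "void FUN_0048c5b0(int param_1)"}


def fake_decompile(name):
    return signatures[name] + " { ... }"


cache = DecompileCache(fake_decompile)
cache.get("FUN_0048c5b0", signatures["FUN_0048c5b0"])  # miss: first request
cache.get("FUN_0048c5b0", signatures["FUN_0048c5b0"])  # hit: nothing changed

# Retyping the parameter changes the fingerprint, so the entry is recomputed.
signatures["FUN_0048c5b0"] = "void FUN_0048c5b0(lua_State *L)"
text = cache.get("FUN_0048c5b0", signatures["FUN_0048c5b0"])

print(cache.misses)  # 2
```

The point of the fingerprint is that a bulk-edit script running multiple passes would only pay for the functions whose inputs actually changed between passes. A persistent version would additionally need to store the entries in the program database.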
What are your thoughts on this?
Best regards, BMC