matt-kempster / m2c

A MIPS and PowerPC decompiler.
GNU General Public License v3.0

s32 typedef should be long instead of int in PPC code #238

Open camthesaxman opened 2 years ago

camthesaxman commented 2 years ago

The Gamecube/Wii headers typedef s32 as long, whereas m2c's default typedefs assume that s32 is int. https://github.com/matt-kempster/m2c/blob/master/src/c_types.py#L554

This ends up changing function parameters in the context from int to s32 and from unsigned int to u32. For example, if a function being decompiled is declared void myfunc(int arg0); in the context, it will get changed to void myfunc(s32 arg0) in the decompiler output.

One solution to this problem would be to only add these default typedefs if they are not present in the scratch, which will allow the user to override them.
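A minimal sketch of that idea, assuming typedefs can be modelled as a plain name-to-type mapping. `DEFAULT_TYPEDEFS` and `merge_default_typedefs` are hypothetical names for illustration, not m2c's actual code or data:

```python
from typing import Dict

# Hypothetical defaults for illustration; not copied from m2c.
DEFAULT_TYPEDEFS: Dict[str, str] = {
    "s8": "signed char",
    "u8": "unsigned char",
    "s16": "short",
    "u16": "unsigned short",
    "s32": "int",
    "u32": "unsigned int",
}


def merge_default_typedefs(context_typedefs: Dict[str, str]) -> Dict[str, str]:
    """Return the context's typedefs plus any defaults the context did not define."""
    merged = dict(context_typedefs)  # copy so the caller's mapping is not mutated
    for name, underlying in DEFAULT_TYPEDEFS.items():
        # setdefault leaves an existing (user-supplied) definition untouched.
        merged.setdefault(name, underlying)
    return merged


# A GameCube/Wii-style context that defines s32 and u32 in terms of long.
context = {"s32": "long", "u32": "unsigned long"}
merged = merge_default_typedefs(context)
```

Here `merged["s32"]` stays `"long"` because the context already defined it, while names the context omits, such as `s16`, fall back to the defaults.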

zbanks commented 2 years ago

I think the issue here is more complicated than the built-in typedefs.

My understanding is that long and int both represent signed 4-byte integers for most GC/Wii games (scratch). m2c does assume this in primitive_size. (This could be wrong for other flags/compilers/arches, but I don't think that's what you're asking about here.)

However, the add_builtin_typedefs you link to are only used for parsing the context: they only come into play when e.g. s32 appears in the context, and they aren't involved in naming the output type.

Instead, the reason s32 shows up in the output type even though it was an int or long in the context file is because of Type._to_ctype. This function will never call an int type "int" or "long": it will always use the "{s,u}{8,16,32,64}" naming convention. (It only looks at typedefs to name structs, more as a special case.)
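To illustrate the naming convention, here is a rough sketch, not the real Type._to_ctype. The `PRIMITIVES` table and both function names are made up for this example, and the sizes assume the 32-bit GC/Wii case described above:

```python
from typing import Dict, Tuple

# Hypothetical table: C spelling -> (size in bytes, signed?).
PRIMITIVES: Dict[str, Tuple[int, bool]] = {
    "short": (2, True),
    "int": (4, True),
    "long": (4, True),
    "unsigned int": (4, False),
    "unsigned long": (4, False),
}


def sized_int_name(size_bytes: int, signed: bool) -> str:
    """Name an integer purely from its size and signedness."""
    if size_bytes not in (1, 2, 4, 8):
        raise ValueError(f"unsupported integer size: {size_bytes}")
    prefix = "s" if signed else "u"
    return f"{prefix}{size_bytes * 8}"


def output_name(c_spelling: str) -> str:
    """Map a C spelling to its output name; the original spelling is discarded."""
    size_bytes, signed = PRIMITIVES[c_spelling]
    return sized_int_name(size_bytes, signed)
```

With this scheme, `output_name("int")` and `output_name("long")` both give `"s32"`, since only the size and signedness feed into the name.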

Unfortunately, I think making m2c's type system know the difference between int and long would be really tricky. To the decompiler, they look equivalent. And in m2c, when two Types are equivalent, they become fully interchangeable: the Type.unify function will literally make them have the same TypeData object. (And this is really useful most of the time! It'd be hard to relax.)
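As a toy model of why the distinction gets lost (this is a simplified union-find sketch, not m2c's actual Type or TypeData classes; the `spelling` field exists only to make the effect visible):

```python
from typing import Optional


class TypeData:
    def __init__(self, size_bytes: int, signed: bool, spelling: str) -> None:
        self.size_bytes = size_bytes
        self.signed = signed
        self.spelling = spelling  # kept only for this demonstration
        self.parent: Optional["TypeData"] = None

    def root(self) -> "TypeData":
        """Follow parent links to the representative TypeData."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node


class Type:
    def __init__(self, data: TypeData) -> None:
        self._data = data

    def data(self) -> TypeData:
        return self._data.root()

    def unify(self, other: "Type") -> bool:
        """Merge two types if they are compatible; return whether it succeeded."""
        a, b = self.data(), other.data()
        if a is b:
            return True
        if (a.size_bytes, a.signed) != (b.size_bytes, b.signed):
            return False
        b.parent = a  # from now on both Types resolve to the same TypeData
        return True


declared_int = Type(TypeData(4, True, "int"))
declared_long = Type(TypeData(4, True, "long"))
unified = declared_int.unify(declared_long)
```

After the call, `declared_long.data()` is the very same object as `declared_int.data()`, so the "long" spelling is no longer reachable from either Type.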

That being said, it might be possible to work around this specifically for function definitions? We could try to use the "raw" CType function declaration directly from the context if it's available, instead of building the definition from the derived Type?
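Roughly what I have in mind, as a sketch only. `format_signature` and the `raw_declarations` mapping are hypothetical and assume the context's declarations are available as source text keyed by function name:

```python
from typing import Dict, List, Tuple


def format_signature(
    name: str,
    derived_return: str,
    derived_params: List[Tuple[str, str]],
    raw_declarations: Dict[str, str],
) -> str:
    """Prefer the declaration as written in the context; else build one from derived types."""
    raw = raw_declarations.get(name)
    if raw is not None:
        # Reuse the user's text, dropping surrounding whitespace and the trailing semicolon.
        return raw.strip().rstrip(";")
    params = ", ".join(f"{ctype} {pname}" for ctype, pname in derived_params) or "void"
    return f"{derived_return} {name}({params})"


raw_declarations = {"myfunc": "void myfunc(int arg0);"}
from_context = format_signature("myfunc", "void", [("s32", "arg0")], raw_declarations)
from_derived = format_signature("other", "void", [("s32", "arg0")], raw_declarations)
```

In this sketch `from_context` keeps the user's `int`, while `from_derived` falls back to the `s32` spelling because no declaration for `other` exists in the context.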

(Separately, it would be straightforward to add a flag to always emit int (or long) instead of s32, though that doesn't actually address the original problem!)

camthesaxman commented 2 years ago

That makes sense. It would be a mess to keep track of types everywhere. But I think in the case of function prototypes for a decompiled function, it should copy exactly the types that the user specified in the context instead of rewriting them. I know it's able to do this with enums already and not have them decay to a generic s32.