matt-kempster / m2c

A MIPS and PowerPC decompiler.
GNU General Public License v3.0

Decompiler doesn't work for me #278

Open mediotex opened 1 week ago

mediotex commented 1 week ago

I tried your m2c decompiler. I installed the dependencies and ran it against raw MIPS machine code (image.out), but the decompiler does not work for me. Possibly I specified incomplete or incorrect options, though.

image.out is the result of extracting the raw MIPS code with the Broadcom ProgramStore tool. Compiler: C++.

ProgramStore -x -f file.bin -o image.out

The code is loaded at RAM address 0x80004000. Can you advise?

image.out.zip

simonlindholm commented 1 week ago

Usage instructions are somewhat hidden in the README, but they are there:

The input is expected to match the GNU as assembly format, produced by tools like spimdisasm. See the tests/ directory for some example input and output.

Which is to say, m2c isn't able to disassemble MIPS on its own, you need to do that yourself first. This may involve applying some heuristics to detect and give names to symbols, e.g. you may want to say lui $a0, %hi(mysymbol), lw $a0, %lo(mysymbol)($a0) instead of lui $a0, 0x1234, lw $a0, 0x5678($a0) for better results.
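
As a worked illustration of the lui/lw pairing mentioned above, here is a minimal Python sketch of how a 32-bit address splits into the two 16-bit immediates. The function names are hypothetical and not part of m2c or spimdisasm. The one subtlety is that lw and addiu sign-extend their immediate, so the high half is rounded up whenever the low half has its top bit set.

```python
from typing import Tuple


def split_hi_lo(address: int) -> Tuple[int, int]:
    """Split a 32-bit address into the %hi and %lo immediates."""
    lo = address & 0xFFFF
    # The low half is sign-extended by lw/addiu, so add 0x8000 before
    # shifting to compensate when bit 15 of the low half is set.
    hi = ((address + 0x8000) >> 16) & 0xFFFF
    return hi, lo


def join_hi_lo(hi: int, lo: int) -> int:
    """Recombine a lui immediate with a sign-extended 16-bit offset."""
    signed_lo = lo - 0x10000 if lo & 0x8000 else lo
    return ((hi << 16) + signed_lo) & 0xFFFFFFFF


if __name__ == "__main__":
    # lui $a0, 0x1234 followed by lw $a0, 0x5678($a0)
    print(hex(join_hi_lo(0x1234, 0x5678)))
    # An address whose low half has bit 15 set needs a rounded-up %hi.
    print([hex(part) for part in split_hi_lo(0x8000F000)])
```

A disassembler that recognises such pairs can replace the two raw immediates with %hi(symbol) and %lo(symbol), which is the form that gives m2c more to work with.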

mediotex commented 1 week ago

I have this code disassembled with MIPS bytecode disassembler which looks like this:

80004000:   40809000    mtc0 $0,$18,0
80004004:   00000000    sll $0,$0,0x0
80004008:   00000000    sll $0,$0,0x0
8000400c:   00000000    sll $0,$0,0x0
80004010:   40809800    mtc0 $0,$19,0
80004014:   00000000    sll $0,$0,0x0
80004018:   00000000    sll $0,$0,0x0
8000401c:   00000000    sll $0,$0,0x0
80004020:   40806800    mtc0 $0,$13,0
80004024:   00000000    sll $0,$0,0x0
80004028:   3c021000    lui $2,0x1000
8000402c:   3442ff00    ori $2,$2,0xff00
80004030:   40826000    mtc0 $2,$12,0
80004034:   00000000    sll $0,$0,0x0
80004038:   00000000    sll $0,$0,0x0
8000403c:   00000000    sll $0,$0,0x0
80004040:   24020002    addiu $2,$0,2
80004044:   40828000    mtc0 $2,$16,0
80004048:   00000000    sll $0,$0,0x0
8000404c:   00000000    sll $0,$0,0x0
80004050:   00000000    sll $0,$0,0x0
80004054:   0c001085    jal 0x80004214
80004058:   00000000    sll $0,$0,0x0
8000405c:   40026000    mfc0 $2,$12,0
80004060:   00000000    sll $0,$0,0x0
80004064:   3c03ffff    lui $3,0xffff
80004068:   346300ff    ori $3,$3,0xff
8000406c:   00431024    and $2,$2,$3
80004070:   40826000    mtc0 $2,$12,0
80004074:   00000000    sll $0,$0,0x0
80004078:   00000000    sll $0,$0,0x0
8000407c:   00000000    sll $0,$0,0x0
... ...

Is it suitable for m2c?

simonlindholm commented 1 week ago

Roughly. It chokes on mtc0 and mfc0 because they're supposed to take two arguments, but other than that:

glabel foo
mtc0 $0,$18
sll $0,$0,0x0
sll $0,$0,0x0
sll $0,$0,0x0
mtc0 $0,$19
sll $0,$0,0x0
sll $0,$0,0x0
sll $0,$0,0x0
mtc0 $0,$13
sll $0,$0,0x0
lui $2,0x1000
ori $2,$2,0xff00
mtc0 $2,$12
sll $0,$0,0x0
sll $0,$0,0x0
sll $0,$0,0x0
addiu $2,$0,2
mtc0 $2,$16
sll $0,$0,0x0
sll $0,$0,0x0
sll $0,$0,0x0
jal 0x80004214
sll $0,$0,0x0
mfc0 $2,$12
sll $0,$0,0x0
lui $3,0xffff
ori $3,$3,0xff
and $2,$2,$3
mtc0 $2,$12
sll $0,$0,0x0
sll $0,$0,0x0
sll $0,$0,0x0
Warning: missing "jr $ra" in last block (.initial).

void foo(void) {
    M2C_ERROR(/* mtc0 $0, $18 */);
    M2C_ERROR(/* mtc0 $0, $19 */);
    M2C_ERROR(/* mtc0 $0, $13 */);
    M2C_ERROR(/* mtc0 $2, $12 */);
    M2C_ERROR(/* mtc0 $2, $16 */);
    (? (*)())0x80004214();
    M2C_ERROR(/* mtc0 $2, $12 */);
}

We don't support mtc0/mfc0 because no compilers generate them, so it hasn't been prioritized. The output is pretty much garbage, but it does work.
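
The manual cleanup applied to the listing above (dropping the address and raw-word columns, removing the third operand from mtc0/mfc0, and adding a glabel line) could be scripted roughly as follows. This is only a sketch: it assumes the exact three-column listing format shown earlier in this thread, and the function name and the default label foo are illustrative rather than anything m2c provides.

```python
import re

# One listing line: 8-digit address, colon, 8-digit raw word, instruction text.
LINE_RE = re.compile(r"^\s*([0-9a-fA-F]{8}):\s+([0-9a-fA-F]{8})\s+(.*\S)\s*$")
# mtc0/mfc0 written with a trailing ",0" third operand.
COP0_RE = re.compile(r"^(m[tf]c0\s+\S+?,\S+?),0$")


def listing_to_m2c_input(listing: str, label: str = "foo") -> str:
    """Turn an address/word/instruction listing into glabel-style asm text."""
    out = [f"glabel {label}"]
    for line in listing.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip blank lines and "..." placeholders
        # Drop the ",0" operand; any other third operand is left untouched.
        out.append(COP0_RE.sub(r"\1", match.group(3)))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        "80004000:   40809000    mtc0 $0,$18,0\n"
        "80004004:   00000000    sll $0,$0,0x0\n"
        "8000405c:   40026000    mfc0 $2,$12,0\n"
    )
    print(listing_to_m2c_input(sample), end="")
```

It treats the whole listing as a single function, which matches the experiment above but is not how a real disassembler would split code into functions.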

I would recommend spimdisasm for the disassembler, it would give you nicer asm (e.g. nops instead of sll $0, $0, 0, and register names).
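
For context on the nop remark: the word 00000000 in the listing is an R-type instruction with every field zero, which decodes as sll $0,$0,0 and is conventionally printed as nop. A small Python sketch of the standard MIPS R-type field layout shows this (the helper name is made up for illustration):

```python
def decode_r_type(word: int) -> dict:
    """Split a 32-bit MIPS instruction word into its R-type fields."""
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,
    }


if __name__ == "__main__":
    # All fields zero: opcode 0 with funct 0 is sll, on register $0.
    print(decode_r_type(0x00000000))
    # 00431024 from the listing: and $2,$2,$3 (funct 0x24).
    print(decode_r_type(0x00431024))
```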

mediotex commented 1 week ago

I will try spimdisasm. What are the proper command options for m2c in my case?

simonlindholm commented 1 week ago

./m2c.py input.s should work, pretty much. --help to see more options, --context is the most important other one but it's optional. You can play around with https://simonsoftware.se/other/m2c.html too if that's easier.

mediotex commented 1 week ago

I installed spimdisasm but can't figure out the correct command format. It has a vast number of options, and the readme isn't too helpful.

simonlindholm commented 1 week ago

Yeah, it's a bit complex... normally in the N64 matching decomp world it's used in combination with splat (which is somewhat complex to set up too); I don't know if matching decomp is what you're interested in though. You can join the decomp.me discord if you're interested in getting in touch with people who know this tooling better.

Ghidra is also pretty decent, especially if you don't care about matching decomp or you're targeting code compiled by a more modern compiler.

mediotex commented 1 week ago

Well, first I'd like to try spimdisasm + m2c, to see what they can do. I'm just a little confused about the spimdisasm command format I need to run.

simonlindholm commented 1 week ago

From the readme singleFileDisasm looks like the right thing. I haven't used it myself though; as mentioned you can ask people in the decomp.me discord if you have troubles.

mediotex commented 1 week ago

Yes, I used this command; where I got confused is just how to specify the input file and output file positional arguments. EDIT: ok, I got a disassembled file, size 294,2 MiB: should I process it as a whole with ./m2c.py input.s or split it into parts?

mediotex commented 1 week ago
~/m2c-master$ python3 m2c.py /home/testlab/image1_00000000.text.s
/*
Decompilation failure in function T_00000000:

Function T_00000000 contains no instructions. Maybe it is rodata?
*/

file

simonlindholm commented 1 week ago

EDIT: ok, I got a disassembled file, size 294,2 MiB: should I process it as a whole with ./m2c.py input.s or split it into parts?

Up to you, either should work. I doubt you have 300 MB of code though, most of that is probably data.

The file you posted has a lot of invalid instructions, which is probably why it failed to disassemble. But again, this isn't really the right forum for that.

mediotex commented 1 week ago

Invalid instructions when disassembling with spimdisasm?

simonlindholm commented 1 week ago

Yes, the .s file clearly says so if you open it and look at what it contains.

mediotex commented 1 week ago

So, I used the recommended utility (spimdisasm) to disassemble the file, but it didn't work because it can't handle all types of files and formats and has limited capabilities.

simonlindholm commented 1 week ago

I don't think that's the right takeaway, I think it's more likely a usage error in trying to disassemble something as code that isn't actually code.

mediotex commented 1 week ago

So I tried again with a new disassembled file. The output still has some invalid instructions, but it's not possible to avoid them, and the m2c decompiler still fails. Is it only able to process simple kinds of files, like Nintendo games, etc.?

error.zip

simonlindholm commented 6 days ago

No, both spimdisasm and m2c are general-purpose tools. m2c erroring here is expected, since you're not giving it any asm instructions to process. It definitely looks like you're using spimdisasm wrong; you're somehow both ending up with symbol names from the n64 stdlib and not disassembling anything. Again, I can't help you with that; please ask someone who knows that tool.

mediotex commented 6 days ago

That's strange. I used the command python3 -m spimdisasm singleFileDisasm image.out out_dir --start 0x1F0 --end 0x623C --vram 0x80000300, which the spimdisasm maintainer provided; he said the disassembly is almost complete except for those invalid instructions.
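
To make the numbers in that command concrete, here is a small Python sketch of the arithmetic. It assumes that --start and --end are byte offsets into image.out and that --vram is the address assigned to the byte at --start; that reading is an assumption on my part, not something confirmed in this thread. The function names are hypothetical.

```python
def describe_range(start: int, end: int, vram: int) -> dict:
    """Summarise a [start, end) file range mapped to a vram base address."""
    size = end - start
    return {
        "size_bytes": size,
        "instruction_words": size // 4,  # MIPS instructions are 4 bytes each
        "vram_end": vram + size,
    }


def file_offset_to_vram(offset: int, start: int, vram: int) -> int:
    """Map a file offset inside the range to its assumed vram address."""
    return vram + (offset - start)


if __name__ == "__main__":
    info = describe_range(0x1F0, 0x623C, 0x80000300)
    print(info["size_bytes"], info["instruction_words"], hex(info["vram_end"]))
    print(hex(file_offset_to_vram(0x1F0, 0x1F0, 0x80000300)))
```

Under that assumption the range covers 0x604C bytes, i.e. about 24 KiB of code, far less than the 294,2 MiB output from the earlier attempt.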