Decompollaborate / spimdisasm

MIPS disassembler
https://pypi.org/project/spimdisasm/
MIT License

help file #166

Closed mediotex closed 4 months ago

mediotex commented 4 months ago

I installed spimdisasm, but I have a problem: it's unclear how to set up the correct command to disassemble raw MIPS machine code (image.out).

~/.local/bin$ python3 spimdisasm singleFileDisasm --arch-level MIPS32 --endian little binary image1.out >result
Warning: globalSegment's will has its vromStart equal to the vromEnd (0x0)
Traceback (most recent call last):
  File "/home/testkon/.local/bin/spimdisasm", line 8, in <module>
    sys.exit(cliMain())
  File "/home/testkon/.local/lib/python3.9/site-packages/spimdisasm/frontendCommon/FrontendUtilities.py", line 273, in cliMain
    return int(args.func(args))
  File "/home/testkon/.local/lib/python3.9/site-packages/spimdisasm/singleFileDisasm/SingleFileDisasmInternals.py", line 186, in processArguments
    fec.FrontendUtilities.writeProcessedFiles(processedFiles, processedFilesOutputPaths, processedFilesCount, progressCallback)
  File "/home/testkon/.local/lib/python3.9/site-packages/spimdisasm/frontendCommon/FrontendUtilities.py", line 152, in writeProcessedFiles
    mips.FilesHandlers.writeSection(filePath, f)
  File "/home/testkon/.local/lib/python3.9/site-packages/spimdisasm/mips/FilesHandlers.py", line 63, in writeSection
    path.parent.mkdir(parents=True, exist_ok=True)
  File "/usr/lib/python3.9/pathlib.py", line 1312, in mkdir
    self._accessor.mkdir(self, mode)
FileExistsError: [Errno 17] File exists: 'image1.out'

image.out.zip

AngheloAlf commented 4 months ago

It is complaining because you passed the arguments in the wrong order: it is interpreting binary as the input file and image1.out as the desired output directory. Also, the zip file you posted contains an image.out file, not an image1.out.

Btw, are you sure this file contains MIPS instructions? It doesn't seem to be the case, at least not at the beginning of the file. If the executable part of the binary is not at the beginning of the file, you may want to use the --start and --end parameters to specify the offsets within the file. You'll probably want to pass the --vram parameter too.
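The argument-order mix-up described above can be reproduced with a small stand-in parser. This is only a sketch: `build_parser` and `parse` are hypothetical helpers that mimic the two positionals from the tool's help text (`binary`, then `output`) plus an `--endian` option, and they are not spimdisasm's real argument parser.

```python
import argparse


def build_parser():
    # Hypothetical stand-in: mimics two positionals (binary, output) only.
    # It is NOT spimdisasm's actual parser.
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("binary", help="Path to input binary")
    parser.add_argument("output", help="Path to output")
    parser.add_argument("--endian", choices=["big", "little"], default="big")
    return parser


def parse(argv):
    # Positionals are filled strictly in the order they appear in argv.
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    # The literal word "binary" is consumed as the input path.
    wrong = parse(["--endian", "little", "binary", "image1.out"])
    print("input:", wrong.binary, "output:", wrong.output)

    # Input path first, output directory second.
    right = parse(["--endian", "little", "image1.out", "out_dir"])
    print("input:", right.binary, "output:", right.output)
```

In the first call the word `binary` lands in the input slot and `image1.out` in the output slot, which matches the `FileExistsError` raised when the tool tried to create a directory named `image1.out`.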

mediotex commented 4 months ago

On my first try I just followed the help text for the positional arguments:

 binary                Path to input binary
 output                Path to output. Use '-' to print to stdout instead

I was wrong here; I realized it afterwards. image1.out is the same file as image.out; I just tried it on another laptop. I got results, but the output contains plenty of invalid instructions and is not suitable for further use. Yes, it's raw MIPS machine code, and it's loaded at RAM address 0x80004000. Attached is the output of a binwalk scan for common signatures (-B option) and executable opcode signatures (-A option).

binwalk.zip

Also, binwalk has references to both little endian and big endian. I'm not sure which endianness is the correct one to specify. Can you show a full command to disassemble this file?

AngheloAlf commented 4 months ago

As I mentioned in my earlier comment, you probably need to pass --start and --end to specify the offsets within the binary file. Assuming the output you posted is correct, you could try disassembling that first chunk of code by passing --start 0x1F0 --end 0x623C --vram 0x80000300, so something like this:

$ python3 -m spimdisasm singleFileDisasm image.out out_dir --start 0x1F0 --end 0x623C --vram 0x80000300

(spimdisasm assumes big endian by default)

If you get any chunk that does not get disassembled but you know it should be valid code, then you could add --disasm-unknown and see what it produces.
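To make the role of the three parameters concrete, here is a minimal sketch of how a file offset inside the selected range relates to a virtual address. It assumes that the byte at `--start` is placed at the `--vram` address and that `--end` is exclusive; both are assumptions for illustration, and `slice_region` and `offset_to_vram` are hypothetical helper names, not spimdisasm functions.

```python
def slice_region(data, start, end):
    # Return the bytes in [start, end); end is treated as exclusive here,
    # which is an assumption about how the range is interpreted.
    if not (0 <= start <= end <= len(data)):
        raise ValueError("need 0 <= start <= end <= len(data)")
    return data[start:end]


def offset_to_vram(offset, start, vram):
    # Assumes the byte at file offset `start` is loaded at address `vram`.
    return vram + (offset - start)


if __name__ == "__main__":
    # Values taken from the command quoted in this thread.
    START, END, VRAM = 0x1F0, 0x623C, 0x80000300
    print("range size:", hex(END - START))
    print("first address:", hex(offset_to_vram(START, START, VRAM)))
    print("end address:", hex(offset_to_vram(END, START, VRAM)))
```

With the values from the command above, the range is 0x604C bytes long and, under these assumptions, spans 0x80000300 up to 0x8000634C.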

mediotex commented 4 months ago

Now it's much better, but there are still a lot of / invalid instruction / marks in the output, and I'm not sure whether those should be valid code or not. I ran a 2nd test with the --disasm-unknown argument; both outputs are attached. Is it possible to determine the correct endianness for that binary? And how can it affect the disassembly result?

output.zip

AngheloAlf commented 4 months ago

From what you posted, it seems the endianness is correct; otherwise you would get garbage instead of readable assembly. The "invalid instructions" may have two causes: either those are just data instead of actual instructions, or they are part of a MIPS extension. spimdisasm only knows about a few extensions, mainly console-related ones like those of Sony consoles; you can try them with the --instr-category parameter. If this binary is not from a game then you'll be a bit out of luck.
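One rough way to sanity-check the endianness outside the disassembler is to count how often a very common instruction word shows up under each byte order. The sketch below counts `jr $ra` (encoded as 0x03E00008), which ends most compiled functions. It is a heuristic only: it assumes the data is 4-byte aligned from its first byte, and `count_jr_ra` and `guess_endian` are hypothetical helpers, not part of spimdisasm.

```python
import struct

JR_RA = 0x03E00008  # encoding of `jr $ra`


def count_jr_ra(data, endian):
    # Read the buffer as 32-bit words in the given byte order and count
    # how many of them equal the `jr $ra` encoding.
    fmt = ">I" if endian == "big" else "<I"
    usable = len(data) - (len(data) % 4)  # iter_unpack needs whole words
    words = struct.iter_unpack(fmt, data[:usable])
    return sum(1 for (word,) in words if word == JR_RA)


def guess_endian(data):
    big = count_jr_ra(data, "big")
    little = count_jr_ra(data, "little")
    if big == little:
        return "unknown"
    return "big" if big > little else "little"


if __name__ == "__main__":
    sample = bytes.fromhex("03E00008" "00000000") * 3
    print(guess_endian(sample))
```

A clear majority under one byte order is a reasonable hint; a tie or very low counts on both sides means the heuristic says nothing.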

mediotex commented 4 months ago

Sorry, I uploaded the wrong output for image1_80000300.text.s [294.2 MiB]; that large file was the result of the first attempt to disassemble with the wrong command. The output with the correct command is 291.5 KiB and has only 69 invalid instruction marks. I reuploaded the correct file.
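Counting the marks by hand gets tedious when comparing several runs. A small sketch like the following tallies them from the generated `.s` file. The marker strings are the ones quoted in this thread; the exact text spimdisasm emits is an assumption here, so adjust `MARKERS` to whatever appears in your output. `count_markers` is a hypothetical helper.

```python
from collections import Counter

# Marker texts as quoted in this thread; the exact strings are an assumption.
MARKERS = ("invalid instruction", "INVALID", "handwritten instruction")


def count_markers(lines, markers=MARKERS):
    # Count, per marker, how many lines contain it (case-sensitive match).
    counts = Counter()
    for line in lines:
        for marker in markers:
            if marker in line:
                counts[marker] += 1
    return counts


if __name__ == "__main__":
    sample = [
        ".word 0x12345678 /* invalid instruction */",
        "nop",
        "mfc0 $t0, $12 /* handwritten instruction */",
    ]
    for marker, total in count_markers(sample).items():
        print(marker, total)
```

To use it on a real output, pass an open file object instead of the sample list, since iterating over a text file yields its lines.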

AngheloAlf commented 4 months ago

Did you try what I mentioned earlier? It should still apply.

mediotex commented 4 months ago

With the --instr-category argument, where the category is 'cpu'?

$ python3 -m spimdisasm singleFileDisasm image.out out_dir --start 0x1F0 --end 0x623C --vram 0x80000300 --instr-category cpu

AngheloAlf commented 4 months ago

No, cpu is the default. You could try the other ones.
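Trying each category by hand is repetitive, so a short script can generate one command line per category. This sketch only builds and prints the commands; it does not run them. The flags and offsets are the ones used in this thread, the category names are those that come up elsewhere in this issue (check the tool's help output for the list your version accepts), and the `out_<category>` directory naming is a hypothetical convention.

```python
import shlex

BASE_ARGS = ["python3", "-m", "spimdisasm", "singleFileDisasm"]

# Category names mentioned in this thread; treat the list as an assumption.
CATEGORIES = ["rsp", "r3000gte", "r4000allegrex", "r5900"]


def build_command(binary, out_dir, category):
    # Input path first, output directory second, then the options.
    return BASE_ARGS + [
        binary, out_dir,
        "--start", "0x1F0",
        "--end", "0x623C",
        "--vram", "0x80000300",
        "--disasm-unknown",
        "--instr-category", category,
    ]


if __name__ == "__main__":
    for category in CATEGORIES:
        command = build_command("image.out", f"out_{category}", category)
        print(shlex.join(command))
```

Writing each run to its own output directory keeps the results side by side, which makes it easier to compare how many invalid marks each category leaves.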

mediotex commented 4 months ago

python3 -m spimdisasm singleFileDisasm image1.out out_dir2 --start 0x1F0 --end 0x623C --vram 0x80000300 --disasm-unknown --instr-category rsp

# INVALID: 196 marks

rsp_image1_80000300.text.s.zip

mediotex commented 4 months ago

With r3000gte, r4000allegrex, r5900: 5 marked #INVALID and 227 marked "handwritten instruction". What does 'handwritten instruction' mean?

instructions_image1_80000300.text.s.zip

AngheloAlf commented 4 months ago

The "handwritten instruction" message is just a hint to tell the user the given instruction is usually not emitted normally by C compilers, so it may have been part of handwritten assembly instead of generated assembly from a compiler.

Since none of the MIPS extensions covers all the invalid ones, this may be just another extension we don't know of. Where does this binary come from? That may give you a clue about which instruction set is being used by that binary.

mediotex commented 4 months ago

It's cable modem firmware; it uses the Broadcom eCos real-time operating system for the MIPS architecture.

AngheloAlf commented 4 months ago

Oh, I didn't know MIPS was still used today; that may explain the unrecognized instructions. spimdisasm focuses mainly on older MIPS versions (up to MIPS IV), because it was born to reverse engineer old games like Nintendo 64 ones. If this program uses a newer MIPS version then spimdisasm won't catch many of the new instructions. There are no plans to add support for newer MIPS versions any time soon. If you need a full, proper disassembly you'll have to rely on a different tool, sorry.
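A quick way to gauge whether a binary leans on post-MIPS IV instructions is to look at the primary opcode field (the top 6 bits) of each word. The sketch below counts words whose opcode is SPECIAL2 (0x1C) or SPECIAL3 (0x1F), the groups that MIPS32 and MIPS32 Release 2 use for instructions such as `mul` and `ext`. It is a hint at best: data words can carry the same bits, and some console-specific extensions reuse these opcode slots. `primary_opcode` and `count_newer_opcodes` are hypothetical helpers, not spimdisasm functions.

```python
import struct
from collections import Counter

SPECIAL2 = 0x1C  # MIPS32 group (e.g. mul, madd, clz)
SPECIAL3 = 0x1F  # MIPS32 Release 2 group (e.g. ext, ins)


def primary_opcode(word):
    # The primary opcode is the top 6 bits of the 32-bit instruction word.
    return (word >> 26) & 0x3F


def count_newer_opcodes(data, endian="big"):
    fmt = ">I" if endian == "big" else "<I"
    usable = len(data) - (len(data) % 4)  # iter_unpack needs whole words
    counts = Counter()
    for (word,) in struct.iter_unpack(fmt, data[:usable]):
        opcode = primary_opcode(word)
        if opcode == SPECIAL2:
            counts["SPECIAL2"] += 1
        elif opcode == SPECIAL3:
            counts["SPECIAL3"] += 1
    return counts


if __name__ == "__main__":
    # 0x70851002 encodes `mul $v0, $a0, $a1`; 0x03E00008 encodes `jr $ra`.
    sample = bytes.fromhex("70851002" "03E00008")
    print(dict(count_newer_opcodes(sample)))
```

If a large share of the words flagged as invalid fall into these two groups, that would be consistent with the binary targeting a newer MIPS revision than the tool supports.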

mediotex commented 4 months ago

In order to decompile the asm code to pseudo-C, do I need a fully disassembled code without errors, or is it possible to skip invalid instructions without breaking the logic and the relationships between functions?

AngheloAlf commented 4 months ago

You could get a general idea with a partial disassembly, but you may miss some important bits if you are missing instructions. It kinda depends on what you are aiming for. Since the disassembly is almost complete you could try using some decompiler like m2c (https://github.com/matt-kempster/m2c), it gives pretty good results.

Either way, I'm closing this issue since the original problem seems to be solved.