capstone-engine / capstone

Capstone disassembly/disassembler framework for ARM, ARM64 (ARMv8), Alpha, BPF, Ethereum VM, HPPA, LoongArch, M68K, M680X, Mips, MOS65XX, PPC, RISC-V(rv32G/rv64G), SH, Sparc, SystemZ, TMS320C64X, TriCore, Webassembly, XCore and X86.
http://www.capstone-engine.org

Proceeding disassembling even if there occur invalid opCodes when using Java binding #1722

Open PAX523 opened 3 years ago

PAX523 commented 3 years ago

I'm using the Java binding to invoke the disassembler. I have configured it with the following settings:

        Capstone disassembler = new Capstone(cpuArchitecture, cpuMode);
        disassembler.setDetail(Capstone.CS_OPT_ON);

Capstone stops at the following invalid instruction:

  mov ax, 4
  db 0x0F          ; invalid instruction (4 bytes)
  db 0x3F
  db 0x07
  db 0x0B
  nop

  invoke ExitProcess, 0  

I'd like it to continue disassembling after invalid instructions. The documentation mentions a CS_OPT_SKIPDATA option, but I cannot find a way to set this option in the Java binding. See the available options: https://github.com/aquynh/capstone/blob/master/bindings/java/capstone/Capstone.java

I examined the C header to find the integer values behind those constants: https://github.com/aquynh/capstone/blob/master/include/capstone/capstone.h

Even if I call setDetail with 5, it still stops before the invalid bytes mentioned above.

The API documentation of the following method states:

disassembler.disasm(...);

Disassemble instructions from @code assumed to be located at @address, stop when encountering first broken instruction.

I don't want to stop at the first broken instruction.
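
To illustrate the behavior being asked for, here is a minimal Java sketch of a loop that keeps going past undecodable bytes. It does not call Capstone: `Decoder` and `decodeLength` are hypothetical stand-ins for whatever call into the binding returns the next instruction, and the loop simply emits a one-byte `.byte` pseudo instruction and advances by one byte whenever decoding fails. The one-byte skip size is an assumption made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: Decoder is a hypothetical stand-in for a call into the
// Capstone Java binding. It is NOT part of Capstone's API.
final class SkipInvalidSketch {

    /** Hypothetical decoder: length of the instruction at offset, or 0 if the bytes are invalid. */
    interface Decoder {
        int decodeLength(byte[] code, int offset);
    }

    /** Walks the whole buffer; undecodable bytes become one-byte ".byte" lines instead of ending the walk. */
    static List<String> disassembleAll(byte[] code, Decoder decoder) {
        List<String> out = new ArrayList<>();
        int offset = 0;
        while (offset < code.length) {
            int len = decoder.decodeLength(code, offset);
            if (len > 0 && len <= code.length - offset) {
                // Valid instruction: record it and jump to the next one.
                out.add(String.format("insn@%d len=%d", offset, len));
                offset += len;
            } else {
                // Invalid or truncated: emit the raw byte and resume one byte later.
                out.add(String.format(".byte 0x%02x", code[offset] & 0xFF));
                offset += 1;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Toy decoder for demonstration only: treats 0x90 (nop) as the sole valid instruction.
        byte[] code = {(byte) 0x90, 0x0F, 0x3F, 0x07, 0x0B, (byte) 0x90};
        Decoder toy = (buf, off) -> buf[off] == (byte) 0x90 ? 1 : 0;
        disassembleAll(code, toy).forEach(System.out::println);
    }
}
```

With a real decoder, resuming one byte at a time can cause bytes inside the invalid sequence to decode as unrelated instructions, so the output after a skipped byte should be treated with some care.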

keenk commented 3 years ago

Confirming this is an issue. I tried adding the missing OPT flags to the Java bindings, but it requires more than that. The Java bindings don't directly wrap... I think it was setOption. Instead they have several other methods that in turn call setOption. I tried to add one for skipdata, but didn't get it to work before I ran out of time.

EDIT: Out of curiosity, why don't you want to stop on bad disassembly?

PAX523 commented 3 years ago

Hmm... Good question... It's true that it usually makes sense to stop whenever an instruction is broken.

But one valid use case comes to mind: take a look at the 4 bytes mentioned in my first comment. They are intentionally invalid, in order to probe execution behavior that differs inside a VirtualPC VM. See:

https://paper.bobylive.com/Meeting_Papers/BlackHat/USA-2012/BH_US_12_Branco_Scientific_Academic_WP.pdf p. 23, "5.3 VirtualPC - Invalid Instruction"

When an invalid instruction is executed, an exception is raised and it can be handled by the software using try/catch mechanism [31]. VirtualPC [44] relies on invalid instructions to interface between virtual machines and VirtualPC software itself. An example is the invalid instruction “0x0F 0x3F 0x07 0x0B”, which does not generate an exception inside a VirtualPC virtual machine. This can be used to detect if an application is running inside a VirtualPC virtual machine.

Personally, I worked around Capstone not returning an instruction in that case by analyzing the next few raw bytes after the last valid instruction it returned.
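
A check of that kind might look like the following Java sketch. The class and method names are hypothetical, and the offset is assumed to be the point where the last valid instruction ended (its address plus its size, relative to the start of the buffer). Only the four marker bytes come from the thread.

```java
import java.util.Arrays;

// Sketch only: names are illustrative and not part of Capstone.
final class VirtualPcMarkerSketch {

    // Byte sequence quoted in this thread: 0x0F 0x3F 0x07 0x0B.
    static final byte[] MARKER = {0x0F, 0x3F, 0x07, 0x0B};

    /** Returns true if the marker bytes occur in code starting exactly at offset. */
    static boolean startsWithMarker(byte[] code, int offset) {
        // Reject offsets that leave fewer than MARKER.length bytes.
        if (offset < 0 || offset > code.length - MARKER.length) {
            return false;
        }
        byte[] window = Arrays.copyOfRange(code, offset, offset + MARKER.length);
        return Arrays.equals(window, MARKER);
    }

    public static void main(String[] args) {
        // 66 B8 04 00 is "mov ax, 4" encoded for 32-bit mode, followed by the marker and a nop.
        byte[] code = {0x66, (byte) 0xB8, 0x04, 0x00, 0x0F, 0x3F, 0x07, 0x0B, (byte) 0x90};
        int endOfLastValid = 4; // assumed: where the disassembler stopped
        System.out.println(startsWithMarker(code, endOfLastValid));
    }
}
```

If the check succeeds, the caller can record the four bytes as data and restart disassembly at `offset + 4`.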

It would be nice if Capstone at least emitted a pseudo instruction (db, dw, dd, etc.) for the bytes it detected as broken, and only then stopped. But, as I said, my recursive-traversal integration of Capstone now handles that issue in its own way.

Cheers PAX