icedland / iced

Blazing fast and correct x86/x64 disassembler, assembler, decoder, encoder for Rust, .NET, Java, Python, Lua
MIT License

Add ability to query direction of an operand. #583

Closed marti4d closed 3 weeks ago

marti4d commented 3 weeks ago

Hi there,

We are currently working on a feature for Mozilla's Crash Reporter and Minidump Processor. We are trying to detect "impossible crashes" that may be caused by malfunctioning hardware.

For example, if the CPU reports a crash due to an invalid write, but the crashing instruction doesn't write anything, that would most likely be due to the CPU malfunctioning.

The issue is that neither of the Rust disassemblers that we could use (yaxpeax-x86 or iced-x86) currently report the "direction" of the operands of an instruction.

For example, it would be great to know that in the instruction mov [rax], rbx the first operand is a "WRITE" operand.

Currently, to do this, we would have to match on every x86 opcode that accesses memory ourselves to determine the nature of each opcode's access (see this PR for a partial example). This is not something we generally want to do, so we are kind of stuck with perhaps switching to a non-Rust disassembler like Capstone (which, to be clear, we don't want to do because it has its own headaches).

Here are the docs for Capstone's take on this feature. You can see that RegAccessType tells you whether each operand is ReadOnly, WriteOnly, or ReadWrite (as in the case of an add [rax], rbx instruction).

If there is any way we can offer some help, please let us know... It might actually be easier to fix this in a Rust disassembler rather than trying to switch over to Capstone 😂.

Thanks!

weltkante commented 3 weeks ago

> it would be great to know that in the instruction mov [rax], rbx the first operand is a "WRITE" operand.

Am I mixing up argument order? I read that as an indirect write, i.e. the register is only a "READ" operand and never written to; the write happens through an indirection to general memory, right? I don't think the simplistic RegAccessType model is sufficient. You probably want a more complex model that understands indirection.

marti4d commented 3 weeks ago

Hmm... You know... I guess I had never really considered that the wording is actually rather ambiguous for that function! I can totally see how the name RegAccessType makes you think "Well... Isn't rax being read in that instruction?"

But I believe that Capstone returns the direction of the entire operand (which will be a capstone::arch::x86::X86OperandType::Mem), not the direction of the register accessed.

So, in other words, in mov [rax], rbx, the first operand is "the memory located at rax" and the operand itself is therefore being written. And that's what we would want to know for our minidump processor 🙂.
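To make the distinction in this exchange concrete, here is a minimal sketch that models operand direction separately from register direction. Every type and function name in it (`Access`, `Operand`, `operand_access`, `register_access`, `mov_mem_rax_rbx`, `add_mem_rax_rbx`) is hypothetical and defined only for illustration; none of them comes from Capstone, yaxpeax-x86, or iced-x86.

```rust
/// Direction of an access (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Access {
    Read,
    Write,
    ReadWrite,
}

/// A simplified operand: either a register or memory addressed by a base register.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Operand {
    Reg { name: &'static str, access: Access },
    Mem { base: &'static str, access: Access },
}

impl Operand {
    /// Direction of the operand as a whole. For `Mem`, this is the
    /// direction of the memory location, not of the base register.
    fn operand_access(&self) -> Access {
        match *self {
            Operand::Reg { access, .. } | Operand::Mem { access, .. } => access,
        }
    }

    /// How the register `reg` is touched by this operand, if at all.
    fn register_access(&self, reg: &str) -> Option<Access> {
        match *self {
            Operand::Reg { name, access } if name == reg => Some(access),
            // The base register of a memory operand is only read,
            // to compute the address, whatever happens to the memory.
            Operand::Mem { base, .. } if base == reg => Some(Access::Read),
            _ => None,
        }
    }
}

/// Operands of `mov [rax], rbx`: memory is written, rbx is read.
fn mov_mem_rax_rbx() -> [Operand; 2] {
    [
        Operand::Mem { base: "rax", access: Access::Write },
        Operand::Reg { name: "rbx", access: Access::Read },
    ]
}

/// Operands of `add [rax], rbx`: memory is read and then written.
fn add_mem_rax_rbx() -> [Operand; 2] {
    [
        Operand::Mem { base: "rax", access: Access::ReadWrite },
        Operand::Reg { name: "rbx", access: Access::Read },
    ]
}
```

In this model, `mov_mem_rax_rbx()[0].operand_access()` is `Access::Write` while `register_access("rax")` on the same operand is `Some(Access::Read)`, which is the two readings of "direction" discussed above.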

wtfsck commented 3 weeks ago

This example should be what you need:

https://github.com/icedland/iced/blob/master/src/rust/iced-x86/README.md#get-instruction-info-eg-readwritten-regsmem-control-flow-info-etc

E.g., the first instruction it dumps:

00007FFAC46ACDA4 mov [rsp+10h],rbx
    OpCode: o64 89 /r
    Instruction: MOV r/m64, r64
    Encoding: Legacy
    Mnemonic: Mov
    Code: Mov_rm64_r64
    CpuidFeature: X64
    FlowControl: Next
    Displacement offset = 4, size = 1
    Memory size: 8
    Op0Access: Write
    Op1Access: Read
    Op0: r64_or_mem
    Op1: r64_reg
    Used reg: RSP:Read
    Used reg: RBX:Read
    Used mem: [SS:RSP+0x10;UInt64;Write]

You can see that the instruction writes to op0, the first operand (Op0Access: Write).

Also the last lines show that it reads from rsp and rbx and writes to [ss:rsp+0x10], 8 bytes.
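As a rough illustration of how the "Used mem" information could feed the "impossible crash" check described at the top of this issue, here is a dependency-free sketch. The `MemAccess` enum is a local stand-in whose variant names are modeled on iced-x86's `OpAccess` values; `FaultKind` and `crash_is_plausible` are hypothetical names invented for this example. In real code the access list would presumably come from the instruction info shown in the README example rather than being written by hand.

```rust
/// Local stand-in for the memory access kinds a disassembler might report.
/// Variant names are modeled on iced-x86's `OpAccess`, but this enum is
/// defined here so that the sketch has no external dependencies.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MemAccess {
    Read,
    CondRead,
    Write,
    CondWrite,
    ReadWrite,
    ReadCondWrite,
}

impl MemAccess {
    /// True if this access can write to memory (always or conditionally).
    fn may_write(self) -> bool {
        matches!(
            self,
            MemAccess::Write
                | MemAccess::CondWrite
                | MemAccess::ReadWrite
                | MemAccess::ReadCondWrite
        )
    }

    /// True if this access can read from memory (always or conditionally).
    fn may_read(self) -> bool {
        matches!(
            self,
            MemAccess::Read
                | MemAccess::CondRead
                | MemAccess::ReadWrite
                | MemAccess::ReadCondWrite
        )
    }
}

/// Kind of invalid access reported for the crash (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FaultKind {
    Read,
    Write,
}

/// Returns true if at least one memory access of the crashing instruction
/// could explain the reported fault. A `false` result flags a candidate
/// "impossible crash". (Hypothetical helper, not part of any library.)
fn crash_is_plausible(fault: FaultKind, used_mem: &[MemAccess]) -> bool {
    used_mem.iter().any(|access| match fault {
        FaultKind::Write => access.may_write(),
        FaultKind::Read => access.may_read(),
    })
}
```

For the `mov [rsp+10h],rbx` dump above, the only used memory is a `Write`, so `crash_is_plausible(FaultKind::Write, &[MemAccess::Write])` is true, while a reported read fault on the same instruction would come back false. A real processor would likely need more than this, for example faults on the instruction fetch itself, which this sketch does not model.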

wtfsck commented 3 weeks ago

Answered.