Possible data race when disassembling in a multi-threaded environment

G33KatWork commented 4 years ago

Hi!

I've been playing around with Capstone to test a custom assembler I'm writing right now. During a test I take my assembled instruction and let Capstone disassemble it. I then compare the returned operands to determine if my assembler did the correct thing or not.

The rust tests are executed in a multi-threaded environment and sometimes, I get wrong results. When I execute the test suite with just a single thread, the tests all work fine. If the results are wrong, there is always one operand missing.

My test instruction right now is a xor eax, 0x80000000. The decoded result after some conversion is supposed to be (ignore the Rax, instead of Eax, it's correct!):

xor [Register(Rax), Immediate(-2147483648)]

Instead, when it fails, I always get this:

xor [Immediate(-2147483648)]

I created a small test application for you. It won't fail all the time, but when I just ran it like 3 or 4 times, it failed at least once

use capstone::prelude::*;
use std::thread;

#[derive(Clone, Copy, PartialEq, Eq, Debug, PartialOrd, Ord)]
pub enum Reg {
    Rax, Rcx, Rdx, Rbx, Rsp, Rbp, Rsi, Rdi,
    R8, R9, R10, R11, R12, R13, R14, R15,

    Rip,
}

impl From<&str> for Reg {
    fn from(regname: &str) -> Self {
        match regname.to_uppercase().as_str() {
            "AX"   | "EAX"  | "RAX" => Reg::Rax,
            "CX"   | "ECX"  | "RCX" => Reg::Rcx,
            "DX"   | "EDX"  | "RDX" => Reg::Rdx,
            "BX"   | "EBX"  | "RBX" => Reg::Rbx,
            "SP"   | "ESP"  | "RSP" => Reg::Rsp,
            "BP"   | "EBP"  | "RBP" => Reg::Rbp,
            "SI"   | "ESI"  | "RSI" => Reg::Rsi,
            "DI"   | "EDI"  | "RDI" => Reg::Rdi,
            "R8W"  | "R8D"  | "R8"  => Reg::R8 ,
            "R9W"  | "R9D"  | "R9"  => Reg::R9 ,
            "R10W" | "R10D" | "R10" => Reg::R10,
            "R11W" | "R11D" | "R11" => Reg::R11,
            "R12W" | "R12D" | "R12" => Reg::R12,
            "R13W" | "R13D" | "R13" => Reg::R13,
            "R14W" | "R14D" | "R14" => Reg::R14,
            "R15W" | "R15D" | "R15" => Reg::R15,
            "RIP" => Reg::Rip,
            _ => panic!("Unknown register name {}", regname)
        }
    }
}

#[derive(Clone, Copy, PartialEq, Eq, Debug, PartialOrd, Ord)]
pub enum Operand {
    Register(Reg),
    Memory(Option<Reg>, Option<(Reg, u32)>, i64),
    Immediate(i64),
}

fn disas(code: &[u8]) -> CsResult<(String, Vec<Operand>)> {
    let cs = Capstone::new()
        .x86()
        .mode(arch::x86::ArchMode::Mode64)
        .syntax(arch::x86::ArchSyntax::Intel)
        .detail(true)
        .build()?;

    let insns = cs.disasm_count(code, 0x0, 1)?;
    let disas = insns.iter().next().map(|i| {
        let detail: InsnDetail = cs.insn_detail(&i).unwrap();
        let arch_detail: ArchDetail = detail.arch_detail();
        let x86_arch_detail = arch_detail.x86().unwrap();

        let mut operands = vec![];
        let ops = x86_arch_detail.operands();

        for op in ops {
            //println!("{:?}", op);
            let o = match op.op_type {
                capstone::arch::x86::X86OperandType::Reg(r) => {
                    Operand::Register(Reg::from(cs.reg_name(r).unwrap().as_str()))
                },
                capstone::arch::x86::X86OperandType::Imm(i) => {
                    // Sign-extend the immediate properly
                    match op.size {
                        1 => Operand::Immediate(i as i8  as i64),
                        2 => Operand::Immediate(i as i16 as i64),
                        4 => Operand::Immediate(i as i32 as i64),
                        8 => Operand::Immediate(i        as i64),
                        _ => panic!("Can't handle weird operand size of immediate")
                    }
                },
                capstone::arch::x86::X86OperandType::Mem(m) => {
                    let base = cs.reg_name(m.base()).map(|x| Reg::from(x.as_str()));
                    let index = cs.reg_name(m.index()).map(|x| (Reg::from(x.as_str()), m.scale() as u32));
                    let disp = m.disp();

                    Operand::Memory(base, index, disp)
                },
                capstone::arch::x86::X86OperandType::Invalid => panic!("Got invalid operand from capstone")
            };
            operands.push(o);
        }

        (i.mnemonic().unwrap().into(), operands)
    }).unwrap();

    Ok(disas)
}

//xor eax, 0x80000000
const CODE: &'static [u8] = b"\x35\x00\x00\x00\x80";

fn test() {
    let expected_mnemonic = "xor";
    let expected_operands = &[Operand::Register(Reg::Rax), Operand::Immediate(-2147483648)];

    let (mnemonic, operands) = disas(CODE).unwrap();
    assert!(
        mnemonic == expected_mnemonic && operands == expected_operands,
        "Wrong result. Expected {} {:?} - Received {} {:?}",
        expected_mnemonic, expected_operands,
        mnemonic, operands
    );
}

static NTHREADS: i32 = 32;
fn main() {
    let mut threads = vec![];

    for _ in 0..NTHREADS {
        threads.push(thread::spawn(|| {
            test();
        }));
    }

    for t in threads {
        let _ = t.join();
    }
}

tmfink commented 4 years ago

@G33KatWork thanks for the issue! I'll take a look. The test case makes it a lot easier to debug.

G33KatWork commented 4 years ago

I just compiled the PoC with thread-sanitization. At first the bug seemed to have vanished, but I then just ran the executable in an endless loop over and over again which yielded this: https://gist.github.com/G33KatWork/7e14d98fab99148262164e52fcb10e31

insn_regs_intel_sorted indeed is a global variable in capstone itself which is only referenced by the function X86_insn_reg_intel. So allocating it on the stack inside of that function if size allows for it or on a thread-safe heap (dumb question, but are usual libc heap allocators even thread-safe?) might fix it.

G33KatWork commented 4 years ago

I just verified that this is indeed the bug. Executing test() once in the main function before starting all the threads triggers the path that does the racy initialization and now the assertions hold all the time. Yay!

tmfink commented 4 years ago

Per my comment on the upstream capstone issue, it appears this bug was fixed in the next branch. https://github.com/aquynh/capstone/issues/1647#issuecomment-647044700

We may fix this bug by upgrade capstone-rs's captive capstone version.

capstone-rust / capstone-rs

Possible data race when disassembling in a multi-threaded environment #87