capstone-rust / capstone-rs

high-level Capstone system bindings for Rust

Support `cs_disasm_iter` #1

Open gereeter opened 8 years ago

gereeter commented 8 years ago

Introduced in Capstone 3.0.

richo commented 8 years ago

hey, thanks for the ping. I'll see if I can get cs_disasm_iter exposed properly as an Iterator when I get a chance.

richo commented 8 years ago

Hey @gereeter,

I spent some time noodling on this, but I'm not sure how feasible it is without depending really directly on the implementation of slices, which is probably fine but more fiddly than I was hoping to make this.

My branch is at https://github.com/richo/capstone-rs/compare/cs_iter if you want to have a look

gereeter commented 8 years ago

The Rust side of the API looks good to me, though it technically isn't quite as powerful as the original C API. As for avoiding implementation details, disasm_iter could just call cs_malloc and store the allocation inside the CsIterator. That allocation would be passed to every call to cs_disasm_iter, and the destructor for CsIterator would free it.
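
The ownership idea above (allocate once, reuse on every call, free in the destructor) can be sketched without linking Capstone at all. In the sketch below, `RawInsn`, `mock_malloc`, `mock_free`, `InsnSlot`, and `live_allocations` are hypothetical stand-ins invented for illustration; a real wrapper would presumably call `cs_malloc` and `cs_free` instead.

```rust
use std::cell::Cell;
use std::ptr::NonNull;

/// Hypothetical stand-in for `cs_insn`; the real struct comes from Capstone.
#[derive(Default)]
pub struct RawInsn {
    pub address: u64,
    pub size: u16,
}

thread_local! {
    /// Counts live allocations so that cleanup can be observed.
    static LIVE: Cell<usize> = Cell::new(0);
}

/// Stand-in for `cs_malloc`: hands out one heap-allocated instruction slot.
fn mock_malloc() -> *mut RawInsn {
    LIVE.with(|n| n.set(n.get() + 1));
    Box::into_raw(Box::new(RawInsn::default()))
}

/// Stand-in for `cs_free`: releases a slot obtained from `mock_malloc`.
unsafe fn mock_free(ptr: *mut RawInsn) {
    // SAFETY: the caller guarantees `ptr` came from `mock_malloc`.
    unsafe { drop(Box::from_raw(ptr)) };
    LIVE.with(|n| n.set(n.get() - 1));
}

/// Number of slots currently allocated on this thread.
pub fn live_allocations() -> usize {
    LIVE.with(|n| n.get())
}

/// Owns exactly one instruction slot and frees it on drop.
pub struct InsnSlot {
    ptr: NonNull<RawInsn>,
}

impl InsnSlot {
    /// Returns `None` if the allocator hands back a null pointer.
    pub fn new() -> Option<Self> {
        NonNull::new(mock_malloc()).map(|ptr| InsnSlot { ptr })
    }

    /// Raw pointer to pass to the (mocked) disassembly call on each step.
    pub fn as_mut_ptr(&mut self) -> *mut RawInsn {
        self.ptr.as_ptr()
    }
}

impl Drop for InsnSlot {
    fn drop(&mut self) {
        // SAFETY: `ptr` came from `mock_malloc` and is freed exactly once.
        unsafe { mock_free(self.ptr.as_ptr()) }
    }
}
```

Holding the pointer as `NonNull` means the null check happens once at construction, so `drop` never has to test for it.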

gereeter commented 8 years ago

Actually, I just remembered that copying the instruction is probably not valid, because of the detail pointer - copying the instruction would not copy its detail, so any further iterations would then overwrite the detail, since all instructions produced would be pointing at the same cs_detail struct.
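
To make the aliasing concern concrete, here is a minimal sketch using hypothetical stand-ins (`Detail`, `RawInsn`, `decode_into`, `shallow_copy_demo`) rather than the real Capstone types. It assumes, as described above, that a bitwise copy of the instruction duplicates the detail pointer but not the struct it points to.

```rust
/// Hypothetical stand-in for `cs_detail`.
pub struct Detail {
    pub regs_read_count: u8,
}

/// Hypothetical stand-in for `cs_insn`: plain fields plus a detail pointer.
#[derive(Clone, Copy)]
pub struct RawInsn {
    pub address: u64,
    pub detail: *mut Detail,
}

/// Simulates one decode step that reuses the same slot and the same detail.
pub fn decode_into(slot: &mut RawInsn, address: u64, regs: u8) {
    slot.address = address;
    // SAFETY: in this sketch `detail` always points at a live `Detail`.
    unsafe {
        (*slot.detail).regs_read_count = regs;
    }
}

/// Copies the first instruction, decodes a second one, then reads the copy.
pub fn shallow_copy_demo() -> (u64, u8) {
    let mut detail = Detail { regs_read_count: 0 };
    let mut slot = RawInsn { address: 0, detail: &mut detail };

    decode_into(&mut slot, 0x1000, 1);
    let first = slot; // bitwise copy: the address is copied, the detail is shared

    decode_into(&mut slot, 0x1004, 3);

    // SAFETY: `detail` is still alive; the copy points at the same struct.
    let regs = unsafe { (*first.detail).regs_read_count };
    (first.address, regs)
}
```

In this sketch the copy keeps its own address (`0x1000`) but reports the second instruction's register count (`3`), which is the kind of silent overwrite described above.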

I think this means that the CsIterator needs to be some form of streaming iterator that only allows access to the current instruction and doesn't allow looking at multiple instructions at once - this also more closely matches the use case described in the Capstone docs. For example:

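pub struct InsnIter<'a> {
    handle: &'a Capstone,
    ptr: Unique<Insn>
}

impl<'a> Drop for InsnIter<'a> {
    fn drop(&mut self) {
        // cs_free is an FFI call, so it has to be wrapped in `unsafe`.
        unsafe { cs_free(*self.ptr, 1); }
    }
}

impl<'a> InsnIter<'a> {
    pub fn new(handle: &'a Capstone) -> InsnIter<'a> {
        InsnIter {
            handle: handle,
            // cs_malloc is likewise an FFI call.
            ptr: unsafe { Unique::new(cs_malloc(handle.csh)) }
        }
    }

    // Signature only: the body would pass self.ptr to cs_disasm_iter.
    pub fn next(&mut self, code: &[u8], address: u64) -> CsResult<&Insn>;
}
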
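
The streaming shape described above can be tried out without Capstone. The following is a minimal sketch under stated assumptions: `Insn`, `InsnStream`, and `collect_addresses` are hypothetical names, and the "decoder" just treats every 4 bytes as one instruction. The point is only that `next` returns a reference tied to `&mut self`, so the borrow checker rejects holding two instructions at once.

```rust
/// Hypothetical stand-in for a decoded instruction.
#[derive(Debug, Default)]
pub struct Insn {
    pub address: u64,
    pub bytes: [u8; 4],
}

/// Streaming iterator over fixed-width 4-byte "instructions".
pub struct InsnStream<'c> {
    code: &'c [u8],
    address: u64,
    current: Insn, // single slot, overwritten on every step
}

impl<'c> InsnStream<'c> {
    pub fn new(code: &'c [u8], address: u64) -> Self {
        InsnStream { code, address, current: Insn::default() }
    }

    /// Decodes the next instruction into the reused slot.
    /// The returned reference borrows `self`, so it must be released
    /// before `next` can be called again.
    pub fn next(&mut self) -> Option<&Insn> {
        let code = self.code;
        if code.len() < 4 {
            return None;
        }
        let (head, rest) = code.split_at(4);
        self.current.address = self.address;
        self.current.bytes.copy_from_slice(head);
        // Advance the cursor in place, as cs_disasm_iter does with its
        // code, size, and address arguments.
        self.code = rest;
        self.address += 4;
        Some(&self.current)
    }
}

/// Usage: copy out whatever is needed before advancing to the next step.
pub fn collect_addresses(code: &[u8], base: u64) -> Vec<u64> {
    let mut stream = InsnStream::new(code, base);
    let mut out = Vec::new();
    while let Some(insn) = stream.next() {
        out.push(insn.address);
    }
    out
}
```

Because `next` takes no trait-mandated signature here, this type cannot implement `std::iter::Iterator`, whose `Item` may not borrow from the iterator itself; that is the trade-off of the streaming design.
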
richo commented 8 years ago

I don't think I need to wrap cs_malloc at all; according to the docs, it's called under the hood by cs_disasm_iter. What's more plausible is that I'd want to shove jemalloc in iff it's a Rust library that contains the entry point.

gereeter commented 8 years ago

Where are you seeing that documentation? I might be just reading the wrong thing or missing something, but this says that "rather than letting the core allocate memory, user pre-allocates the memory required, then pass it to the core," implying that cs_disasm_iter deliberately does not call cs_malloc. In the example on that page, an instruction is created with cs_malloc and passed to cs_disasm_iter, not any null pointer:

// allocate memory cache for 1 instruction, to be used by cs_disasm_iter later.
cs_insn *insn = cs_malloc(handle);

//...

// disassemble one instruction a time & store the result into @insn variable above
while(cs_disasm_iter(handle, &code, &code_size, &address, insn)) {
    //...
}

richo commented 8 years ago

Aha! You're right, I was just misreading. That makes things a little clearer, I can definitely reimplement the current iterator in these terms. Thanks for the nudge!

Dethada commented 5 years ago

any progress on this?

tmfink commented 5 years ago

any progress on this?

HanabishiRecca commented 8 months ago

A somewhat working wrapper:

struct CapIter<'a, 'b> {
    cs: &'a Capstone,
    ptr: *const u8,
    size: usize,
    addr: u64,
    insn: *mut cs_insn,
    _data: PhantomData<&'b [u8]>,
}

impl<'a, 'b> CapIter<'a, 'b> {
    fn new(cs: &'a Capstone, data: &'b [u8], addr: u64) -> Self {
        CapIter {
            cs,
            ptr: data.as_ptr(),
            size: data.len(),
            addr,
            insn: unsafe { cs_malloc(cs.csh()) },
            _data: PhantomData,
        }
    }
}

impl<'a, 'b> Iterator for CapIter<'a, 'b> {
    type Item = Insn<'a>;

    fn next(&mut self) -> Option<Self::Item> {
        unsafe {
            if cs_disasm_iter(
                self.cs.csh(),
                &mut self.ptr,
                &mut self.size,
                &mut self.addr,
                self.insn,
            ) {
                return Some(Insn::from_raw(self.insn));
            }
        }

        None
    }
}

impl<'a, 'b> Drop for CapIter<'a, 'b> {
    fn drop(&mut self) {
        if !self.insn.is_null() {
            unsafe { cs_free(self.insn, 1) };
        }
    }
}