das-labor / panopticon

A libre cross-platform disassembler.
https://panopticon.re
GNU General Public License v3.0
1.43k stars 78 forks source link

refactor + update function API #310

Closed m4b closed 7 years ago

m4b commented 7 years ago

This PR:

  1. makes a friendlier and easier to use Function API
  2. enforces certain invariants on Function (e.g., always has a basic block entry point)
  3. Cleans up some code to use new API where appropriate

Function API

pub struct Function {
    /// Display name of the function.
    pub name: String,
    /// Unique, immutable identifier for this function.
    uuid: Uuid,
    /// Graph of basic blocks and jumps
    cflow_graph: ControlFlowGraph,
    /// The function's entry point
    entry_point: ControlFlowRef,
    /// Name of the memory region the function is part of
    region: String,
    /// The size of this function, in bytes (only counts the number of instructions, not padding bytes, or gaps for non-contiguous functions)
    size: usize,
}

    pub fn undefined(start: u64, uuid: Option<Uuid>, region: &Region, name: Option<String>) -> Function {}

    /// Continue disassembling from `start`, at `region`, with CPU `configuration`, using the functions current, internal control flow graph.
    pub fn cont<A: Architecture>(&mut self, start: u64, region: &Region, configuration: A::Configuration) -> Result<()> {}

    /// Create and start disassembling a new function with `name`, inside memory `region`, starting at entry point `start`, with a random UUID.
    pub fn new<A: Architecture>(start: u64, region: &Region, name: Option<String>, init: A::Configuration) -> Result<Function> {}

    /// Returns the start address of the first basic block in this function
    pub fn start(&self) -> u64 {}

    /// New function starting at `start`, with name `name`, inside memory region `region` and UUID `uuid`.
    pub fn with_uuid<A: Architecture>(start: u64, uuid: &Uuid, region: &Region, name: Option<String>, init: A::Configuration) -> Result<Function> {}

    /// Returns the UUID of this function
    pub fn uuid(&self) -> &Uuid {}

    /// The size of this function, in bytes (only counts the number of instructions, not padding bytes, or gaps for non-contiguous functions)
    pub fn size(&self) -> usize {}

    /// Returns a reference to this functions control flow graph
    pub fn cfg(&self) -> &ControlFlowGraph {}

    /// Returns a mutable reference to this functions control flow graph; **WARNING** this can cause instability if the entry point is not correctly updated
    pub fn cfg_mut(&mut self) -> &mut ControlFlowGraph {}

    /// Returns a reference to the entry point vertex in the cfg
    pub fn entry_point_ref(&self) {}

    /// Sets the functions entry point vertex in the cfg to `vx` (this is primarily for use in tests).
    ///
    /// **WARNING** Make sure the vertex descriptor actually is the entry point _and_ points to a _resolved_ basic block, otherwise subsequent operations on this function will be undefined.
    pub fn set_entry_point_ref(&mut self, vx: ControlFlowRef) {}

    /// Returns a reference to the BasicBlock entry point of this function.
    pub fn entry_point(&self) -> &BasicBlock {}

    /// Returns a mutable reference to the BasicBlock entry point of this function.
    pub fn entry_point_mut(&mut self) -> &mut BasicBlock {}

Examples

Iteration is much easier, whether it's the call graph of the program, the cfg of the function, the basic blocks of the function, or the statements of the basic block, etc.

For example, the difference is striking in some places:

     pub fn find_function_by_uuid_mut<'a>(&'a mut self, a: &Uuid) -> Option<&'a mut Function> {
        let ct = self.call_graph
            .vertices()
            .find(
                |&x| match self.call_graph.vertex_label(x) {
                    Some(&CallTarget::Concrete(ref s)) => s.uuid() == a,
                    _ => false,
                }
            );

        if ct.is_none() {
            return None;
        }

        match self.call_graph.vertex_label_mut(ct.unwrap()) {
            Some(&mut CallTarget::Concrete(ref mut s)) => Some(s),
            _ => None,
        }
     }

after:

     pub fn find_function_by_uuid_mut<'a>(&'a mut self, a: &Uuid) -> Option<&'a mut Function> {
        for ct in self.call_graph.vertex_labels_mut() {
            match ct {
                &mut CallTarget::Concrete(ref mut s) => if s.uuid() == a { return Some(s) },
                _ => (),
            }
        }
        None
     }

Iterating a basic block and its statements is natural and easy now, e.g.:

for bb in func.basic_blocks() {
  for statement in bb.statements() {
    if let &Statement { op: Operation::Call(ref rv), .. } = statement { // do whatever

Similarly, entry_point is now a basic block, and is guaranteed to exist (unless you construct a function using Function::undefined - don't do that!)

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.3%) to 59.799% when pulling f03c031597b17e124784673b72e6470bc8b4e5c0 on m4b/new_function_api into dd48efccf3a964af34edfb06118758ee51909035 on master.

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.2%) to 59.682% when pulling 4869f89eb1570a77338f1e57f85c483fe8bd7788 on m4b/new_function_api into dd48efccf3a964af34edfb06118758ee51909035 on master.

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.2%) to 59.715% when pulling 4e3b9e76bb0874244f3882f4a0d2f63160438670 on m4b/new_function_api into dd48efccf3a964af34edfb06118758ee51909035 on master.

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.4%) to 59.838% when pulling 36347905c59d3d61939f065064204e7d6b47b4e0 on m4b/new_function_api into dd48efccf3a964af34edfb06118758ee51909035 on master.

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.4%) to 59.869% when pulling 96a5903f64647262d5a35d8563818b3f167bedfb on m4b/new_function_api into dd48efccf3a964af34edfb06118758ee51909035 on master.

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.3%) to 59.75% when pulling aa3e7d92515306ef6803b8afed23cf9f8f9a25b5 on m4b/new_function_api into dd48efccf3a964af34edfb06118758ee51909035 on master.

m4b commented 7 years ago

I hate coveralls spamming this and other PRs; is there anything we can do about it?

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.2%) to 59.727% when pulling 8072dc0c30a75438ce4018abcb731beae82274ba on m4b/new_function_api into dd48efccf3a964af34edfb06118758ee51909035 on master.

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.6%) to 60.129% when pulling c013bfd4ea9cb983db183229f0a1bd99429e4955 on m4b/new_function_api into dd48efccf3a964af34edfb06118758ee51909035 on master.

m4b commented 7 years ago

This will fail; that test entry_split in function is failing again. Dunno how to fix it. Since entry point is basically a requirement for functions now, it might be inapplicable.

m4b commented 7 years ago

@flanfly I think I should stop around here. There are some things I'd like to continue with but I think this PR is getting too big and it's probably a good stopping point anyway.

I will remove the into iterator impl from VertexGraph in favor of a Labels associated trait + method using the hashtable iterator to match the mutable versions, and call that directly instead of into_iter where appropriate.

I'll then write up the gist of the new function API and also provide some examples (probably should add this to docs as well)

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.6%) to 60.045% when pulling e54153c932e05cf032f1acf710faeb35e4d6adf8 on m4b/new_function_api into dd48efccf3a964af34edfb06118758ee51909035 on master.

m4b commented 7 years ago

Ok @flanfly, I fixed the last test, and added a cont function method to the api for more flexibility (I think it's actually good it's there, hard to see how they will be used/constructed in the future).

I would like to see this reviewed+merged soon if possible/when you have time.

I think it's pretty solid at this point, modulo any concerns you have of course.

coveralls commented 7 years ago

Coverage Status

Coverage increased (+0.6%) to 60.048% when pulling f8e2ad06fe180f1749d2159c310e8b1ddd9c7e8a on m4b/new_function_api into dd48efccf3a964af34edfb06118758ee51909035 on master.

m4b commented 7 years ago

@flanfly It looks like OSX build stalled on QT, not sure why.

So we ready to merge this?

flanfly commented 7 years ago

Thanks for the hard work and your patience!

m4b commented 7 years ago

NP!