angr / simuvex

[DEPRECATED] A symbolic execution engine for the VEX IR
BSD 2-Clause "Simplified" License

Error when trying to add a custom plugin to simuvex #14

Open axt opened 8 years ago

axt commented 8 years ago

I'm trying to add a simuvex plugin the following way:

#! /usr/bin/env python

import angr
import simuvex

from simuvex  import SimStatePlugin

class SimStateTest(SimStatePlugin):
    def __init__(self):
        SimStatePlugin.__init__(self)

    def copy(self):
        regs = self.state.regs
        print regs.ip
        #print regs.eax
        return SimStateTest()

    def merge(self, others, flag, flag_values): 
        return False, [ ]

    def widen(self, others, flag, flag_values):
        return False

    def clear(self):
        pass

if __name__ == "__main__":
    proj = angr.Project("./a", load_options={'auto_load_libs':False}) #, 'main_opts': {'custom_base_addr': 0x0}})
    main = proj.loader.main_bin.get_symbol("main")
    start_state = proj.factory.blank_state(addr=main.addr, plugins={'test':SimStateTest()})
    start_state.stack_push(0x0)
    cfg = proj.analyses.CFG(fail_fast=True, starts=[main.addr], initial_state=start_state, context_sensitivity_level=3, keep_state=True, call_depth=5)

It works fine. If I remove the # in front of print regs.eax in the copy function, I get the following error:

Traceback (most recent call last):
  File "test.py", line 39, in <module>
    cfg = proj.analyses.CFG(fail_fast=True, starts=[main.addr], initial_state=start_state, context_sensitivity_level=3, keep_state=True, call_depth=5)
  File "/usr/local/lib/python2.7/dist-packages/angr/analysis.py", line 87, in make_analysis
    oself.__init__(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/angr/analyses/cfg_accurate.py", line 135, in __init__
    self._analyze()
  File "/usr/local/lib/python2.7/dist-packages/angr/analyses/forward_analysis.py", line 121, in _analyze
    self._handle_entry(entry)
  File "/usr/local/lib/python2.7/dist-packages/angr/analyses/forward_analysis.py", line 144, in _handle_entry
    self._pre_entry_handling(entry, _locals)
  File "/usr/local/lib/python2.7/dist-packages/angr/analyses/cfg_accurate.py", line 933, in _pre_entry_handling
    simrun, error_occurred, _ = self._get_simrun(addr, path, current_function_addr=func_addr)
  File "/usr/local/lib/python2.7/dist-packages/angr/analyses/cfg_accurate.py", line 2252, in _get_simrun
    sim_run = self.project.factory.sim_run(current_entry.state, jumpkind=jumpkind)
  File "/usr/local/lib/python2.7/dist-packages/angr/factory.py", line 131, in sim_run
    r = self.sim_block(state, addr=addr, **block_opts)
  File "/usr/local/lib/python2.7/dist-packages/angr/factory.py", line 77, in sim_block
    last_stmt=last_stmt)
  File "/usr/local/lib/python2.7/dist-packages/simuvex-4.6.3.15-py2.7.egg/simuvex/vex/irsb.py", line 33, in __init__
    SimRun.__init__(self, state, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/simuvex-4.6.3.15-py2.7.egg/simuvex/s_run.py", line 18, in __init__
    self.state = self.initial_state.copy()
  File "/usr/local/lib/python2.7/dist-packages/simuvex-4.6.3.15-py2.7.egg/simuvex/s_state.py", line 329, in copy
    c_plugins = self._copy_plugins()
  File "/usr/local/lib/python2.7/dist-packages/simuvex-4.6.3.15-py2.7.egg/simuvex/s_state.py", line 318, in _copy_plugins
    return { n: p.copy() for n,p in self.plugins.iteritems() }
  File "/usr/local/lib/python2.7/dist-packages/simuvex-4.6.3.15-py2.7.egg/simuvex/s_state.py", line 318, in <dictcomp>
    return { n: p.copy() for n,p in self.plugins.iteritems() }
RuntimeError: dictionary changed size during iteration

zardus commented 8 years ago

Interesting...

It looks like a brand new plugin is being created while you're copying, either regs (though then printing eax should have no effect; just grabbing the regs plugin would cause the bug) or registers (when you're accessing regs.eax, though it's surprising that it doesn't already exist). Could you print out state.plugins.keys() at every step of the process (before copying, at the beginning of SimStateTest.copy(), and at the end of SimStateTest.copy())? I think that'll let us track it down pretty quickly.

axt commented 8 years ago

BEFORE ['solver_engine', 'scratch', 'regs', 'registers', 'procedure_data', 'memory', 'test', 'posix']
AFTER ['log', 'solver_engine', 'scratch', 'regs', 'registers', 'procedure_data', 'memory', 'test', 'posix']

axt commented 8 years ago

With just print regs.ip in the copy function, there is no error. The 'log' plugin is missing in the first step and only shows up from the second step on:

BEFORE ['solver_engine', 'scratch', 'regs', 'registers', 'procedure_data', 'memory', 'test', 'posix']
<BV32 0x804841d>
AFTER ['solver_engine', 'scratch', 'regs', 'registers', 'procedure_data', 'memory', 'test', 'posix']
BEFORE ['log', 'solver_engine', 'scratch', 'regs', 'registers', 'procedure_data', 'memory', 'test', 'posix']
<BV32 0x804842b>
AFTER ['log', 'solver_engine', 'scratch', 'regs', 'registers', 'procedure_data', 'memory', 'test', 'posix']
BEFORE ['log', 'solver_engine', 'scratch', 'regs', 'registers', 'procedure_data', 'memory', 'test', 'posix']
<BV32 0x804842d>
AFTER ['log', 'solver_engine', 'scratch', 'regs', 'registers', 'procedure_data', 'memory', 'test', 'posix']
[...]

zardus commented 8 years ago

Ah, interesting... Your access of eax was logging a SimAction, which, in turn, created the log plugin. ip is explicitly filtered out from action logging, IIRC, so in the latter case, the log was probably created in the course of processing the actual basic block, rather than during copying.
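To make the failure mode concrete, here is a minimal sketch that reproduces the same RuntimeError without angr or simuvex. ToyState, ToyPlugin, get_plugin and copy_plugins are made-up stand-ins, not the real simuvex classes. It is written for Python 3, where dict.items() is a live view and so behaves like Python 2's iteritems() for this purpose.

```python
class ToyState(object):
    """Hypothetical stand-in for a state that creates plugins on first access."""

    def __init__(self):
        self.plugins = {}

    def get_plugin(self, name):
        # Lazily create the plugin; this grows self.plugins.
        if name not in self.plugins:
            self.plugins[name] = ToyPlugin(self)
        return self.plugins[name]

    def copy_plugins(self):
        # Iterates over the live dict while each plugin's copy() runs.
        return {n: p.copy() for n, p in self.plugins.items()}


class ToyPlugin(object):
    """Hypothetical plugin whose copy() may touch another plugin."""

    def __init__(self, state, touches=None):
        self.state = state
        self.touches = touches

    def copy(self):
        if self.touches is not None:
            # Side effect during copying: may create a new plugin.
            self.state.get_plugin(self.touches)
        return ToyPlugin(self.state, self.touches)


def reproduce():
    """Return the error message raised by copying, or None if no error."""
    state = ToyState()
    state.plugins['test'] = ToyPlugin(state, touches='log')
    try:
        state.copy_plugins()
    except RuntimeError as exc:
        return str(exc)
    return None


print(reproduce())
```

The 'test' plugin plays the role of SimStateTest, and the lazily created 'log' entry plays the role of the log plugin that the register access brings into existence.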

We have a few potential ways forward:

  1. Use items instead of iteritems in that dict comprehension in SimState.copy. That's probably the easiest fix, though it feels a tad hackish.
  2. Create the log plugin when we create the initial state in angr.SimOS, but that's also a bandaid because this issue can occur with plugins other than log, of course.
  3. Turn off action logging when we're not actively processing basic blocks or SimProcedures. This has the benefit of addressing the problem where SimInspect breakpoints create unwanted SimActions (though, sometimes, they are wanted), but that's a whole other thing. And it also still doesn't fix the issue for plugins other than log.

I guess the first option seems best?
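As a rough illustration of option 1, the sketch below copies from a snapshot of the plugin dict, so plugins created during copying cannot disturb the iteration. The names copy_plugins_snapshot, LazyToucher and Plain are hypothetical and not part of simuvex. In Python 2, items() already returns a list; in Python 3 (used here) the equivalent snapshot is list(d.items()).

```python
def copy_plugins_snapshot(plugins):
    # list(...) freezes the (name, plugin) pairs before any copy() runs,
    # so additions to `plugins` during copying do not affect the loop.
    return {name: plugin.copy() for name, plugin in list(plugins.items())}


class Plain(object):
    """Hypothetical plugin with a side-effect-free copy()."""

    def copy(self):
        return Plain()


class LazyToucher(object):
    """Hypothetical plugin whose copy() creates a 'log' entry as a side effect."""

    def __init__(self, registry):
        self.registry = registry

    def copy(self):
        self.registry.setdefault('log', Plain())
        return LazyToucher(self.registry)


plugins = {}
plugins['test'] = LazyToucher(plugins)

copied = copy_plugins_snapshot(plugins)

print(sorted(plugins))  # the source dict gained the lazily created entry
print(sorted(copied))   # the copy only holds what existed in the snapshot
```

In this toy version the copy finishes without an exception, but the entry created during copying ends up only in the source dict, not in the copy.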

axt commented 8 years ago

With option 1, since we will only copy the plugins which existed before the call to the copy function, won't we lose the state of the plugin that was created on demand in the target state?

Regarding the other question: as a long-term solution, I think it would be great if you could choose whether your accesses to the registers, etc. should be reflected in SimActions or not. I don't know what the right way to do it is, and it wouldn't be nice to change the API because of this. Maybe something like:

with SimActions.transparent():
   print self.state.regs.eax

axt commented 8 years ago

I've tried to understand the plugin code more deeply since then. I see now that accessing the registers in the copy function is probably not a valid use case. I did it only because I wanted to see how the state propagates. So maybe no fix is needed for this at all.

zardus commented 8 years ago

We'll lose the state, but since we copy the plugins sequentially anyways, we'd lose the state of the earlier ones if the later ones modify them afterwards.

You're right in that complex operations aren't intended to be done during state copying, but I'm not sure if we want to have it explicitly forbidden. Might be good to do a best-effort support of it, at least using items instead of iteritems (until Python 3 support rolls around).

In terms of the context manager for turning off actions, I think that'd be really cool, state copying aside! We already have state.with_condition(), and adding some sort of state.with_options() or state.without_options() would be really cool as well. We should keep that in mind when there's a free minute to implement it!
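For what it's worth, a context manager along those lines could look roughly like the sketch below. This is not an existing simuvex API: ToyState, without_options and the option names are all made up for illustration, and a real version would have to work with the actual state options.

```python
from contextlib import contextmanager


class ToyState(object):
    """Hypothetical state that only carries a set of option names."""

    def __init__(self, options):
        self.options = set(options)

    @contextmanager
    def without_options(self, *names):
        # Remember only the options that were actually set, so that
        # leaving the block does not enable anything new.
        removed = self.options.intersection(names)
        self.options.difference_update(removed)
        try:
            yield self
        finally:
            # Restore the removed options even if the body raised.
            self.options.update(removed)


# Made-up option names, for illustration only.
state = ToyState({'LOG_REGISTER_ACTIONS', 'LOG_MEMORY_ACTIONS'})

with state.without_options('LOG_REGISTER_ACTIONS'):
    inside = set(state.options)   # register action logging is off here

after = set(state.options)        # and back on after the block

print(sorted(inside))
print(sorted(after))
```

Restoring in a finally block keeps the options consistent if the code inside the with block raises.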