angr / angr

A powerful and user-friendly binary analysis platform!
http://angr.io
BSD 2-Clause "Simplified" License
7.46k stars 1.07k forks source link

Concolic Execution with Angr #2716

Closed shouc closed 2 years ago

shouc commented 3 years ago

Following https://lists.cs.ucsb.edu/pipermail/angr/2018-September/000496.html where this code is used for concolic execution:

class ConcolicProgram:

    def __init__(self, exe):
        self.arg = symbolic BVs
        addcons(self.arg) # add constraints to let BVs = concrete value
        self.initial_state = self.project.factory.entry_state(
            args=[self.exe, self.arg],
            add_options=angr.options.unicorn,
        )
        self.exe = exe
        self.project = angr.Project(exe, load_options={'auto_load_libs': False})

    def run(self):
        state = self.initial_state
        while True:
            succ = state.step()
            print(hex(succ.addr), state.solver.constraints)
            if len(succ.successors) > 1:
                raise Exception('more successors %d' % len(succ.successors))
            if not succ.successors: return state
            state, = succ.successors

However, this code is not really running the program concretely. It is still simulating. When the binary reads file/use getenv, the code above would just fail.

What we want is that given a binary and an input, angr could let binary concretely execute the input and in the meantime, dump the constraints with the addr (PC) that leads to the constraints. I checked tracing mode but I could not find any useful documentation. Would it be possible for you to provide an example for this kind of task?

Thanks

rhelmot commented 3 years ago

The specific thing you want, to instrument a real version of the code running natively to dump the constraints, is not implemented in angr. We generally don't believe it's worth the engineering effort. Instead, our usual technique for these sort of tasks is to dump a trace of the target program, for example, the basic block trace using qemu, and then to simulate this run in angr using some tool to follow the trace, for example, using the Tracer exploration technique. This enables us to work around some (most?) of the environment modeling issues by only modeling the parts of the environment that affect control flow and letting the rest be modeled by the default symbolic stubs (these can still affect control flow, but as long as they only introduce symbolic guard conditions the tracer will figure out how to constrain them based on the trace).

One thing you can do to improve environment modeling is to start your trace from a memory dump.

github-actions[bot] commented 2 years ago

This issue has been marked as stale because it has no recent activity. Please comment or add the pinned tag to prevent this issue from being closed.

github-actions[bot] commented 2 years ago

This issue has been closed due to inactivity.