shellphish / driller

Driller: augmenting AFL with symbolic execution!

AttributeError: startswith #25

Open anon8675309 opened 7 years ago

anon8675309 commented 7 years ago

I'm back, and this time I'm drilling a binary which doesn't need any LD_PRELOAD junk; it's just a normal executable. This time I eventually get what looks to be a type confusion bug (because it looks like "name" should be a string, more specifically a filename), but it looks like the real problem might have happened quite a bit earlier...

<snip>
DEBUG   | 2017-03-21 17:47:29,144 | driller.Driller | found 409e50 -> 409e64 transition
DEBUG   | 2017-03-21 17:47:29,144 | driller.Driller | 409e50 -> 409e64 has already been encountered
ERROR   | 2017-03-21 17:47:29,339 | tracer.Tracer | the dynamic trace and the symbolic trace disagreed
ERROR   | 2017-03-21 17:47:29,339 | tracer.Tracer | [../targets/tcpdump-4.9.0_non-instumented/tcpdump] dynamic [0x409e64], symbolic [0x409ec7]
ERROR   | 2017-03-21 17:47:29,339 | tracer.Tracer | inputs was 'id:0'
ERROR   | 2017-03-21 17:47:29,339 | tracer.Tracer | TracerMisfollowError encountered
WARNING | 2017-03-21 17:47:29,340 | tracer.Tracer | entering no follow mode
DEBUG   | 2017-03-21 17:47:29,397 | driller.Driller | found 20bd070 -> 20bd3d8 transition
DEBUG   | 2017-03-21 17:47:29,397 | driller.Driller | 20bd070 -> 20bd3d8 has already been encountered
<snip>
DEBUG   | 2017-03-21 17:47:37,427 | driller.Driller | found 2079a40 -> 2079ac0 transition
DEBUG   | 2017-03-21 17:47:37,427 | driller.Driller | 2079a40 -> 2079ac0 has already been encountered
CRITICAL | 2017-03-21 17:47:37,449 | drill_it | startswith
Traceback (most recent call last):
  File "./drill_it.py", line 101, in <module>
    solutions = d.drill()
  File "/home/anon/.virtualenvs/driller/local/lib/python2.7/site-packages/driller/driller.py", line 110, in drill
    list(self._drill_input())
  File "/home/anon/.virtualenvs/driller/local/lib/python2.7/site-packages/driller/driller.py", line 194, in _drill_input
    branches = t.next_branch()
  File "/home/anon/.virtualenvs/driller/local/lib/python2.7/site-packages/tracer-0.1-py2.7.egg/tracer/tracer.py", line 339, in next_branch
    self.path_group = self.path_group.step(size=bbl_max_bytes)
  File "build/bdist.linux-x86_64/egg/angr/path_group.py", line 540, in step
    pg = pg._one_step(stash=stash, selector_func=selector_func, successor_func=successor_func, check_func=check_func, **kwargs)
  File "build/bdist.linux-x86_64/egg/angr/path_group.py", line 322, in _one_step
    r = self._one_path_step(a, successor_func=successor_func, check_func=check_func, **kwargs)
  File "build/bdist.linux-x86_64/egg/angr/path_group.py", line 212, in _one_path_step
    successors = a.step(**kwargs)
  File "build/bdist.linux-x86_64/egg/angr/path.py", line 205, in step
    self._make_successors(throw=throw)
  File "build/bdist.linux-x86_64/egg/angr/path.py", line 238, in _make_successors
    self._run = self._project.factory.successors(self.state, **self._run_args)
  File "build/bdist.linux-x86_64/egg/angr/factory.py", line 77, in successors
    r = engine.process(state, inline=inline,**kwargs)
  File "build/bdist.linux-x86_64/egg/angr/engines.py", line 88, in process
    ret_to=ret_to)
  File "build/bdist.linux-x86_64/egg/angr/engines.py", line 143, in process
    force_addr=force_addr)
  File "/home/anon/.virtualenvs/driller/lib/python2.7/site-packages/simuvex-6.7.1.31-py2.7.egg/simuvex/engines/procedure.py", line 34, in process
    force_addr=force_addr)
  File "/home/anon/.virtualenvs/driller/lib/python2.7/site-packages/simuvex-6.7.1.31-py2.7.egg/simuvex/engines/engine.py", line 44, in process
    self._process(new_state, successors, *args, **kwargs)
  File "build/bdist.linux-x86_64/egg/angr/engines.py", line 173, in _process
    return super(SimEngineHook, self)._process(state, successors, procedure, **kwargs)
  File "/home/anon/.virtualenvs/driller/lib/python2.7/site-packages/simuvex-6.7.1.31-py2.7.egg/simuvex/engines/procedure.py", line 71, in _process
    procedure.execute(state, successors, ret_to=ret_to)
  File "/home/anon/.virtualenvs/driller/lib/python2.7/site-packages/simuvex-6.7.1.31-py2.7.egg/simuvex/s_procedure.py", line 145, in execute
    r = run_func(*sim_args, **self.kwargs)
  File "/home/anon/.virtualenvs/driller/lib/python2.7/site-packages/simuvex-6.7.1.31-py2.7.egg/simuvex/procedures/syscalls/open.py", line 13, in run
    return self.state.posix.open(path, flags)
  File "/home/anon/.virtualenvs/driller/lib/python2.7/site-packages/simuvex-6.7.1.31-py2.7.egg/simuvex/plugins/posix.py", line 168, in open
    elif self.concrete_fs and not os.path.abspath(name).startswith("/dev"):
  File "/home/anon/.virtualenvs/driller/lib/python2.7/posixpath.py", line 360, in abspath
    if not isabs(path):
  File "/home/anon/.virtualenvs/driller/lib/python2.7/posixpath.py", line 54, in isabs
    return s.startswith('/')
  File "/home/anon/.virtualenvs/driller/lib/python2.7/site-packages/simuvex-6.7.1.31-py2.7.egg/simuvex/s_action_object.py", line 87, in __getattr__
    f = getattr(self.ast, attr)
  File "build/bdist.linux-x86_64/egg/claripy/ast/base.py", line 931, in __getattr__
    raise AttributeError(a)
AttributeError: startswith
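
For what it's worth, the AttributeError itself is easy to see in miniature: posix.open() hands a symbolic path (a claripy AST wrapped in a SimActionObject) to os.path.abspath(), which assumes a plain Python string. A minimal sketch of that failure mode (the FakeAST stand-in here is illustrative, not actual simuvex code):

import os

class FakeAST(object):
    # Stands in for a claripy AST: per base.py:931 in the traceback,
    # claripy's __getattr__ raises AttributeError for unknown attributes.
    def __getattr__(self, attr):
        raise AttributeError(attr)

os.path.abspath(FakeAST())   # isabs() calls .startswith('/') -> AttributeError: startswith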

I included the error above because that's where we seem to start going off the rails. I say this because my base address is 0x400000 and all the shared objects are above that... 20bd070 is unmapped memory (according to gdb when I manually run/debug this binary outside the whole Driller ecosystem). So I took a closer look at the instruction where this transition error occurred. Here's the relevant code:

0000000000409e50 <gmt2local>:
  409e50:       41 55                   push   r13
  409e52:       41 54                   push   r12
  409e54:       55                      push   rbp
  409e55:       53                      push   rbx
  409e56:       48 83 ec 18             sub    rsp,0x18
  409e5a:       48 85 ff                test   rdi,rdi
  409e5d:       48 89 7c 24 08          mov    QWORD PTR [rsp+0x8],rdi
  409e62:       74 5c                   je     409ec0 <gmt2local+0x70>

When I put a breakpoint on *0x409e50 and run the app in gdb, rdi is zero, so the jump should be taken and we should land at 0x409ec0. This is where gdb goes. The dynamic trace appears to go to 0x409e64 (doesn't take the jump), which seems wrong. This breakpoint is only hit once, and when I look at the stack trace it came from tcpdump.c:1533, which is timezone_offset = gmt2local(0);. Ergo we know that this call will never go to 0x409e64. Furthermore, this is the only place in the source where gmt2local() is ever called, which means that if we got here and rdi isn't zero, something has gone terribly wrong.

For the symbolic side, it went to 0x409ec7, which is in the same neighborhood as 0x409ec0, but not quite right either. For context, here's the jump target:

   0x0000000000409ec0 <+112>:   xor    edi,edi
   0x0000000000409ec2 <+114>:   call   0x402ac0 <time@plt>
   0x0000000000409ec7 <+119>:   mov    QWORD PTR [rsp+0x8],rax
   0x0000000000409ecc <+124>:   jmp    0x409e64 <gmt2local+20>
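
As a sanity check on the symbolic side, one could replay just this branch in angr with rdi forced to zero (a hedged sketch against the angr 6.x-era API shown in the traceback above; exact method names may vary by version):

import angr

project = angr.Project("../targets/tcpdump-4.9.0_non-instumented/tcpdump")
state = project.factory.blank_state(addr=0x409e50)   # entry of gmt2local
state.regs.rdi = 0                                   # match the real call: gmt2local(0)
succ = project.factory.successors(state)             # same factory call as in the traceback
print([hex(s.se.any_int(s.ip)) for s in succ.flat_successors])   # expect only 0x409ec0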

So I think we have some important questions about the dynamic and symbolic targets, however the biggest question of all is: How on Earth did we get to 0x2079a40?!

After scanning back in the logs a little bit, I noticed that this is not the first time we've been in the 2000000 range. Here's the first time I start seeing addresses in that range. So maybe this is some expected artifact of how driller does the dynamic execution...?

DEBUG   | 2017-03-21 17:47:27,656 | driller.Driller | found 404e16 -> 404e19 transition
DEBUG   | 2017-03-21 17:47:27,656 | driller.Driller | 404e16 -> 404e19 has already been encountered
DEBUG   | 2017-03-21 17:47:27,676 | driller.Driller | found 404e16 -> 404e16 transition
DEBUG   | 2017-03-21 17:47:27,677 | driller.Driller | 404e16 -> 404e16 has already been encountered
DEBUG   | 2017-03-21 17:47:27,759 | driller.Driller | found 208c850 -> 208ca70 transition
DEBUG   | 2017-03-21 17:47:27,760 | driller.Driller | 208c850 -> 208ca70 has already been encountered
DEBUG   | 2017-03-21 17:47:27,799 | driller.Driller | found 208c875 -> 208c896 transition
DEBUG   | 2017-03-21 17:47:27,799 | driller.Driller | 208c875 -> 208c896 has already been encountered
DEBUG   | 2017-03-21 17:47:27,892 | driller.Driller | found 208c8b0 -> 208c940 transition
DEBUG   | 2017-03-21 17:47:27,893 | driller.Driller | 208c8b0 -> 208c940 has already been encountered
DEBUG   | 2017-03-21 17:47:27,913 | driller.Driller | found 208c924 -> 208ca60 transition
DEBUG   | 2017-03-21 17:47:27,913 | driller.Driller | 208c924 -> 208ca60 has already been encountered
DEBUG   | 2017-03-21 17:47:27,964 | driller.Driller | found 404e3c -> 405762 transition
DEBUG   | 2017-03-21 17:47:27,964 | driller.Driller | 404e3c -> 405762 has already been encountered

In any case, I could use some help with some context here so I can get to the bottom of this and get the issue fixed, whatever it might be.

anon8675309 commented 7 years ago

Update: I tried including the patch to exclude angr internals from path exploration (https://github.com/shellphish/driller/pull/8) thinking it might avoid this 2000000 range stuff, but it did not make any difference.

rhelmot commented 7 years ago

Hey, so if you're gonna start using angr on real-world programs, you will run into the second-most serious problem endemic to all symbolic execution engines out there - environment support. We've put a hell of a lot of effort into trying to make angr's execution match a real machine's, but there are innumerable ways this can fall apart: errors or unsupported elements in the loading process, inconsistencies in syscall emulation, edge cases in file handling, cpuid (god forbid) - the list goes on and on. If you want to make angr applicable to real programs, the unfortunate state right now is that you really need to be able to start debugging angr and its internals to figure out where inconsistencies start cropping up. In this case, I don't think the internal addresses (those are mapped by project._extern_obj and project._syscall_obj, which are custom CLE backends mapped into the program's memory space to provide addresses for simprocedure hooks and syscalls) are what's tripping you up here; it's probably some more fundamental emulation inconsistency.

My toolkit for debugging misfollows usually looks something like this:

<snip>

Finally, I notice that the particular misfollow is near the time() function, which has always been a stickler. If you dump the symbols from libc and look at time, you'll see that it's not a normal function type; it's an IFUNC, a special kind of symbol which actually points to a resolver function that dynamically determines the correct pointer to the real function at runtime, usually based on cpuid, and returns it. It requires some cooperation from the dynamic linker to do this, and angr/cle cooperate as best they can, but it's a bit of a hack. The code to deal with this is in angr/simos.py. It looks like this code in particular was written before the introduction of project.hook_symbol... Maybe switching it to that would make things easier.
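
As a rough illustration of the hook_symbol idea (a user-level sketch rather than the actual simos.py fix; simuvex-era names, and ReturnUnconstrained just papers over time() with an unconstrained return value):

import angr, simuvex

project = angr.Project("tcpdump")   # path illustrative
# Hook the symbol so the IFUNC resolver (and its cpuid games) never runs;
# every call to time() returns a fresh unconstrained symbolic value instead.
project.hook_symbol('time', simuvex.SimProcedures['stubs']['ReturnUnconstrained'])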

anon8675309 commented 7 years ago

project.loader.whats_at() is the best. The 0x200* addresses turned out to be libc.so, and I also found the angr syscalls and angr externs objects in there.
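
For anyone else hitting this, the lookup is one line (a sketch, using an address from the log above):

import angr

project = angr.Project("../targets/tcpdump-4.9.0_non-instumented/tcpdump")
# Map a mystery address back to whatever object is loaded there; libc.so,
# the angr syscalls object, and the angr externs object all show up this way.
print(project.loader.whats_at(0x208c850))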

What's a reasonable way to drop into a python console when path exploration gets to a particular place in the target binary? Right now I'm doing something unreasonable: I'm hacking an if current.addr == 0x409ec7: raise Exception("Horrible hack to pop a debugger") into tracer.next_branch() and then running ipython -i tcpdump.py (tcpdump.py being a hacked-together file used only for debugging this one problem), as spelled out below. Can I accomplish this somehow with hooks? I'll still want to hand-jam python code to poke around, but having everything in a file makes it easy to start over and get repeatable runs.
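
Spelled out, the hack is just this (verbatim from the description above; 0x409ec7 is the symbolic-side target from earlier):

# Inside tracer.next_branch(), before stepping:
if current.addr == 0x409ec7:
    raise Exception("Horrible hack to pop a debugger")

# Then run the driver under IPython so the shell survives the exception:
#   ipython -i tcpdump.py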

Finally, I'm seeing some weird stuff here.

In [22]: ["%#x" % x for x in current.addr_trace.hardcopy]
Out[22]: ['0x409e50']
In [23]: print(current.callstack.dbg_repr())
0 | 0x4057d6 -> 0x409e50, returning to 0x4057dd
1 | 0x5000080 -> 0x404db0, returning to 0x5000080
2 | 0x406900 -> 0x4029a0, returning to 0x406929
3 | None -> 0x5000350, returning to -0x1

The stack trace all looks fine. It's angr externs -> _start -> __libc_start_main -> main -> gmt2local (the function I care about). However, what's going on with the addr_trace? Shouldn't that be showing me all the basic blocks which were executed on the way to where we are now? If not, where can I get that info?

I think I'm starting to get the hang of tracking these things down, so hopefully I'll get to the bottom of this and we can get this issue closed out.

rhelmot commented 7 years ago

If you want to launch a debug shell using the hack method you described, import ipdb; ipdb.set_trace() is the standard. ipdb is an extension of pdb, which is a gdb-style debug shell for python.

The non-hacky way is state.inspect.b('instruction', instruction=0x1234, action=what), where what can be a function to call, or the string "ipython" or "ipdb" to launch either of those shells. Read about the SimInspect breakpoint stuff in the angr documentation.
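
Concretely, given a SimState named state and the address from this thread (a minimal sketch):

def what(state):
    import ipdb; ipdb.set_trace()   # poke around, then continue as usual

state.inspect.b('instruction', instruction=0x409ec7, action=what)

# or, skipping the callback entirely:
state.inspect.b('instruction', instruction=0x409ec7, action='ipdb')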

I have no idea what's going on with your address trace, it definitely should be showing more addresses than that.