angr / angr

A powerful and user-friendly binary analysis platform!
http://angr.io
BSD 2-Clause "Simplified" License

Empty IRSB passed to SimIRSB for thumb instruction #83

Open nshp opened 8 years ago

nshp commented 8 years ago

I'm seeing this error when trying to step through one particular instruction (below):

import StringIO
import archinfo
import logging
import pyvex
import angr
import cle

from capstone import *

logging.getLogger('angr.lifter').setLevel(logging.DEBUG)

md = Cs(CS_ARCH_ARM, CS_MODE_THUMB)

CODE = '\x96\xe8\x0a\xe0'
ARCH = archinfo.ArchARMEL()

print '\n'.join(i.mnemonic + ' ' + i.op_str for i in md.disasm(CODE, 0x1))
# -> ldm.w r6, {r1, r3, sp, lr, pc}

# Manually loading the IRSB seems to work?
irsb = pyvex.IRSB(CODE, 0x1, ARCH, bytes_offset=1, num_bytes=4, traceflags=0)
irsb.pp()

ld = cle.Loader(StringIO.StringIO(CODE), main_opts={'custom_arch': ARCH})
b = angr.Project(ld)

# Blank state in thumb mode
state = b.factory.blank_state(addr=0x1)

try: print b.factory.sim_block(state)
except Exception as e:
    logging.exception(e)
# Traceback (most recent call last):
#   File "angrtest.py", line 22, in <module>
#     print b.factory.sim_block(state)
#   File "/home/wrk/.local/lib/python2.7/site-packages/angr/factory.py", line 77, in sim_block
#     last_stmt=last_stmt)
#   File "/home/wrk/.local/lib/python2.7/site-packages/simuvex/vex/irsb.py", line 37, in __init__
#     raise SimIRSBError("Empty IRSB passed to SimIRSB.")
# simuvex.s_errors.SimIRSBError: Empty IRSB passed to SimIRSB.

# Wait... what? This is different from above?
irsb = pyvex.IRSB(CODE, 0x1, ARCH, bytes_offset=1, num_bytes=4, traceflags=0)
irsb.pp()

There's something else weird going on here - IDA refuses to disassemble that instruction, while both objdump and capstone disassemble it as above.

Edit: GAS also won't assemble that instruction, because "LR and PC should not both be in register list." Hmm.

nshp commented 8 years ago

I encountered this again with opcode \xb8\xe8\x4e\xb0. More interestingly, I noticed that the IRSB pyvex produces actually changes after calling sim_block. I've updated the code snippet above to show that.

Another edit: it changes because Lifter.lift calls pyvex.set_iropt_level(1). This IRSB contains statements only at opt level 0.

o_O

nshp commented 8 years ago

Tracked this down to priv/guest_arm_toIR.c:19294 in vex (for the second opcode)

      if (rL13 == 1)
         valid = False;

In other words, if SP is in the register list of a Thumb LDM/STM, vex treats the instruction as invalid and gives up. That's fair for both of these instructions in ARMv7 -- they are technically deprecated. They sure do work on hardware though, and compilers still emit them. :-1:
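To make the check quoted above concrete, here is a minimal sketch in plain Python (no vex or capstone needed) that pulls the register list out of the two opcodes from this thread. It assumes the four bytes are already known to be a 32-bit Thumb-2 LDM/STM, stored as two little-endian halfwords with the register list in the second one. The function names are made up for illustration and are not part of vex or angr.

```python
import struct

# Register names indexed by register number (r13 = sp, r14 = lr, r15 = pc).
REG_NAMES = ['r%d' % i for i in range(13)] + ['sp', 'lr', 'pc']


def decode_reglist(code):
    """Return (base_register, register_names) for a 4-byte Thumb-2 LDM/STM.

    Assumption: `code` really is a 32-bit Thumb-2 LDM/STM encoding; this
    helper does not verify the opcode bits.
    """
    hw1, hw2 = struct.unpack('<HH', code)
    base = REG_NAMES[hw1 & 0xF]  # Rn is the low nibble of the first halfword
    regs = [REG_NAMES[i] for i in range(16) if hw2 & (1 << i)]
    return base, regs


def sp_in_reglist(code):
    """True if bit 13 (SP) of the register list is set."""
    _, hw2 = struct.unpack('<HH', code)
    return bool(hw2 & (1 << 13))


def lr_and_pc_in_reglist(code):
    """True if both bit 14 (LR) and bit 15 (PC) are set."""
    _, hw2 = struct.unpack('<HH', code)
    return (hw2 & 0xC000) == 0xC000


for opcode in (b'\x96\xe8\x0a\xe0', b'\xb8\xe8\x4e\xb0'):
    print(decode_reglist(opcode), sp_in_reglist(opcode), lr_and_pc_in_reglist(opcode))
```

Both opcodes have SP in the list, which is the condition the vex check rejects. Only the first one also has LR and PC together, which is what GAS complained about.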

In any case, this looks like a problem with upstream vex, so feel free to close this.

ltfish commented 8 years ago

@nshp Thanks for figuring this out!

@zardus Do we want to patch vex for it? Personally I'm down to close the issue for now.

tyb0807 commented 7 years ago

Hello,

I have found the same issue when trying to run symbolic execution on an ARM Thumb-2 binary.

`import angr
import claripy

def sym_exec_test():
 p = angr.Project('bin/code_ptc', load_options={'auto_load_libs': False})

 # main() address
 func_addr = 0x10300 # Thumb mode

 # g_userPin address
 tgt1_addr = 0x20714 # Thumb mode

 # g_ptc address
 tgt2_addr = 0x20710 # Thumb mode

 tgt1 = claripy.BVS('g_userPin', 32)
 tgt2 = claripy.BVS('g_ptc'    ,  8)

 irsb = p.factory.block(func_addr).vex
 irsb.pp()

 # Here's the output:
 # IRSB {
 # t0:Ity_I32
 # 00 | ------ IMark(0x10300, 0, 0) ------
 # NEXT: PUT(pc) = 0x00010300; Ijk_NoDecode 
 # }

 s = p.factory.blank_state(addr=func_addr)
 print s

 # Here's the output:
 # <simuvex.s_state.SimState object at 0x7f23d25faa00>

 sirsb = p.factory.sim_block(s)
 print sirsb

# I got the same error here: raise SimIRSBError("Empty IRSB passed to SimIRSB.")

`

And here's what's at address 0x10300 (from objdump):

10300: b508 push {r3, lr}

Furthermore, I can run the CFGAccurate analysis on it, but symbolic execution throws this error. Any idea why? These two analyses both work on SimIRSBs, don't they? I read in #71 that angr uses capstone to lift Thumb blocks for CFG; does that mean another approach is used for symbolic execution? Can anyone give me a hint on how to debug this, please?

ltfish commented 7 years ago

Please try using an odd address for Thumb mode code. For example, instead of 0x10300, try 0x10301.
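As a rough illustration of the convention, assuming the low bit of the address is what marks Thumb mode as in the example above, the conversion is just bit twiddling. The helper names here are hypothetical and are not part of angr.

```python
def to_thumb_addr(addr):
    """Set the low bit so the address is interpreted as Thumb code."""
    return addr | 1


def is_thumb_addr(addr):
    """True if the address has the Thumb bit (bit 0) set."""
    return bool(addr & 1)


def real_addr(addr):
    """Clear the low bit to get the address where the bytes actually live."""
    return addr & ~1


func_addr = 0x10300  # address of main() as shown by objdump
print(hex(to_thumb_addr(func_addr)))
```

The odd value is then what gets passed wherever a code address is expected, for example as the `addr` argument of `blank_state`, while the even value is what objdump shows.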

tyb0807 commented 7 years ago

Thanks for your prompt reply. It worked. I've read about this odd/even address convention for ARM Thumb. Does it just mean that there's no bug here? Can we close this issue?

ltfish commented 7 years ago

What you see is not the bug reported in this thread. I won't close it before we get to patch VEX to resolve the underlying issue.