keystone-engine / keystone

Keystone assembler framework: Core (Arm, Arm64, Hexagon, Mips, PowerPC, Sparc, SystemZ & X86) + bindings
http://www.keystone-engine.org
GNU General Public License v2.0

incorrect result with sym_resolver #351

Open Himyth opened 6 years ago

Himyth commented 6 years ago

To test the sym_resolver interface, I wrote the following script:

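# Python 2 script
from keystone import *

# Resolver callback: store the symbol's address in addr[0] and
# return True if the symbol was handled, False otherwise.
def resolver(sym, addr):
    if sym == 'L2':
        addr[0] = 0xdead
        return True
    return False

ks = Ks(KS_ARCH_X86, KS_MODE_32)
ks.sym_resolver = resolver
print ks.asm("jmp L2", addr=0, as_bytes=True)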

What I want is "jmp 0xdead", but the current Keystone engine returns "\xe9\xac\xde\x00\x00", which decodes to "jmp 0xdeb1". That is not the expected result. I think the cause is in https://github.com/keystone-engine/keystone/blob/master/llvm/lib/MC/MCAssembler.cpp:

  if (const MCSymbolRefExpr *A = Target.getSymA()) {
    const MCSymbol &Sym = A->getSymbol();
    bool valid;
    if (Sym.isDefined()) {
      Value += Layout.getSymbolOffset(Sym, valid);
      if (!valid) {
        KsError = KS_ERR_ASM_FIXUP_INVALID;
        return false;
      }
    } else {
        // a missing symbol. is there any resolver registered?
        if (KsSymResolver) {
            uint64_t imm;
            ks_sym_resolver resolver = (ks_sym_resolver)KsSymResolver;
            if (resolver(Sym.getName().str().c_str(), &imm)) {
                // resolver handled this symbol
                Value = imm;
                IsResolved = true;
            } else {
                // resolver did not handle this symbol
                KsError = KS_ERR_ASM_SYMBOL_MISSING;
                return false;
            }
        } else {
            // no resolver registered
            KsError = KS_ERR_ASM_SYMBOL_MISSING;
            return false;
        }
    }
  }

After the resolver finds the address, the code assigns imm to Value directly. In the other branch, where the symbol is defined, it instead does "Value += Layout.getSymbolOffset(Sym, valid);". That is where the two paths differ: the assignment discards whatever Value already held, while the addition keeps it. Maybe it should be "Value += imm" when the resolver is used?
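The difference between the two branches can be illustrated with a simplified Python model of the fixup arithmetic. This is only a sketch, not Keystone's code: the function name `evaluate_fixup` is hypothetical, and the model assumes that Value starts at a constant addend of -4 (the bias an x86 backend typically applies to a 4-byte PC-relative field) and that the fixup offset within the instruction (1 for "jmp rel32" at address 0) is subtracted afterwards.

```python
def evaluate_fixup(sym_value, addend, fixup_offset, accumulate):
    """Simplified, hypothetical model of a PC-relative fixup evaluation."""
    value = addend                 # Value starts as the expression's constant part
    if accumulate:
        value += sym_value         # proposed behaviour: Value += imm
    else:
        value = sym_value          # current behaviour: Value = imm (addend is lost)
    value -= fixup_offset          # PC-relative step: subtract the fixup's offset
    return value & 0xFFFFFFFF      # keep the low 32 bits, as stored in rel32


ADDEND = -4        # assumption: bias for a 4-byte PC-relative field
FIXUP_OFFSET = 1   # assumption: rel32 starts one byte after the 0xE9 opcode

current = evaluate_fixup(0xdead, ADDEND, FIXUP_OFFSET, accumulate=False)
patched = evaluate_fixup(0xdead, ADDEND, FIXUP_OFFSET, accumulate=True)
print(hex(current), hex(patched))
```

Under these assumptions the model yields 0xdeac for the assignment and 0xdea8 for the addition, the same displacements as the rel32 bytes reported in this issue.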

I recompiled Keystone with that change, and it now gives ('\xe9\xa8\xde\x00\x00', 1L) as the result, which is "jmp 0xdead".
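The two encodings quoted in this issue can be checked by hand. The sketch below decodes an x86 "jmp rel32" (opcode 0xE9) and computes its absolute target; the helper name `jmp_rel32_target` is made up for illustration and is not part of Keystone.

```python
import struct


def jmp_rel32_target(code, addr=0):
    """Return the absolute target of an x86 `jmp rel32` (opcode 0xE9)."""
    data = bytearray(code)
    if len(data) != 5 or data[0] != 0xE9:
        raise ValueError("expected a 5-byte jmp rel32 instruction")
    # rel32 is a signed little-endian displacement, relative to the
    # address of the next instruction (addr + 5).
    (rel32,) = struct.unpack("<i", bytes(data[1:5]))
    return (addr + len(data) + rel32) & 0xFFFFFFFF


before = jmp_rel32_target(b"\xe9\xac\xde\x00\x00")  # bytes from the unpatched build
after = jmp_rel32_target(b"\xe9\xa8\xde\x00\x00")   # bytes from the patched build
print(hex(before), hex(after))
```

The unpatched bytes decode to a target of 0xdeb1 and the patched bytes to 0xdead, consistent with the description above.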

Could this be the fix?

iflody commented 6 years ago

I ran into the same bug, please fix it! Thank you!

sasdf commented 6 years ago

Your patch also fixes the issue when using x64 RIP-relative addressing (e.g. lea rdi, [rip + label]) with a symbol resolver. Thanks a lot!

AlexAltea commented 4 years ago

Same bug.