AlexAltea / capstone.js

Capstone disassembler framework for JavaScript
https://alexaltea.github.io/capstone.js/
BSD 3-Clause "New" or "Revised" License

Capstone.disasm() offset - Higher part is ignored for offsets larger than 32-bit #15

Open eternaleclipse opened 2 years ago

eternaleclipse commented 2 years ago

Expected behavior

On recent Python capstone (4.0.2):

from capstone import *

CODE = b"\x55\x48\x8b\x05\xb8\x13\x00\x00"

md = Cs(CS_ARCH_X86, CS_MODE_64)
for i in md.disasm(CODE, 0x10000001234):
    print("0x%x:\t%s\t%s" %(i.address, i.mnemonic, i.op_str))

Output:

0x10000001234:  push    rbp
0x10000001235:  mov     rax, qword ptr [rip + 0x13b8]

I would expect the same on JavaScript.

Problem

Using x86-capstone.js from the latest release (capstone.js v3.0.5-RC1, from 2017):

var buffer = [0x55, 0x31, 0xD2, 0x89, 0xE5, 0x8B, 0x45, 0x08];
var offset = 0x10000001234;

var d = new cs.Capstone(cs.ARCH_X86, cs.MODE_32);
var instructions = d.disasm(buffer, offset);

instructions.forEach(instr => console.log(`0x${instr.address.toString(16)}: ${instr.mnemonic} ${instr.op_str}`));

d.close();

Output:

"0x1234: push ebp"
"0x1235: xor edx, edx"
"0x1237: mov ebp, esp"
"0x1239: mov eax, dword ptr [ebp + 8]"

The upper bits of the offset are dropped: 0x10000001234 becomes 0x1234. I assume this is because the outdated build does not support 64-bit offsets.
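The addresses in the output match what a 32-bit integer coercion of the offset would give. A minimal sketch of that arithmetic, assuming (not confirmed from the capstone.js source) that the compiled core receives the address as a 32-bit integer; `truncateToUint32` is a hypothetical helper name used only for illustration:

```javascript
// Assumption: the address is coerced to an unsigned 32-bit integer somewhere
// between the JavaScript wrapper and the compiled Capstone core.
// truncateToUint32 is a hypothetical helper, not part of capstone.js.
function truncateToUint32(offset) {
  // The >>> 0 idiom applies ToUint32, i.e. offset modulo 2^32.
  return offset >>> 0;
}

const offset = 0x10000001234; // fits in a Number (below 2^53)
const truncated = truncateToUint32(offset);

console.log(`0x${truncated.toString(16)}`); // 0x1234
```

If this assumption holds, any offset above 0xFFFFFFFF would lose its upper bits in the same way, regardless of the selected mode.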

AlexAltea commented 2 years ago

Yes, 64-bit offsets are not supported yet on Capstone.js.

In Unicorn.js, though, I used the following wrappers to get arbitrary precision in the API. It would be a good idea to apply them here: https://github.com/AlexAltea/unicorn.js/blob/master/src/libelf-integers.js

eternaleclipse commented 2 years ago

I'm thinking about solving this using BigInt.

I'll PR if I get any progress on this 🙂
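One way a BigInt-based approach could look, as a rough sketch only: split the 64-bit address into two 32-bit halves that can cross a 32-bit boundary, then rebuild it on the way back. `splitAddress` and `joinAddress` are hypothetical helpers, not part of capstone.js, and how the halves would actually be passed to the compiled core is left open here.

```javascript
const MASK32 = 0xFFFFFFFFn;

// Hypothetical helper: split a 64-bit address (BigInt or integer Number)
// into two unsigned 32-bit Numbers.
function splitAddress(address) {
  const value = BigInt.asUintN(64, BigInt(address)); // clamp to 64 bits
  return {
    lo: Number(value & MASK32), // low 32 bits
    hi: Number(value >> 32n),   // high 32 bits
  };
}

// Hypothetical helper: rebuild the 64-bit address from its two halves.
function joinAddress(lo, hi) {
  return (BigInt(hi >>> 0) << 32n) | BigInt(lo >>> 0);
}

const { lo, hi } = splitAddress(0x10000001234n);
const address = joinAddress(lo, hi);

console.log(`0x${address.toString(16)}`);        // 0x10000001234
console.log(`0x${(address + 1n).toString(16)}`); // 0x10000001235
```

With this shape, `instr.address` would be a BigInt, so the existing `toString(16)` formatting in the example above would keep working, while arithmetic on addresses would need BigInt operands.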

AlexAltea commented 2 years ago

Sure, that makes sense. I didn't know that BigInt had landed in major browsers, thanks!