AlexAltea / unicorn.js

Unicorn CPU emulator framework port for JavaScript
https://alexaltea.github.io/unicorn.js/
GNU General Public License v2.0
573 stars, 37 forks

HOOK_CODE causing uncaught exception: abort() #12

Closed typoon closed 7 years ago

typoon commented 7 years ago

Hi,

So I have the following code that is trying to do a HOOK_CODE on every instruction. If you run it, you will see that it causes an uncaught exception to be thrown and the script aborts. If you remove the push eax instruction from the code, it works as expected. If you put the push eax back and remove any code after it, it also works. The issue is having an instruction after the push instruction.

Any idea on what might be causing this?

Here is the whole error message I am getting:

uncaught exception: abort() at jsStackTrace@file:///tmp/unicorn.js/unicorn-x86.min.js:5:18821
stackTrace@file:///tmp/unicorn.js/unicorn-x86.min.js:5:18992
abort@file:///tmp/unicorn.js/unicorn-x86.min.js:26:7283
_abort@file:///tmp/unicorn.js/unicorn-x86.min.js:5:216266
cg@file:///tmp/unicorn.js/unicorn-x86.min.js:15:128565
Uc@file:///tmp/unicorn.js/unicorn-x86.min.js:13:112318
GQ@file:///tmp/unicorn.js/unicorn-x86.min.js:11:172100
invoke_iiiiiiii@file:///tmp/unicorn.js/unicorn-x86.min.js:5:296738
sI@file:///tmp/unicorn.js/unicorn-x86.min.js:17:112375
mg@file:///tmp/unicorn.js/unicorn-x86.min.js:16:34727
lg@file:///tmp/unicorn.js/unicorn-x86.min.js:16:34213
dO@file:///tmp/unicorn.js/unicorn-x86.min.js:11:79104
ccallFunc@file:///tmp/unicorn.js/unicorn-x86.min.js:5:9901
uc.Unicorn/this.emu_start@file:///tmp/unicorn.js/unicorn-x86.min.js:310:23
load_code@file:///tmp/unicorn.js/error.html:44:5
@file:///tmp/unicorn.js/error.html:71:1

If this abort() is unexpected, build with -s ASSERTIONS=1 which can give more information.

And the code to reproduce it:

<textarea id="asm_code">
    mov eax,100
    mov eax,102
    push eax
    mov ebx,103
</textarea>

<script src='unicorn-x86.min.js'></script>
<script src='keystone-x86.min.js'></script>

<script>
var MEMORY_SIZE = 4096;
var INITIAL_MEMORY = 0xa0000000 >>> 0;

// Engines
var my_uc;
var my_ks;
var log;

function load_code() {
    my_uc = new uc.Unicorn(uc.ARCH_X86, uc.MODE_32);
    my_ks = new ks.Keystone(ks.ARCH_X86, ks.MODE_32);

    var addr = INITIAL_MEMORY;
    var asm_code = document.getElementById("asm_code").value;

    my_ks.option(ks.OPT_SYNTAX, ks.OPT_SYNTAX_INTEL);
    code = my_ks.asm(asm_code);

    // Write registers and memory
    // 4k of memory for now
    my_uc.mem_map(addr, MEMORY_SIZE, uc.PROT_ALL);
    my_uc.mem_write(addr, code);

    // Setup stack
    my_uc.reg_write_i32(uc.X86_REG_ESP, addr + MEMORY_SIZE);

    // Setup hooks
    my_uc.hook_add(uc.HOOK_CODE, hook_code_2, "some data", 1, 0);

    // Start emulator
    var begin = addr;
    var until = addr + code.length;
    my_uc.emu_start(begin, until, 0, 0);

}

function hook_code_2(my_uc, addr_lo, addr_hi, size, used_data) {
    var address = addr_hi << 16 | addr_lo;

    var bytes = my_uc.mem_read(address, size);
    var binary = Array.from(bytes).map(toHex);

    var current_instruction = {
        "address": address,
        "size": size,
        "bytes": bytes,
        "binary_str": binary,
        "asm": 'return asmcode here... need capstone for that',
    }

    console.log("inside hook_code_2");
    console.log(current_instruction);
}

function toHex(n) {
    return n.toString(16).padStart(2, '0');
}

load_code();

</script>

Thanks!

AlexAltea commented 7 years ago

Just for the record. If you add the flags -s ASSERTIONS=2 -g4 to the build script [1], you will get more useful debug information. Also, -s SAFE_HEAP=1 is useful as well.

[1] https://github.com/AlexAltea/unicorn.js/blob/8a0e6bfc/build.py#L550-L553

AlexAltea commented 7 years ago

There are a couple of mistakes in your code (the latter is actually my fault due to the lack of proper documentation):

  1. EDIT: No, this is not the case. SP in x86 is decrement-then-write. Apologies for the brain fart. Original (retracted) claim: The stack pointer is wrongly computed. You place your stack at 0xA0000000 with size 4096, which means only addresses [0xA0000000, 0xA0000FFF] can be accessed. However, you set ESP to 0xA0001000: when push eax is executed, bytes [0xA0001000, 0xA0001003] are written, which results in an unmapped memory access.

  2. The arguments addr_hi and addr_lo are both needed only when dealing with 64-bit addresses. Each one is 32 bits in size (so the << 16 part in hook_code_2 is wrong), but since JavaScript doesn't support 64-bit integer types natively, you cannot really do addr_hi << 32 | addr_lo either. Since your code only uses 32-bit addresses, use only addr_lo. Sorry for not clearly stating what these arguments were in the first place.

    -var address = addr_hi << 16 | addr_lo;
    +var address = addr_lo;

Solving these two mistakes doesn't fix the issue though. I'll keep debugging this.
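
For readers who do need both halves, the point about addr_hi and addr_lo can be sketched as follows. This is only an illustration: combineAddress is a hypothetical helper, not part of unicorn.js, and it assumes the combined address fits in a JavaScript Number (below 2^53). It also shows that any bitwise OR produces a signed 32-bit result, so an address such as 0xa0000000 comes out negative. Whether that matters to unicorn.js is not established in this thread.

```javascript
// Hypothetical helper (not part of unicorn.js): rebuild an address from the
// two 32-bit halves passed to a HOOK_CODE callback.
function combineAddress(addrLo, addrHi) {
    // `>>> 0` reinterprets each half as an unsigned 32-bit integer.
    var lo = addrLo >>> 0;
    var hi = addrHi >>> 0;
    // Multiply instead of shifting: JavaScript shift counts are taken
    // modulo 32, so `hi << 32` would leave `hi` unchanged.
    return hi * 0x100000000 + lo;
}

// Bitwise OR converts its result to a signed 32-bit integer, so addresses
// at or above 0x80000000 become negative.
var signedAddress = 0 << 16 | 0xa0000000;
var unsignedAddress = combineAddress(0xa0000000, 0);

console.log(signedAddress < 0);            // true
console.log(unsignedAddress.toString(16)); // "a0000000"
```

The result is exact only while addrHi stays below 2^21; beyond that, a BigInt-based variant would be needed.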

typoon commented 7 years ago

Hi,

Thanks for getting back to me on this.

Regarding the first issue, the stack addressing: I believe this is not the case, since the stack grows downward on x86. When I do the push, ESP is decremented first, so it will in fact write the bytes 0xa0000FFF to 0xa0000FFC. I tried to troubleshoot this the same way, by subtracting some amount from the base just to be safe, but noticed that the error still persisted.
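
The arithmetic behind this can be checked with a small sketch. It does not touch the emulator: pushRange and isMapped are hypothetical helpers that only model a 32-bit push (decrement ESP by 4, then write 4 bytes) against the mapping used in the reproduction code.

```javascript
// Same values as in the reproduction code above.
var MEMORY_SIZE = 4096;
var INITIAL_MEMORY = 0xa0000000;

// Hypothetical model of a 32-bit `push`: ESP is decremented by 4 first,
// then 4 bytes are written starting at the new ESP.
function pushRange(esp) {
    var newEsp = esp - 4;
    return { esp: newEsp, first: newEsp, last: newEsp + 3 };
}

// Hypothetical check: is `addr` inside [INITIAL_MEMORY, INITIAL_MEMORY + MEMORY_SIZE)?
function isMapped(addr) {
    return addr >= INITIAL_MEMORY && addr < INITIAL_MEMORY + MEMORY_SIZE;
}

// ESP starts one byte past the end of the mapping, as in load_code().
var range = pushRange(INITIAL_MEMORY + MEMORY_SIZE);

console.log(range.first.toString(16)); // "a0000ffc"
console.log(range.last.toString(16));  // "a0000fff"
console.log(isMapped(range.first) && isMapped(range.last)); // true
```

Under this model, setting ESP to the end of the mapping keeps every byte written by the first push inside the mapped page.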

Regarding the second issue, you are right. addr_hi in this case is always 0, so it ends up that the logic I have there is pretty much the same as writing var address = addr_lo.

As you pointed out, these changes still do not fix the issue.

Sorry I didn't provide more details with the issue before; I haven't had the chance to build unicorn.js locally yet in order to add the flags that would give better details. Would it be possible to have a debug version of the .js file being served in the repository together with the production ones? That would help a lot when this kind of issue happens, as anyone reporting issues could run them through the debug build and provide more details, or maybe even figure out how to fix the issue themselves.

Thanks for putting the time into this!

AlexAltea commented 7 years ago

Apologies for the brain fart. Indeed, ESP is decrement-then-write, not the other way around, so your code was correct. There's still another small thing to take into account:

  1. The Keystone.js API changed recently to address https://github.com/AlexAltea/keystone.js/issues/1#issuecomment-280322369, so the return value of my_ks.asm(...) might differ depending on whether you are using a release <= v0.9.1 or the latest master build. If you are using the latter, you need to make the following change:
    -code = my_ks.asm(asm_code);
    +code = my_ks.asm(asm_code).mc;
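
Code that has to run against both variants could normalize the return value first. The sketch below assumes only what is stated above: older releases return the machine code directly, while newer builds return an object whose mc property holds it. extractMachineCode is a hypothetical helper, not part of keystone.js.

```javascript
// Hypothetical helper (not part of keystone.js): accept either return shape
// of my_ks.asm(...) and hand back the machine code bytes.
function extractMachineCode(result) {
    // Newer builds: an object carrying the bytes in its `mc` property.
    if (result && result.mc !== undefined) {
        return result.mc;
    }
    // Older releases: the bytes themselves.
    return result;
}

// Usage with stand-in values (0x50 is the encoding of `push eax`):
var oldStyle = extractMachineCode(new Uint8Array([0x50]));
var newStyle = extractMachineCode({ mc: new Uint8Array([0x50]) });

console.log(oldStyle.length === newStyle.length);
```

In the reproduction code this would be used as code = extractMachineCode(my_ks.asm(asm_code)).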

But another abort() is triggered after fixing this, this time presumably the one you are reporting. I'll debug it now. :-)

> Would it be possible to have a debug version of the .js file being served in the repository together with the production ones?

Makes sense. I'll put such a version online this evening. Thanks for the suggestion.

typoon commented 7 years ago

Oh, I am using the 0.9.1 release, so that did not affect me. But thanks for the pointer, will keep it in mind after updating to the latest master release.

AlexAltea commented 7 years ago

Fixed at 65eaa52fcb4351375623be45e325e7352cab6178 and 57d0397a67dfa9eb33234876926f6e63ca93e7c6. The dist folder contains the updated versions. I will now update the release files. :-)

PS: Too tired now to include a debug version. If this situation happens too often I'll do it though.

typoon commented 7 years ago

I tested the version in the dist/ folder and it seems to be working. Thanks for the help!