AlexAltea / unicorn.js

Unicorn CPU emulator framework port for JavaScript
https://alexaltea.github.io/unicorn.js/
GNU General Public License v2.0

Program counter skips every thousand bytes #8

Closed: flowergrass closed this issue 7 years ago

flowergrass commented 7 years ago

Hey, we're working on a MicroPython emulator in the browser using unicorn.js with the ARM Thumb instruction set. The binaries run well using Unicorn in C, but porting to JavaScript yielded strange bugs. On a BL instruction (f000 fd27) the PC goes to 0x9fa2 rather than the proper address (0x800ed54). In the document below we've recreated a similar bug. Every 16-bit Thumb instruction increments R0. The first emu_start() call works correctly, and so does the second. However, the third call, whose instruction limit of 0x200 reaches the 1024th byte, sets the PC to the magic value 0x9fa2.

Run the example below, uncommenting each of the three emu_start() lines in turn, to see the different behaviour.

<!DOCTYPE html>
<html>
  <body>
    <script src="unicorn-arm.min.js"></script>
    <script>
    var addr = 0x0;
    var size = 0x1000;
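    // Fill the buffer with the 16-bit Thumb instruction `adds r0, r0, #1`
    // (0x1c40, stored little-endian as 0x40 0x1c).
    var code = new Uint8Array(0x800);
    for (var i = 0; i < code.length; i += 2) {
      code[i] = 0x40;
      code[i + 1] = 0x1c;
    }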

    var e = new uc.Unicorn(uc.ARCH_ARM, uc.MODE_THUMB);
    e.reg_write_i32(uc.ARM_REG_R0, 0x0);
    e.mem_map(addr, size, uc.PROT_ALL);
    e.mem_write(addr, code);

    try {
      //e.emu_start(addr | 1, addr + code.length, 0, 0); // works fine
      //e.emu_start(addr | 1, addr + code.length, 0, 0x1ff); // works fine
      e.emu_start(addr | 1, addr + code.length, 0, 0x200); // jumps to 0x9fa2
    }
    catch (err) {
      console.log(err);
    }

    var r0 = e.reg_read_i32(uc.ARM_REG_R0);
    var pc = e.reg_read_i32(uc.ARM_REG_PC);
    console.log("PC = 0x" + pc.toString(16) + ", R0 = 0x" + r0.toString(16));
    </script>
  </body>
</html>

Let us know if you need more information in order to debug it. We did see other strange behaviour, but this was the most reproducible.
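For comparison, here is a small sketch of the state the test case should end in. It assumes every instruction in the buffer is 2 bytes long and adds 1 to R0, and that emulation stops after `count` instructions or at the end of the code, whichever comes first. The helper name `expectedState` is hypothetical and not part of unicorn.js.

```javascript
// Hypothetical helper: predicts the state after running the test case above.
// Assumes every instruction is 2 bytes long and increments R0 by one.
function expectedState(addr, codeLength, count) {
  var total = codeLength / 2; // number of instructions in the buffer
  // A count of 0 means "no instruction limit" for emu_start.
  var executed = (count === 0) ? total : Math.min(count, total);
  return { pc: addr + 2 * executed, r0: executed };
}

var s = expectedState(0x0, 0x800, 0x200);
console.log("expected PC = 0x" + s.pc.toString(16) + ", R0 = 0x" + s.r0.toString(16));
```

With the 0x200 limit this gives PC = 0x400 and R0 = 0x200, which is what the failing call should report instead of 0x9fa2.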

AlexAltea commented 7 years ago

This is caused by issue #3. Unfortunately, for various reasons, debugging Unicorn.js is really time-consuming, so I haven't worked on solving this yet. Sorry for the trouble.

dpgeorge commented 7 years ago

@AlexAltea thanks for the reply. I'm also working on this issue with @flowergrass and in our original code we didn't have a limit on time or instructions, but still got the same error (jumping to a random but reproducible location).

I had a closer look at it and enabled SAFE_HEAP=1 in the Emscripten compiler. This makes the program abort very early on due to an unaligned memory access in QEMU (Emscripten doesn't tolerate unaligned access, but x86 does, which explains why it works on the native machine but not in the browser). I managed to fix this memory access and then things seemed to work better. At least, the code posted above now works with the 0x200 limit, and also with other limits greater than this.
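To illustrate why an unaligned access can go wrong silently in Emscripten output, here is a minimal sketch of an asm.js-style heap. It assumes the usual pattern in which a 32-bit load becomes an index into a typed-array view (`HEAPU32[ptr >> 2]`), so the low bits of an unaligned pointer are simply dropped. The function names are hypothetical, and this is not the actual QEMU code path.

```javascript
// Model of an asm.js-style heap: one buffer with several typed views.
var buffer = new ArrayBuffer(16);
var HEAPU8 = new Uint8Array(buffer);
var HEAPU32 = new Uint32Array(buffer);

// Fill bytes 0..15 with their own index so loads are easy to recognise.
for (var i = 0; i < HEAPU8.length; i++) HEAPU8[i] = i;

// asm.js-style 32-bit load: the shift discards the two low pointer bits,
// so an unaligned pointer reads from the rounded-down address.
function loadU32Shifted(ptr) {
  return HEAPU32[ptr >> 2];
}

// Alignment-safe load: assemble the value byte by byte (little-endian).
function loadU32Bytes(ptr) {
  return (HEAPU8[ptr] | (HEAPU8[ptr + 1] << 8) |
          (HEAPU8[ptr + 2] << 16) | (HEAPU8[ptr + 3] << 24)) >>> 0;
}

console.log(loadU32Shifted(5) === loadU32Shifted(4)); // true: pointer 5 is read as 4
console.log("0x" + loadU32Bytes(5).toString(16));     // 0x8070605
```

A native x86 build performs the load at the exact address, while the shifted load returns the value stored at the rounded-down address without any error. That kind of difference is what SAFE_HEAP=1 turns into an abort.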

But running our original code with this fix leads to issues further on: some assertions fail in the core of QEMU.

AlexAltea commented 7 years ago

> in our original code we didn't have a limit on time or instructions, but still got the same error (jumping to a random but reproducible location).

Interesting, I will investigate the issue this weekend. I thought all unaligned accesses were patched, but recent updates of the unicorn submodule might have broken something.

dpgeorge commented 7 years ago

> I thought all unaligned accesses were patched, but recent updates of the unicorn submodule might have broken something.

We are using ARM arch in Thumb mode. The unaligned access is to do with QEMU/TCG generation of the exit_tb instruction, and then subsequent retrieval of this value (it does something like *(uint64_t*)byte_ptr).

AlexAltea commented 7 years ago

Sorry for the delay. It took a while to fix the TCG helpers. After running your test code above, I believe the issue is fixed. Could you please confirm it? The latest version is available at: https://github.com/AlexAltea/unicorn.js/releases/tag/v1.0

flowergrass commented 7 years ago

The patch was effective in fixing the isolated test case, but our code is still facing similar issues.

With the instruction limit parameter of emu_start set to 0, the MicroPython emulation supports basic operations (such as string manipulation, e.g. 'abc' + 'def'). However, entering integers or arithmetic results in an error. This case is hosted at http://micropython.org/resources/mp_unicorn/index.html

When an instruction limit for emu_start is set, for example the 0x10000 used in http://micropython.org/resources/mp_unicorn/index2.html, an error code identical to the previous example is encountered. The error in this case occurs at the same program counter regardless of the instruction limit, whereas the previous error occurs at an entirely different program counter on each occurrence.
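One way to narrow down where execution diverges is to run the code in fixed-size slices and record the PC after each one. The sketch below is a hypothetical helper, not part of unicorn.js. It relies only on the `emu_start(begin, until, timeout, count)` and `reg_read_i32(reg)` calls used in the test case earlier in this thread, and it is exercised here against a stand-in object rather than a real emulator. Whether the real PC register is reported with bit 0 set in Thumb mode is an assumption, so the helper masks it off.

```javascript
// Hypothetical helper: run Thumb code in slices of at most `chunk`
// instructions and record the PC after each slice.
// `pcReg` would be uc.ARM_REG_PC when used with unicorn.js.
function runInChunks(emu, pcReg, begin, until, chunk, maxSlices) {
  var trace = [];
  var pc = begin;
  for (var i = 0; i < maxSlices && pc !== until; i++) {
    emu.emu_start(pc | 1, until, 0, chunk);    // bit 0 set, as in the test case
    pc = (emu.reg_read_i32(pcReg) & ~1) >>> 0; // drop bit 0 in case it is reported
    trace.push(pc);
  }
  return trace;
}

// Stand-in emulator for illustration only: every instruction is 2 bytes long.
var fake = {
  pc: 0,
  emu_start: function (begin, until, timeout, count) {
    this.pc = Math.min((begin & ~1) + 2 * count, until);
  },
  reg_read_i32: function () { return this.pc; }
};

var trace = runInChunks(fake, 0, 0x0, 0x800, 0x200, 16);
console.log(trace.map(function (p) { return "0x" + p.toString(16); }));
```

With the stand-in, the trace is 0x400 followed by 0x800. Against a real emulator, the first slice whose PC leaves the expected range marks the region to inspect.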

AlexAltea commented 7 years ago

I will investigate and try to fix this issue over the weekend. Aside from the tcg_abort, there seem to be issues while flushing translation blocks.

flowergrass commented 7 years ago

How's it going @AlexAltea? Have you managed to confirm the issue? Hints would be greatly appreciated as we're looking at the problem ourselves. Cheers

AlexAltea commented 7 years ago

@flowergrass I tried to debug the issue last week without success. At some point, a nullptr dereference occurs in tcg.c, but I can't find the reason why this happens in the first place. I will need more time to fix it.

AlexAltea commented 7 years ago

Sorry for the terribly slow progress on this issue; these past weeks were quite hectic.

The issue is now "solved". It turns out that different objects were overlapping in memory due to https://github.com/kripken/emscripten/issues/4835. This caused all sorts of trouble and finding the root cause was kind of tricky. Updating the Emscripten SDK to >= 1.37.9 should solve the issue. Please confirm it when you have time. :-)
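As a rough illustration of how an allocation that is not page-aligned can end up overlapping its neighbour, consider a caller that assumes alignment and rounds its pointer up before use. This sketches the general failure mode only; it is not a reconstruction of the actual Emscripten or QEMU code. The page size of 0x1000, the addresses, and all names are assumptions.

```javascript
var PAGE = 0x1000; // assumed page size

// Round p up to the next multiple of a.
function alignUp(p, a) {
  return Math.ceil(p / a) * a;
}

// True if [aStart, aStart + aSize) and [bStart, bStart + bSize) intersect.
function overlaps(aStart, aSize, bStart, bSize) {
  return aStart < bStart + bSize && bStart < aStart + aSize;
}

// Two back-to-back one-page allocations starting at `first`. The caller
// rounds its pointer up to a page boundary and then uses a full page.
function usedRangeOverlapsNext(first) {
  var second = first + PAGE;       // next object placed right after the first
  var used = alignUp(first, PAGE); // range the caller actually touches
  return overlaps(used, PAGE, second, PAGE);
}

console.log(usedRangeOverlapsNext(0x10000)); // false: aligned, no overlap
console.log(usedRangeOverlapsNext(0x10010)); // true: unaligned, spills into neighbour
```

When the allocator returns aligned memory the rounding is a no-op, which would be consistent with the problem disappearing after the SDK update.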

I have updated files at dist and the v1.0 release.

dpgeorge commented 7 years ago

Thanks very much @AlexAltea for putting your time into this bug, it's greatly appreciated. I agree that debugging this is really tricky as there are multiple layers of translation and emulation going on (the browser implementing JS, Emscripten half translating C to JS and half emulating the C runtime, QEMU translating simulated machine code to its internal tiny code, QEMU interpreting its tiny code, and then in our case we also had a MicroPython interpreter running on top of all that!). It looks like we were using Emscripten 1.37.5 so it would have had the bug with mmap not being aligned. We'll try out the fixed version and let you know how it goes.

flowergrass commented 7 years ago

Awesome work @AlexAltea. The issues have been fixed. Thank you!

dpgeorge commented 7 years ago

@AlexAltea FYI we are now using unicorn.js in the MicroPython unicorn port, see https://github.com/micropython/micropython-unicorn and https://micropython.org/unicorn

AlexAltea commented 7 years ago

@dpgeorge Great to see the MicroPython demo finally working. Looks really cool! Thanks for sharing the link! :)