jart / blink

tiniest x86-64-linux emulator
ISC License

Porting to webassembly #8

Open saolsen opened 1 year ago

saolsen commented 1 year ago

Hey,

I'm thinking about trying to get blink running in the browser (via webassembly). My goal isn't just to run C code in the browser (I could just use emscripten for that) but to actually run APE x64 executables in an interpreter, where I can build a debugger and some visualization tools to see what's happening (similar to what blink already does). I'm wondering whether anybody has tried to compile blink for webassembly, or what you think some of the challenges would be.

jart commented 1 year ago

Sounds awesome. I haven't tried it myself. I imagine the only major obstacle might be 32-bit. JavaScript only supports 32-bit integers. Blink runs on i386 and other 32-bit CPUs, and is regularly tested on them. However, Blink was created in a 64-bit world and has a 64-bit bias, so you might not get optimal performance out of Blink in a 32-bit environment.

unicomp21 commented 1 year ago

This would be awesome!!!

unicomp21 commented 1 year ago

https://github.com/WebAssembly/tail-call/issues/15#issuecomment-1357820841

We're still waiting on tail calls for webassembly, lol. Wish there was another performant option.

unicomp21 commented 1 year ago

Someday we'll be able to use co_await in webassembly, hope I'm still around to see it happen, lol.

https://github.com/WebAssembly/tail-call/issues/14#issue-1058389687

unicomp21 commented 1 year ago

https://github.com/emscripten-core/emscripten/issues/10991#issuecomment-974226917

jart commented 1 year ago

Blink doesn't need tail calls.

Also, good news! Blink is now stable on 32-bit platforms. I don't know if webassembly is multi-core, but the threading issues Blink was having earlier on 32-bit have been resolved.

Your biggest obstacle is most likely going to be completely replacing everything in blink/syscall.c so it interfaces with WASI or something similar instead.
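
As a rough illustration of what "replacing syscall.c" involves, here is a minimal sketch of a guest syscall dispatcher. It is not Blink's code: the names Guest, SysWrite, SysGetpid, and DispatchSyscall are hypothetical, and the host side uses plain POSIX calls where a WASI port would use the WASI equivalents. The only facts assumed are the x86-64 Linux syscall numbers (write is 1, getpid is 39) and the kernel convention of returning a negated errno on failure.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

// Hypothetical view of guest memory as one flat buffer.
struct Guest {
  uint8_t *mem;
  size_t size;
};

// write(fd, addr, len): validate the guest range, then forward to the host.
static int64_t SysWrite(struct Guest *g, int64_t fd, int64_t addr,
                        int64_t len) {
  if (addr < 0 || len < 0 || (uint64_t)addr > g->size ||
      (uint64_t)len > g->size - (uint64_t)addr) {
    return -EFAULT;  // guest pointer falls outside guest memory
  }
  ssize_t rc = write((int)fd, g->mem + addr, (size_t)len);
  return rc < 0 ? -errno : rc;
}

static int64_t SysGetpid(void) {
  return getpid();
}

// Map a guest syscall number onto a host implementation.
// Anything not handled reports ENOSYS, like the "missing syscall" log lines.
int64_t DispatchSyscall(struct Guest *g, int64_t nr, int64_t a, int64_t b,
                        int64_t c) {
  switch (nr) {
    case 1:  // write on x86-64 Linux
      return SysWrite(g, a, b, c);
    case 39:  // getpid on x86-64 Linux
      return SysGetpid();
    default:
      return -ENOSYS;
  }
}
```

Porting then becomes a matter of swapping the body of each handler for whatever the target runtime offers, while the dispatcher and the guest-facing contract stay the same.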

unicomp21 commented 1 year ago

https://github.com/copy/v86

unicomp21 commented 1 year ago

@gornishanov @tlively @madmongo1 @tomoliv30 any ideas on easiest way to implement linux syscall.c for webassembly?

https://github.com/jart/blink/issues/8#issuecomment-1370735621

https://github.com/WebKit/WebKit/pull/2065

unicomp21 commented 1 year ago

Dumb/crazy question, what sort of perf hit if blink runs nested within itself? I'm wondering if the outer vm could implement syscall.c for the inner vm? And handle threading etc. using co_await? Then outer vm could provide stupid simple interfaces for tunneling packets, etc.?

jart commented 1 year ago

It's possible to run Blink within Blink. There's noticeable slowdown, but it's not a showstopper. There are caveats: only one of the nested Blink instances can make use of the linear memory optimization. The other nestings need to pass blink -m to turn it off, otherwise the memory allocations will collide.

Vogtinator commented 1 year ago

any ideas on easiest way to implement linux syscall.c for webassembly?

emscripten provides a surprisingly wide set of runtime APIs which might even be enough to "just work" already.

Vogtinator commented 1 year ago

Also, good news! Blink is now stable on 32-bit platforms. I don't know if webassembly is multi-core, but the threading issues Blink was having earlier on 32-bit have been resolved.

With web workers and SharedArrayBuffers it's possible to effectively have threads, and emscripten even exposes those through pthreads as much as possible.

any ideas on easiest way to implement linux syscall.c for webassembly?

emscripten provides a surprisingly wide set of runtime APIs which might even be enough to "just work" already.

I gave it a quick try and it does almost build out-of-the-box with just emmake make -j8. It just complains about sa_len and wait4 missing. With those worked around I get a blink.wasm of unknown quality, not tested.

jart commented 1 year ago

I got it to compile too. When I tried to run Blink in Node, this happened, and I have no idea what it means.

master jart@turfwar:~/blink$ node o//blink/blink
/home/jart/blink/o/blink/blink:2148
  function ___invoke_$struct_Machine*_$struct_System*_$struct_Machine*(
                                    ^

SyntaxError: Unexpected token '*'
    at wrapSafe (internal/modules/cjs/loader.js:915:16)
    at Module._compile (internal/modules/cjs/loader.js:963:27)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1027:10)
    at Module.load (internal/modules/cjs/loader.js:863:32)
    at Function.Module._load (internal/modules/cjs/loader.js:708:14)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:60:12)
    at internal/main/run_main_module.js:17:47

I configured the Makefile to build an HTML file and ran it in the browser. I got the same thing.

image

I'm going to push the fixes I made right now. Could you please help me figure out what's wrong?

Vogtinator commented 1 year ago

You might need newer LLVM or emscripten: https://github.com/emscripten-core/emscripten/issues/12551

I'm using emscripten main with clang 15.0.6 (should be 16 meanwhile, but it works :shrug:)

Vogtinator commented 1 year ago

Works!

/tmp/blink> node o/blink/blink -m /cwd/third_party/cosmo/tinyhello.elf
hello world

The WASM page size is 64KiB, which means that https://github.com/jart/blink/issues/14 happens, but by doing the awful hack of just pretending that it's actually 4096 it can even run busybox-static from the host here:

/tmp/blink> (cd /usr/bin; node /tmp/blink/o/blink/blink -m /cwd/busybox-static sh -c "echo Hello world!")
I2023-01-04T19:25:47.197000:blink/syscall.c:2610:42 missing syscall 0x111
I2023-01-04T19:25:47.198000:blink/syscall.c:2610:42 missing syscall 0x14e
warning: unsupported syscall: __syscall_prlimit64

I2023-01-04T19:25:47.200000:blink/syscall.c:1857:42 getrandom() flags not supported yet
Hello world!

It did require some hacks and workarounds though:

diff --git a/blink/blink.c b/blink/blink.c
index f6c0506..a241fd0 100644
--- a/blink/blink.c
+++ b/blink/blink.c
@@ -155,8 +155,10 @@ static void HandleSigs(void) {
   unassert(!sigaction(SIGSEGV, &sa, 0));
 #endif
 }
-
+#include <emscripten.h>
 int main(int argc, char *argv[], char **envp) {
+  EM_ASM({FS.mkdir('/cwd'); FS.mount(NODEFS, {root : '.'}, '/cwd');});
+  if(!envp) envp = environ;
   g_blink_path = argc > 0 ? argv[0] : 0;
   GetOpts(argc, argv);
   if (optind_ == argc) PrintUsage(argc, argv, 48, 2);
diff --git a/blink/debug.h b/blink/debug.h
index a996413..5e64aa9 100644
--- a/blink/debug.h
+++ b/blink/debug.h
@@ -10,7 +10,7 @@
 #define IB(x)                      \
   __extension__({                  \
     __typeof__(x) x_ = (x);        \
-    unassert((intptr_t)x_ > 4096); \
+    unassert((intptr_t)x_ > 0); \
     x_;                            \
   })
 #else
diff --git a/blink/memorymalloc.c b/blink/memorymalloc.c
index 3d0113b..b0b9266 100644
--- a/blink/memorymalloc.c
+++ b/blink/memorymalloc.c
@@ -64,6 +64,10 @@ void FreeBig(void *p, size_t n) {

 void *AllocateBig(size_t n) {
   void *p;
+#ifdef __EMSCRIPTEN__
+  p = Mmap(NULL, n, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0, "big");
+  return p != MAP_FAILED ? p : 0;
+#endif
   u8 *brk;
   if (!(brk = atomic_load_explicit(&g_allocator.brk, memory_order_relaxed))) {
     // we're going to politely ask the kernel for addresses starting

I built it with emmake make -j8 && emcc -g o//blink/blink.o o//blink/blink.a -lm -pthread -lrt -o o//blink/blink -s INITIAL_MEMORY=1073741824 -s EXIT_RUNTIME=1 -lnodefs.js

jart commented 1 year ago

Wow. I'm still catching up. Quick question. Did you have any problems with wait4? My build is complaining about that being undefined.

jart commented 1 year ago

I made a bunch more changes and I'm now blocked on this error.

master jart@turfwar:~/blink$ node o//blink/blink
requested a shared WebAssembly.Memory but the returned buffer is not a SharedArrayBuffer, indicating that while the browser has SharedArrayBuffer it does not have WebAssembly threads support - you may need to set a flag
(on node you may need: --experimental-wasm-threads --experimental-wasm-bulk-memory and/or recent version)
/home/jart/blink/o/blink/blink:163
      throw ex;
      ^

Error: bad memory
    at Object.<anonymous> (/home/jart/blink/o/blink/blink:820:13)
    at Module._compile (internal/modules/cjs/loader.js:1085:14)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:1114:10)
    at Module.load (internal/modules/cjs/loader.js:950:32)
    at Function.Module._load (internal/modules/cjs/loader.js:790:12)
    at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:76:12)
    at internal/main/run_main_module.js:17:47
$?=7 master jart@turfwar:~/blink$ type node
node is hashed (/home/jart/vendor/emsdk/node/14.18.2_64bit/bin/node)

I got a little further in the browser. Not sure what to do next.

image

Vogtinator commented 1 year ago

Wow. I'm still catching up. Quick question. Did you have any problems with wait4? My build is complaining about that being undefined.

Yeah, I had to stub that out. FWICT that's a bug in emscripten in some way, as it's available in the system headers and there's also a stub for __syscall_wait4.
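
The thread doesn't show the stub itself, so the following is only a guess at what such a workaround could look like. StubWait4 is a hypothetical name, and reporting ECHILD is an assumption chosen because a process that can never fork has no children to wait for.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/types.h>

// Stand-in for wait4() on a platform without child processes.
// Always fails with ECHILD, i.e. "no child processes exist".
pid_t StubWait4(pid_t pid, int *wstatus, int options, struct rusage *ru) {
  (void)pid;
  (void)wstatus;
  (void)options;
  (void)ru;
  errno = ECHILD;
  return -1;
}
```

In a real build the stub would be compiled only under __EMSCRIPTEN__ and named wait4 so the linker resolves the missing symbol.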

I made a bunch more changes and I'm now blocked on this error.

Might just be the version of node, it might not support SharedArrayBuffer backed memory for WASM. I'm using v19.3.0 here.

I got a little further in the browser. Not sure what to do next.

That's where the "fun" part starts: Somehow making use of blink.js in the web page. Here's a minimal PoC:

pre.js:

function fileLoad(event, filename) {
    var file = event.target.files[0];
    var reader = new FileReader();
    reader.onloadend = function(event) {
      if(event.target.readyState == FileReader.DONE)
        FS.writeFile(filename, new Uint8Array(event.target.result), {encoding: 'binary'});
    };
    reader.readAsArrayBuffer(file);
}

let fileInput = document.createElement("input");
fileInput.setAttribute("type", "file");
fileInput.onchange = (event) => { fileLoad(event, "executable"); };
document.body.appendChild(fileInput);

let button = document.createElement("button");
button.innerText = "Start";
button.onclick = () => { Module.callMain(["executable"]); };
document.body.appendChild(button);

Build with emcc o//blink/blink.o o//blink/blink.a -lm -pthread -lrt -o o//blink/blink.html -s INVOKE_RUN=0 -s EXPORTED_RUNTIME_METHODS=callMain -s INITIAL_MEMORY=1073741824 -s EXIT_RUNTIME=1 --emrun --pre-js pre.js

At some point the best option is probably to expose some kind of API to JS, depending on what the actual use cases are.

unicomp21 commented 1 year ago

@Vogtinator @jart you guys are amazing. This is great!

ghost commented 1 year ago

Well done guys! Very Cool!

Rucadi commented 1 year ago

This is awesome, WILL hack with this :D

pannous commented 1 year ago

please someone host their compiled blink.wasm !

ghost commented 1 year ago

please someone host their compiled blink.wasm !

Hmmmmm........

Vogtinator commented 1 year ago

I pushed a github workflow for emscripten HTML builds: https://github.com/Vogtinator/blink/actions/workflows/emscripten.yml

To build for node instead, uncomment the NODEFS mounting in blink/web.h and build with emcc -O2 o//blink/blink.o o//blink/blink.a -lm -pthread -lrt -o o//blink/blink -s INITIAL_MEMORY=1073741824 -s EXIT_RUNTIME=1 -lnodefs.js.

unicomp21 commented 1 year ago

@derekcollison I'm wondering if there could be implications/synergy here for nats? ie tunneling tcp syscalls, etc.?

derekcollison commented 1 year ago

@derekcollison I'm wondering if there could be implications/synergy here for nats? ie tunneling tcp syscalls, etc.?

How so? Maybe running a nats-server in the browser?

unicomp21 commented 1 year ago

Yes, or the networking layer for many vm's running in browsers?

trungnt2910 commented 1 year ago

please someone host their compiled blink.wasm !

You might want to try this, based on Vogtinator's workflow.

jart commented 1 year ago

Tweeted https://mobile.twitter.com/JustineTunney/status/1613895681038770182

@trungnt2910 @Vogtinator Would you both be interested in upstreaming your work? I've just added support for GitHub Actions today. We could add a WASM workflow for example.

One thing I'd especially like to see, is some kind of ANSI code support, so we can render the Blinkenlights TUI in the browser so that people don't need to run it locally to use it.

Vogtinator commented 1 year ago

One thing I'd especially like to see, is some kind of ANSI code support, so we can render the Blinkenlights TUI in the browser so that people don't need to run it locally to use it.

Getting that to work might not be trivial. emscripten can't do any long-running work on the main thread as it blocks the browser, so waiting for input in native code just does not work. Either the TUI would have to be rewritten to work async or it has to run in a web worker and somehow communicate with the main thread for IO.

trungnt2910 commented 1 year ago

@jart

@trungnt2910 @Vogtinator Would you both be interested in upstreaming your work? I've just added support for GitHub Actions today. We could add a WASM workflow for example.

Before that, there's a problem: After running any binary, the web page must be reloaded in order to run it another time.

See this commit for my workaround. The problem here is, in memorymalloc.c, it leaks memory:

void FreeBig(void *p, size_t n) {
  if (!p) return;
#if !defined(__EMSCRIPTEN__)
  unassert(!munmap(p, GetBigSize(n)));
#endif
}

If you enable all logging, you can see that it comes from Exec, then FreeMachine, FreeSystem, and reaches FreeBig here. The block of memory that it tries to munmap is perfectly valid, but for some reason the Emscripten runtime always returns EINVAL.

jart commented 1 year ago

Getting that to work might not be trivial. emscripten can't do any long-running work on the main thread as it blocks the browser, so waiting for input in native code just does not work. Either the TUI would have to be rewritten to work async or it has to run in a web worker and somehow communicate with the main thread for IO.

Can it just call setcontext() or something on entry to read(0, ...) to save the CPU state and wait for the JavaScript code to resume it with the result of a keystroke?

trungnt2910 commented 1 year ago

Hmm, I'm testing a simple Linux "Hello World" x86_64 assembly program (a plain old statically linked ELF file, not any of the special formats here) on the live page, and it always fails:

I2023-01-13T21:52:53.783000:blink/throw.c:91:42 SEGMENTATION FAULT AT ADDRESS 0
     PC 401000 add %al,(%rax)
     AX 0000000000000000  CX 0000000000000000  DX 0000000000000000  BX 0000000000000000
     SP 00000000f7fffe70  BP 0000000000000000  SI 0000000000000000  DI 0000000000000000
     R8 0000000000000000  R9 0000000000000000 R10 0000000000000000 R11 0000000000000000
    R12 0000000000000000 R13 0000000000000000 R14 0000000000000000 R15 0000000000000000
     FS 0000000000000000  GS 0000000000000000 OPS 3                JIT 0               
    executable
    000000000000 000000401000 _start
I2023-01-13T21:52:53.817000:blink/blink.c:66:42 terminating due to signal SIGSEGV
Aborted(native code called abort())

turbolent commented 1 year ago

Would it be possible to get a WASI or non-Emscripten build, i.e. one that does not require a JS runtime?

jart commented 1 year ago

@trungnt2910 could you post a comment with your hello world binary attached? The one we're using is third_party/cosmo/tinyhello.elf

trungnt2910 commented 1 year ago

@jart

@trungnt2910 could you post a comment with your hello world binary attached? The one we're using is third_party/cosmo/tinyhello.elf

https://github.com/trungnt2910/HelloElf/blob/master/HelloElf/hello

Here, this one simply writes "Hello World" through a raw write syscall and exits immediately.

trungnt2910 commented 1 year ago

The binary I mentioned above is compiled using nasm on Linux, following the guide here.

trungnt2910 commented 1 year ago

Oops, I sent the wrong binary, IIRC the one mentioned above makes an additional getpid call. This one is simpler but still triggers the same issue.

hello.zip

jart commented 1 year ago

That hello binary runs fine under o//blink/blink locally. It appears to only be an issue under wasm.

For starters, we really should say g_high.enabled = false on the WASM build, so it doesn't emit those ANSI codes.

As for the error, it appears ThrowSegmentationFault() is being called when it executes this opcode, which is the very first instruction in your program.

  401000:       b8 01 00 00 00          mov    $0x1,%eax

It should not be possible for that instruction to segfault. The opcode b8 runs OpMovZvqpIvqp which doesn't have any code paths that would lead it to throw a segfault. It's possible there's something in GeneralDispatch that's throwing it. Do you know if it's possible to have WASM print a backtrace of Blink's code?

jart commented 1 year ago

Do we know if Blink's JIT is working under WASM? Someone on Hacker News is asking about it. https://news.ycombinator.com/item?id=34368513 We haven't explicitly disabled JIT for Emscripten, and it's able to run programs that have loops, such as third_party/cosmo/loopy.elf until they exit(). So I'd assume it is in fact running JIT'd code under WASM, even though this blog post (https://wingolog.org/archives/2022/08/18/just-in-time-code-generation-within-webassembly) seems to claim that isn't possible because WASM is supposedly a Harvard Architecture. Could it really be the case that blink/jit.c was designed so perfectly that it just works? Or does WASM impose a W^X invariant that's causing the JIT to kick the can down the road until a full block can be mprotect()'d, at which point it's doomed to fail?

trungnt2910 commented 1 year ago

Do you know if it's possible to have WASM print a backtrace of Blink's code?

I can see you have a PrintBacktrace function, should I insert this into the ThrowSegmentationFault to find out?

tkchia commented 1 year ago

Hello @jart,

It should not be possible for that instruction to segfault. The opcode b8 runs OpMovZvqpIvqp which doesn't have any code paths that would lead it to throw a segfault. It's possible there's something in GeneralDispatch that's throwing it.

I see that the error message dump provided by @trungnt2910 lists the instruction as add %al,(%rax), for some reason, rather than the correct mov $0x1,%eax. Thank you!
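
For context, add %al,(%rax) is how two zero bytes (00 00) decode on x86-64, so the dump is consistent with the emulator fetching zero-filled memory at 0x401000 rather than the program's own bytes. A minimal sketch of that observation follows. The two-opcode decoder and its names (InsnKind, DecodeFirstInsn) are hypothetical and are not Blink's decoder.

```c
#include <stddef.h>
#include <stdint.h>

enum InsnKind {
  INSN_UNKNOWN,
  INSN_MOV_EAX_IMM32,   // b8 id    mov $imm32,%eax
  INSN_ADD_AL_MEM_RAX,  // 00 00    add %al,(%rax)
};

// Classify the first instruction in code[0..n). For the mov form, the
// little-endian 32-bit immediate is stored through imm when imm is non-null.
enum InsnKind DecodeFirstInsn(const uint8_t *code, size_t n, uint32_t *imm) {
  if (n >= 5 && code[0] == 0xb8) {
    if (imm) {
      *imm = (uint32_t)code[1] | (uint32_t)code[2] << 8 |
             (uint32_t)code[3] << 16 | (uint32_t)code[4] << 24;
    }
    return INSN_MOV_EAX_IMM32;
  }
  // Opcode 00 is ADD r/m8,r8. ModRM 00 selects (%rax) as the destination
  // and %al as the source, which is why zeroed memory shows up this way.
  if (n >= 2 && code[0] == 0x00 && code[1] == 0x00) {
    return INSN_ADD_AL_MEM_RAX;
  }
  return INSN_UNKNOWN;
}
```

Feeding it the expected bytes b8 01 00 00 00 gives the mov with an immediate of 1, while an all-zero buffer gives the add seen in the dump.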

jart commented 1 year ago

I can see you have a PrintBacktrace function, should I insert this into the ThrowSegmentationFault to find out?

That's unlikely to work. PrintBacktrace only works on very few platforms, e.g. asan, ubsan, and darwin's version of libunwind. The Emscripten platform would need to provide an API for doing backtraces that we'd integrate with.

tlively commented 1 year ago

Do we know if Blink's JIT is working under WASM? Someone on Hacker News is asking about it.

Wasm functions and instructions exist entirely separately from Wasm's addressable memory, so there's no way for the JIT to be working unless you're using the workarounds described in Wingo's blog. That said, I have no idea how Blink's JIT works, so maybe it's gracefully falling back to something that still works under Wasm?

trungnt2910 commented 1 year ago

Could it really be the case that blink/jit.c was designed so perfectly that it just works?

I doubt it, even .NET's JIT doesn't work under WASM. And your code doesn't seem to have anything related to generating and importing new WASM modules, so the method in that blog post doesn't apply here either.

Or does WASM impose a W^X invariant that's causing the JIT to kick the can down the road until a full block can be mprotect()'d, at which point it's doomed to fail?

Theoretically, yes, as mentioned on the above comment: "Wasm functions and instructions exist entirely separately from Wasm's addressable memory".

But the curious thing is, Emscripten stubs the mprotect call and returns a success status code! So blink is supposed to be tricked into believing that the JIT succeeded. I don't know how it magically proceeds after that.

jart commented 1 year ago

In that case your PR should update https://github.com/jart/blink/blob/master/blink/builtin.h#L223 to add !defined(__EMSCRIPTEN__).

trungnt2910 commented 1 year ago

In that case your PR should update https://github.com/jart/blink/blob/master/blink/builtin.h#L223 to add !defined(__EMSCRIPTEN__).

I don't think so. Now that I see your file, JIT support, instead of being blacklisted, is actually whitelisted only for x86_64 and ARM64. WASM, a separate instruction set, is not whitelisted, and JIT is therefore disabled.
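
The whitelist approach described here can be sketched with standard compiler-defined macros. This is not a copy of builtin.h: SKETCH_HAVE_JIT and IsJitCompiledIn are hypothetical names, and only the predefined macros __x86_64__ and __aarch64__ are real.

```c
// Enable the JIT only on host architectures it knows how to emit code for.
// A wasm32 target defines neither macro, so it falls through to 0 without
// needing an explicit !defined(__EMSCRIPTEN__) check.
#if defined(__x86_64__) || defined(__aarch64__)
#define SKETCH_HAVE_JIT 1
#else
#define SKETCH_HAVE_JIT 0
#endif

// Reports whether JIT support was compiled into this build.
int IsJitCompiledIn(void) {
  return SKETCH_HAVE_JIT;
}
```

The advantage of a whitelist over a blacklist is that every new or unknown target gets the safe interpreter-only behavior by default.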

jart commented 1 year ago

Oh that makes sense! Thank you.

fc59283 commented 1 year ago

Is it possible to do it with the Zig compiler? Can Cosmopolitan LibC (or other jart projects) be called from Zig code?

Vogtinator commented 1 year ago

Getting that to work might not be trivial. emscripten can't do any long-running work on the main thread as it blocks the browser, so waiting for input in native code just does not work. Either the TUI would have to be rewritten to work async or it has to run in a web worker and somehow communicate with the main thread for IO.

Can it just call setcontext() or something on entry to read(0, ...) to save the CPU state and wait for the JavaScript code to resume it with the result of a keystroke?

Yes, that's possible: https://emscripten.org/docs/porting/asyncify.html
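
To make the control flow concrete, here is a native illustration of the "save state on read, resume on keystroke" idea using POSIX ucontext. It is not Emscripten code and does not use Asyncify; it only shows the shape of the handoff between an event loop and a parked guest. All names (ReadKey, GuestMain, StartGuest, DeliverKey, GuestLine, GuestDone) are hypothetical.

```c
#include <stddef.h>
#include <ucontext.h>

static ucontext_t g_host;              // context of the event loop
static ucontext_t g_guest;             // context of the emulated program
static char g_guest_stack[64 * 1024];  // private stack for the guest context
static int g_pending_key = -1;         // -1 means no keystroke is waiting
static int g_guest_done;
static char g_line[16];  // zero-initialized, so it stays NUL-terminated
static size_t g_len;

// Stand-in for read(0, ...): parks the guest until a key has been delivered.
static int ReadKey(void) {
  while (g_pending_key < 0) {
    swapcontext(&g_guest, &g_host);  // save guest state, return to the host
  }
  int key = g_pending_key;
  g_pending_key = -1;
  return key;
}

// Toy guest program: collects keys into g_line until newline.
static void GuestMain(void) {
  while (g_len < sizeof(g_line) - 1) {
    int key = ReadKey();
    if (key == '\n') break;
    g_line[g_len++] = (char)key;
  }
  g_guest_done = 1;
}  // returning resumes g_host through uc_link

// Start the guest and run it until it first blocks in ReadKey().
void StartGuest(void) {
  getcontext(&g_guest);
  g_guest.uc_stack.ss_sp = g_guest_stack;
  g_guest.uc_stack.ss_size = sizeof(g_guest_stack);
  g_guest.uc_link = &g_host;
  makecontext(&g_guest, GuestMain, 0);
  swapcontext(&g_host, &g_guest);
}

// Called by the event loop when a keystroke arrives: resume the guest.
void DeliverKey(int key) {
  if (g_guest_done) return;
  g_pending_key = key;
  swapcontext(&g_host, &g_guest);
}

const char *GuestLine(void) { return g_line; }
int GuestDone(void) { return g_guest_done; }
```

The host never blocks: StartGuest and DeliverKey both return as soon as the guest parks again, which is the property a browser main thread needs.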

That hello binary runs fine under o//blink/blink locally. It appears to only be an issue under wasm.

For starters, we really should say g_high.enabled = false on the WASM build, so it doesn't emit those ANSI codes.

As for the error, it appears ThrowSegmentationFault() is being called when it executes this opcode, which is the very first instruction in your program.

  401000:       b8 01 00 00 00          mov    $0x1,%eax

It should not be possible for that instruction to segfault. The opcode b8 runs OpMovZvqpIvqp which doesn't have any code paths that would lead it to throw a segfault. It's possible there's something in GeneralDispatch that's throwing it. Do you know if it's possible to have WASM print a backtrace of Blink's code?

I had a quick look. I can reproduce the issue with the hello binary here. My guess was that it's an issue with the ELF loader. I enabled logging and it looks like it indeed just forgets to load data from the file.

I2023-01-13T22:08:28.980000:blink/map.c:50:0 (mem) big created 16m map [0xe0000,0x10e1000)
I2023-01-13T22:08:28.980000:blink/memorymalloc.c:247:0 (thr) new machine thread pid=42 tid=42
I2023-01-13T22:08:28.981000:blink/map.c:50:42 (mem) loader created 8872 map [0x10f0000,0x10f22a8)
I2023-01-13T22:08:28.981000:blink/map.c:50:42 (mem) big created 256k map [0x1110000,0x1150000)
I2023-01-13T22:08:28.982000:blink/loader.c:61:42 (elf) PROGRAM HEADER
I2023-01-13T22:08:28.982000:blink/loader.c:62:42 (elf)   vaddr = 400000
I2023-01-13T22:08:28.982000:blink/loader.c:63:42 (elf)   memsz = e8
I2023-01-13T22:08:28.982000:blink/loader.c:64:42 (elf)   offset = 0
I2023-01-13T22:08:28.982000:blink/loader.c:65:42 (elf)   filesz = e8
I2023-01-13T22:08:28.982000:blink/loader.c:66:42 (elf)   pagesize = 10000
I2023-01-13T22:08:28.982000:blink/loader.c:67:42 (elf)   start = 400000
I2023-01-13T22:08:28.982000:blink/loader.c:68:42 (elf)   end = 410000
I2023-01-13T22:08:28.982000:blink/loader.c:69:42 (elf)   skew = 0
I2023-01-13T22:08:28.982000:blink/loader.c:156:42 (elf) alloc 400000-410000
I2023-01-13T22:08:28.983000:blink/memorymalloc.c:458:42 (mem) reserving virtual [0x400000,0x410000) w/ 64 kb
I2023-01-13T22:08:28.983000:blink/loader.c:164:42 (elf) copy 400000-4000e8 from 0-e8
I2023-01-13T22:08:28.983000:blink/loader.c:61:42 (elf) PROGRAM HEADER
I2023-01-13T22:08:28.983000:blink/loader.c:62:42 (elf)   vaddr = 401000
I2023-01-13T22:08:28.983000:blink/loader.c:63:42 (elf)   memsz = 27
I2023-01-13T22:08:28.983000:blink/loader.c:64:42 (elf)   offset = 1000
I2023-01-13T22:08:28.983000:blink/loader.c:65:42 (elf)   filesz = 27
I2023-01-13T22:08:28.983000:blink/loader.c:66:42 (elf)   pagesize = 10000
I2023-01-13T22:08:28.983000:blink/loader.c:67:42 (elf)   start = 400000
I2023-01-13T22:08:28.983000:blink/loader.c:68:42 (elf)   end = 410000
I2023-01-13T22:08:28.983000:blink/loader.c:69:42 (elf)   skew = 1000
I2023-01-13T22:08:28.983000:blink/loader.c:61:42 (elf) PROGRAM HEADER
I2023-01-13T22:08:28.983000:blink/loader.c:62:42 (elf)   vaddr = 402000
I2023-01-13T22:08:28.983000:blink/loader.c:63:42 (elf)   memsz = e
I2023-01-13T22:08:28.983000:blink/loader.c:64:42 (elf)   offset = 2000
I2023-01-13T22:08:28.983000:blink/loader.c:65:42 (elf)   filesz = e
I2023-01-13T22:08:28.983000:blink/loader.c:66:42 (elf)   pagesize = 10000
I2023-01-13T22:08:28.983000:blink/loader.c:67:42 (elf)   start = 400000
I2023-01-13T22:08:28.983000:blink/loader.c:68:42 (elf)   end = 410000
I2023-01-13T22:08:28.983000:blink/loader.c:69:42 (elf)   skew = 2000
I2023-01-13T22:08:28.983000:blink/memorymalloc.c:458:42 (mem) reserving virtual [0xf7800000,0xf8000000) w/ 8192 kb
I2023-01-13T22:08:28.986000:blink/machine.c:2176:42 (asm) decoding [add %al,(%rax)] at address 401000
I2023-01-13T22:08:28.987000:blink/throw.c:91:42 SEGMENTATION FAULT AT ADDRESS 0
         PC 401000 add %al,(%rax)

This is probably due to the 64KiB page size in WASM. FWICT "pages" in wasm refer only to the granularity at which the memory buffer can be grown and do not really influence the code running inside. So we can just pretend that it's 4KiB as a workaround, and we get hello world!

diff --git a/blink/map.c b/blink/map.c
index 8fdbc45..0171b6a 100644
--- a/blink/map.c
+++ b/blink/map.c
@@ -28,6 +28,10 @@
 #include "blink/util.h"

 long GetSystemPageSize(void) {
+#ifdef __EMSCRIPTEN__
+  return 4096;
+#endif
+
   long z;
   unassert((z = sysconf(_SC_PAGESIZE)) > 0);
   unassert(IS2POW(z));
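
The page-size effect can be checked with a little arithmetic. The sketch below is not Blink's loader. It only recomputes the start, end, and skew values printed in the log above from vaddr, memsz, and pagesize, using hypothetical names (SegmentSpan, ComputeSpan, SpansOverlap).

```c
#include <stdint.h>

struct SegmentSpan {
  uint64_t start;  // vaddr rounded down to a page boundary
  uint64_t end;    // vaddr + memsz rounded up to a page boundary
  uint64_t skew;   // offset of vaddr inside its first page
};

// pagesize must be a power of two.
struct SegmentSpan ComputeSpan(uint64_t vaddr, uint64_t memsz,
                               uint64_t pagesize) {
  struct SegmentSpan s;
  s.start = vaddr & ~(pagesize - 1);
  s.end = (vaddr + memsz + pagesize - 1) & ~(pagesize - 1);
  s.skew = vaddr - s.start;
  return s;
}

// True when two half-open page ranges [start,end) share any page.
int SpansOverlap(struct SegmentSpan a, struct SegmentSpan b) {
  return a.start < b.end && b.start < a.end;
}
```

With pagesize 0x10000, the headers at 0x400000, 0x401000, and 0x402000 all round to [0x400000,0x410000) with skews 0, 0x1000, and 0x2000, matching the log, where only the first header is followed by an alloc and a copy line. With pagesize 0x1000, each header gets its own non-overlapping page.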

I can see you have a PrintBacktrace function, should I insert this into the ThrowSegmentationFault to find out?

That's unlikely to work. PrintBacktrace only works on very few platforms, e.g. asan, ubsan, and darwin's version of libunwind. The Emscripten platform would need to provide an API for doing backtraces that we'd integrate with.

You can #include <emscripten.h> and use EM_ASM(console.trace()); to print a backtrace.
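
A small wrapper makes that suggestion usable from shared code. This is a sketch under the assumption that the same source is also compiled natively; PrintHostBacktrace is a hypothetical name, and the non-Emscripten branch is only a placeholder so the function exists on every platform.

```c
#include <stdio.h>
#ifdef __EMSCRIPTEN__
#include <emscripten.h>
#endif

// Prints a JavaScript stack trace when built with Emscripten.
// Returns 1 if a trace was requested, 0 if the platform has no support here.
int PrintHostBacktrace(void) {
#ifdef __EMSCRIPTEN__
  EM_ASM(console.trace());  // the call suggested above
  return 1;
#else
  fputs("backtrace not available in this build\n", stderr);
  return 0;
#endif
}
```

Calling it from the segfault path would show which Blink function raised the fault, provided the build keeps function names (for example by linking with -g).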