pharo-project / pharo-vm

This is the VM used by Pharo
http://pharo.org

WASM support #577

Open pavel-krivanek opened 1 year ago

pavel-krivanek commented 1 year ago

I spent a few minutes playing with building for Emscripten again, and this is what I found.

As a reminder, there is the old attempt named EmPharo (https://github.com/pavel-krivanek/EmPharo/tree/master/src). In it, I was exploring whether it is possible to compile the VM with Emscripten at all. At that time there were some platform limitations, such as missing atomic operations, the heartbeat, and long jumps. With some ugly hacks that bypassed them, I was at least able to compile the VM, but it was not actually able to run the image. One of the issues with EmPharo is that it extracted and modified sources from the image, so it was almost impossible to keep it in sync with the vanilla VM, which had become a quickly moving target.

Now I tried to check whether we can build on it again, this time on top of the current source tree. It is not that hard to start: just install the Emscripten SDK, see https://emscripten.org/docs/getting_started/downloads.html (and related pages).

I created a branch, https://github.com/pavel-krivanek/pharo-vm/tree/EmPharo-experimental, that only adds a config file and a build script in the root folder: https://github.com/pavel-krivanek/pharo-vm/blob/EmPharo-experimental/build-em.sh

It first requires building the StackVM on Linux the standard way, which generates the required sources:

cmake .. -DAPPNAME=Pharo -DVM_EXECUTABLE_NAME=pharo -DFLAVOUR=StackVM -DPHARO_DEPENDENCIES_PREFER_DOWNLOAD_BINARIES=TRUE

Then I tried to compile the files listed in the original EmPharo. Most of them compile without issues; some (which I commented out) have problems, such as the missing atomics support mentioned above. This may be solvable.

But the compilation of the interpreter is another story. Clang detects a lot of type conversion issues, which should probably be handled better in the generated sources, but they can be suppressed with the -Wno-int-conversion option. What is worse is that I hit this new error:

fatal error: error in backend: data symbols must have a size set with .size: LbytecodeDispatch
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /home/krivanek/emscripten/emsdk/upstream/bin/clang -target wasm32-unknown-emscripten -fignore-exceptions -fvisibility=default -mllvm -combiner-global-alias-analysis=false -mllvm -enable-emscripten-sjlj -mllvm -disable-lsr -D__EMSCRIPTEN_SHARED_MEMORY__=1 -DEMSCRIPTEN -Werror=implicit-function-declaration --sysroot=/home/krivanek/emscripten/emsdk/upstream/emscripten/cache/sysroot -Xclang -iwithsysroot/include/fakesdl -Xclang -iwithsysroot/include/compat -DEMCC=1 -DLSB_FIRST=1 -DDEBUG=1 -I./em -I./src -I./include -I./build/generated/64/vm/include -I./extracted/vm/include/common -I./extracted/vm/include/unix -I./include/pharovm -c -w -pthread -Wno-int-conversion -matomics -mbulk-memory ./build/generated/64/vm/src/gcc3x-interp.c -o ./build-em/gcc3x-interp.o
1

Only the compilation of gcc3x-interp.c reports this, and I have no idea how to proceed.

So, if anyone is interested, please try to reproduce the build on your own. Maybe it is just an issue with my setup. Then we can investigate further.
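Editorial sketch, not taken from the VM sources: the -Wno-int-conversion remark above refers to implicit conversions between integers and pointers. One way generated code could avoid the warning is to route every such conversion through explicit casts, as below. The names oop_t, pointerFromOop, oopFromPointer and readByteAt are hypothetical and only illustrate the pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the VM's object-pointer integer type. */
typedef intptr_t oop_t;

/* Integer -> pointer: the explicit cast states the intent and
 * does not trigger -Wint-conversion. */
static inline void *pointerFromOop(oop_t oop)
{
    return (void *)oop;
}

/* Pointer -> integer, again with an explicit cast. */
static inline oop_t oopFromPointer(const void *pointer)
{
    return (oop_t)(intptr_t)pointer;
}

/* Reads one byte at the given offset from the address stored in oop. */
static inline unsigned char readByteAt(oop_t oop, size_t offset)
{
    assert(oop != 0);
    /* Writing `const unsigned char *bytes = oop;` here instead would be
     * the kind of implicit conversion that clang reports. */
    const unsigned char *bytes = pointerFromOop(oop);
    return bytes[offset];
}
```

Whether the generated interpreter sources can be changed this way is an open question; the sketch only shows what "handled better in the generated sources" could look like.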

guillep commented 1 year ago

What if you disable GNUISATION? https://github.com/pharo-project/pharo-vm/blob/dc9362d37672db1d9f0f1bc2f736330d28473dfb/cmake/vmmaker.cmake#L27

Maybe LLVM's WASM backend does not like GNU extensions

pavel-krivanek commented 1 year ago

Same results for me.

pavel-krivanek commented 1 year ago

I can compile the interpreter if I remove the interpreter switch from it, which means we can try to locate the problematic construct that is causing the error. Maybe some goto or so...

pavel-krivanek commented 1 year ago

I found the problem. The issue is the definition of VM_LABEL: if I define it in interp.c as #define VM_LABEL(foo) ((void)0), the interpreter compiles without other issues.
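Editorial sketch of the workaround described above, under stated assumptions. The symbol in the error (LbytecodeDispatch) suggests that VM_LABEL normally emits a named label per dispatch point, which the WASM backend rejects. The sketch replaces it with a no-op; in the real sources the non-Emscripten branch would keep the existing definition. The toy interpreter (runToy and its opcodes) is hypothetical and only demonstrates that the no-op macro leaves the switch-based dispatch unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* In this standalone sketch VM_LABEL is always a no-op. In the VM, only
 * the Emscripten build would take this definition. */
#if defined(__EMSCRIPTEN__) || !defined(VM_LABEL)
# undef VM_LABEL
# define VM_LABEL(foo) ((void)0)
#endif

/* Hypothetical opcodes for a toy stack machine. */
enum { OP_PUSH1, OP_ADD, OP_HALT };

/* Runs the toy program and returns the top of the stack (0 if the stack
 * is empty, -1 on an unknown opcode). */
int runToy(const unsigned char *code, size_t length)
{
    int stack[16];
    int sp = 0;

    for (size_t pc = 0; pc < length; pc++) {
        VM_LABEL(bytecodeDispatch); /* expands to ((void)0); */
        switch (code[pc]) {
        case OP_PUSH1:
            assert(sp < 16);
            stack[sp++] = 1;
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_HALT:
            return sp > 0 ? stack[sp - 1] : 0;
        default:
            return -1;
        }
    }
    return sp > 0 ? stack[sp - 1] : 0;
}
```

A program consisting of OP_PUSH1, OP_PUSH1, OP_ADD, OP_HALT leaves 2 on the stack, with or without the label macro.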

pavel-krivanek commented 1 year ago

I was able to at least build it now, with very few modifications. It is able to start the VM (without an image). When built with pthread, it requires special serving because of browser security restrictions.

[image]

pavel-krivanek commented 1 year ago

This is how it fails now when trying to run an image.

[image]

The generated function setMaxOldSpaceSize is compiled as unreachable code, but the other module (client.c) tries to call it anyway.

pavel-krivanek commented 1 year ago

Status update

With the previous problem solved, it currently fails on the heap allocation:

[ERROR] 2023-04-14 11:08:36.000 error (./src/debug.c:45):Failed to allocate memory for the heap
pharo.js:791 Uncaught RuntimeError: Aborted(native code called abort())
    at abort (pharo.js:791:10)
    at _abort (pharo.js:4626:2)
    at error (pharo.wasm:0x23b3)
    at allocateHeap (pharo.wasm:0xab09b)
    at readImageNamed (pharo.wasm:0xa8371)
    at loadPharoImage (pharo.wasm:0x17f6)
    at vm_init (pharo.wasm:0x16f5)
    at runVMThread (pharo.wasm:0x1f3a)
    at runOnMainThread (pharo.wasm:0x1ec8)
    at vm_main_with_parameters (pharo.wasm:0x1e2b)

pavel-krivanek commented 1 year ago

If you want to play with it, I recommend installing the "C/C++ DevTools Support (DWARF)" extension in Chrome. Installing it is not enough; it must also be enabled in Experiments, see: https://developer.chrome.com/blog/wasm-debugging-2020/

Then you get a usable debugging environment.

image

pavel-krivanek commented 1 year ago

The problem is that the mmap-based memory allocation the VM uses is not WASM friendly. The following code (an extracted use case) works in native code but not in WASM, where mmap returns MAP_FAILED:

#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

#define MAP_PROT  (PROT_READ | PROT_WRITE)
#define MAP_FLAGS (MAP_ANON | MAP_PRIVATE)

int main(void)
{
    void *heap;
    /* Base address the VM asks for; passed to mmap as a hint. */
    uintptr_t desiredBaseAddressAligned = 0x20000000;
    size_t heapLimit = 4456448;

    printf("Allocating...\n");

    /* Print pointers with %p so the output is also correct on 64-bit hosts. */
    printf("MAP_FAILED: %p\n", MAP_FAILED);

    heap = mmap((void *) desiredBaseAddressAligned, heapLimit, MAP_PROT, MAP_FLAGS, -1, 0);
    printf("Result:     %p\n", heap);

    printf("Finished\n");

    return 0;
}

syrel commented 1 year ago

Something along these lines might work: (taken from https://github.com/maenu/opensmalltalk-vm/tree/wasm-1)

#ifdef EMSCRIPTEN
    char *allocSpacer;
#endif

    address = (char *)roundUpToPage((unsigned long)minAddress);
    bytes = roundUpToPage(size);
    delta = max(pageSize,1024*1024);

#ifdef EMSCRIPTEN
    // take it or leave it
    alloc = mmap(0, bytes, PROT_READ | PROT_WRITE /*| PROT_EXEC*/,
                 MAP_ANON | MAP_PRIVATE, -1, 0);
    if (alloc == MAP_FAILED) {
        mmapErrno = errno;
        return 0;
    }
    *allocatedSizePointer = bytes;
    return alloc;
#else

pavel-krivanek commented 1 year ago

If desiredBaseAddressAligned is 0, as in Squeak, the allocation succeeds. The question is how much the VM design relies on a fixed base address. @guillep?
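Editorial sketch, assuming the VM can tolerate a heap that does not start at the requested address (which is exactly the open question above): try the desired base address first and fall back to letting mmap choose. The helper name allocateHeapWithHint and its gotDesiredBase out-parameter are hypothetical and not part of the VM sources.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANON under strict -std= modes on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical helper: maps `size` bytes of anonymous read/write memory.
 * First passes desiredBase as a hint; if that fails (as reported for WASM),
 * retries with no hint. Returns NULL when both attempts fail. If
 * gotDesiredBase is non-NULL, it is set to 1 when the mapping starts at
 * desiredBase and to 0 otherwise. */
void *allocateHeapWithHint(void *desiredBase, size_t size, int *gotDesiredBase)
{
    const int prot = PROT_READ | PROT_WRITE;
    const int flags = MAP_ANON | MAP_PRIVATE;
    void *heap;

    assert(size > 0);

    heap = mmap(desiredBase, size, prot, flags, -1, 0);
    if (heap == MAP_FAILED && desiredBase != NULL) {
        /* Take whatever address the platform gives us. */
        heap = mmap(NULL, size, prot, flags, -1, 0);
    }
    if (heap == MAP_FAILED) {
        return NULL;
    }
    if (gotDesiredBase != NULL) {
        *gotDesiredBase = (heap == desiredBase);
    }
    return heap;
}
```

A caller would pass (void *)(uintptr_t)0x20000000 and 4456448 as in the test program above, then check gotDesiredBase to decide whether any code that assumes the fixed base needs adjusting.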