jprendes / emception

Run Emscripten in the browser

Request: Provide Release of Prebuilt Wasm Binaries #12

Closed: CounterMatter closed this issue 1 year ago

CounterMatter commented 1 year ago

Hi,

I found your very cool Emscripten in a browser demo. Providing a release that includes the prebuilt, unbundled .wasm binaries and their corresponding .js scripts would make it convenient for other people to use them in their own projects. Specifically, clang.wasm and lld.wasm would be useful.

Thanks.

CounterMatter commented 1 year ago

I'm documenting what I found for anyone in a similar boat. Sorry for the long post. I managed to build these on Windows (with a few tweaks, see below). I also created an index.html (inside extras.zip) that demonstrates how to use them. They should be useful for people who, like me, are crazy enough to want lower-level control over creating a wasm binary in a browser. The wasm-transform tool would probably help reduce the total file size of clang.wasm and lld.wasm, but I didn't know how to use it. The SHAREDFS.json approach is also probably better than what I did; I just kept things as simple as possible (see below).

Attachments: clang.js.gz, clang.wasm.gz, lld.js.gz, lld.wasm.gz, extras.zip

Sorry, I didn't make any scripts. Here are the steps I used to build these on Windows.

  1. Install all prerequisites

I had to install Python 3, CMake, Ninja, LLVM (I think it is needed later for building the 'native' part of LLVM), and Emscripten. It's easiest if they are all on the PATH.

  2. Clone llvm-project

git clone https://github.com/llvm/llvm-project.git

I used commit a64846bee0bb4b4912c8cf6bf018ba5d892065d1, but the relevant code seems pretty stable, so other commits will probably work too.

  3. Patch it

    • Delete the 'if (!inProcess)' branch in clang/lib/Driver/Job.cpp, like the patch in this repo does. If you don't, you'll probably get errors about not being able to execute the process. I think it would be possible to let clang.wasm call itself and lld.wasm, but I didn't do that.
    • Replace wait4 with __syscall_wait4. The easiest way I found was to add 'set(CMAKE_CXX_FLAGS "-Dwait4=__syscall_wait4" CACHE INTERNAL "")' to the top of llvm/CMakeLists.txt. I'm not sure how build-llvm.sh does it.
    • CMake gave an error about "unable to determine host target triple". I tried passing the triple into the cmake call, but the most reliable fix was to overwrite the line that emits that message in llvm/cmake/modules/GetHostTriple.cmake with 'set(value "x86_64")'.
    • Add some linker flags. The easiest way I found was to add this after the add_executable() call in llvm/cmake/modules/AddLLVM.cmake (more later on post.js and post2.js, which are in extras.zip):
      if(EMSCRIPTEN AND "${name}" STREQUAL "clang")
        set_target_properties("clang" PROPERTIES LINK_FLAGS "--post-js=../post.js --closure 1 -s MODULARIZE -s ALLOW_MEMORY_GROWTH -s EXPORT_NAME=createClang -s EXPORTED_RUNTIME_METHODS=[build]")
      endif()
      if(EMSCRIPTEN AND "${name}" STREQUAL "lld")
        set_target_properties("lld" PROPERTIES LINK_FLAGS "--post-js=../post2.js --closure 1 -s MODULARIZE -s ALLOW_MEMORY_GROWTH -s EXPORT_NAME=createLld -s EXPORTED_RUNTIME_METHODS=[link]")
      endif()
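
For anyone who would rather script the wait4 edit than do it by hand, here is a minimal Node.js sketch. The helper name prependOnce and the sample file content are made up for illustration; only the set(...) line itself comes from the step above. It adds the line at the top of the text unless it is already present, so running it twice is harmless.

```javascript
// The exact line from the patch step above.
const WAIT4_LINE = 'set(CMAKE_CXX_FLAGS "-Dwait4=__syscall_wait4" CACHE INTERNAL "")';

// Hypothetical helper: put `line` at the top of `text` unless some line of
// `text` already equals it.
function prependOnce(text, line) {
  return text.split(/\r?\n/).includes(line) ? text : line + "\n" + text;
}

// To apply it to a real checkout (run from the llvm-project root):
//   const fs = require("node:fs");
//   const file = "llvm/CMakeLists.txt";
//   fs.writeFileSync(file, prependOnce(fs.readFileSync(file, "utf8"), WAIT4_LINE));

// Stand-in content, not the real llvm/CMakeLists.txt.
const sample = "cmake_minimum_required(VERSION 3.20.0)\nproject(LLVM)\n";
const once = prependOnce(sample, WAIT4_LINE);
const twice = prependOnce(once, WAIT4_LINE);
console.log(once.startsWith(WAIT4_LINE), once === twice);
```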
  4. Build it

emcmake cmake -S llvm -B build -G Ninja -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD=WebAssembly -DLLVM_ENABLE_PROJECTS="clang;lld" -DLLVM_ENABLE_DUMP=OFF -DLLVM_ENABLE_ASSERTIONS=OFF -DLLVM_ENABLE_EXPENSIVE_CHECKS=OFF -DLLVM_ENABLE_BACKTRACES=OFF -DLLVM_BUILD_TOOLS=OFF -DLLVM_ENABLE_THREADS=OFF -DLLVM_BUILD_LLVM_DYLIB=OFF -DLLVM_INCLUDE_TESTS=OFF
cmake --build build

The output files are in build/bin.

  5. Extra credit

Minify the JavaScript (for some reason, Emscripten leaves a few newlines in its output), gzip the JavaScript and wasm files (if you do this, first update the JavaScript so that it refers to clang.wasm.gz instead of clang.wasm), etc.
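
The rename-then-compress step could be scripted roughly as follows. This is a Node.js sketch under assumptions: the function names are hypothetical, the loader string is an in-memory stand-in for the real clang.js, and it assumes the wasm file name appears in the script as a plain string. It does not cover how the .gz file is decompressed when the page loads it.

```javascript
const zlib = require("node:zlib");

// Hypothetical helper: rewrite references such as "clang.wasm" to
// "clang.wasm.gz", skipping any reference that already ends in ".gz".
function pointLoaderAtGzip(jsSource, wasmName) {
  const escaped = wasmName.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return jsSource.replace(new RegExp(escaped + "(?!\\.gz)", "g"), wasmName + ".gz");
}

// Gzip a string or Buffer at the highest compression level.
function gzip(data) {
  return zlib.gzipSync(data, { level: zlib.constants.Z_BEST_COMPRESSION });
}

// Stand-in for the contents of clang.js.
const loader = 'var wasmFile="clang.wasm";fetch(wasmFile);';
const patched = pointLoaderAtGzip(loader, "clang.wasm");
const packed = gzip(patched);

console.log(patched);
console.log(zlib.gunzipSync(packed).toString("utf8") === patched);
```

The order matters: the script is patched before it is compressed, so the compressed copy already points at the .gz file.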

The post.js and post2.js files each contain a helper function for calling clang.wasm and lld.wasm, respectively, in a generic way. I had to include them in the build because of the way the closure compiler works. I treat 'commandLine' and 'standardInput' as optional input "files", and 'standardOutput' and 'standardError' as output "files". You can also provide arbitrary "files" as inputs and request arbitrary "files" as outputs using their full paths. The "files" are really just strings or Uint8Arrays, depending on whether they are text or binary. Hopefully this is clear from the index.html example. The user is responsible for keeping the intermediate files and passing them between wasm calls, because the virtual filesystem that Emscripten creates is rebuilt from scratch each time the wasm modules run.
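
To illustrate the last point, here is a small sketch of carrying an intermediate file from a compile call to a link call. Everything in it is a stand-in: fakeCompile and fakeLink only imitate the idea of "files in, files out", and their call shape (an object of input "files" plus a list of requested output paths), the paths, and the command lines are assumptions made for this sketch, not the actual interface in post.js and post2.js. See index.html in extras.zip for the real usage.

```javascript
// Stand-in for the clang helper: "compiles" by tagging the source bytes.
function fakeCompile(inputs, wanted) {
  const out = { standardOutput: "", standardError: "" };
  for (const path of wanted) {
    out[path] = new TextEncoder().encode("obj:" + inputs["/main.c"]);
  }
  return out;
}

// Stand-in for the lld helper: "links" by tagging the object bytes.
function fakeLink(inputs, wanted) {
  const out = { standardOutput: "", standardError: "" };
  const objectText = new TextDecoder().decode(inputs["/main.o"]);
  for (const path of wanted) {
    out[path] = new TextEncoder().encode("wasm:" + objectText);
  }
  return out;
}

// Each call starts with an empty virtual filesystem, so the caller keeps
// the object file and hands it to the next call explicitly.
function compileAndLink(source, compile, link) {
  const compiled = compile(
    { commandLine: "-c /main.c -o /main.o", "/main.c": source },
    ["/main.o"]
  );
  const linked = link(
    { commandLine: "/main.o -o /main.wasm", "/main.o": compiled["/main.o"] },
    ["/main.wasm"]
  );
  return linked["/main.wasm"];
}

const wasmBytes = compileAndLink("int main(){return 0;}", fakeCompile, fakeLink);
console.log(new TextDecoder().decode(wasmBytes));
```

Text inputs are passed as strings and binary ones as Uint8Arrays, matching the convention described above.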