crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0

Roadmap to WebAssembly support #12002

Open lbguilherme opened 2 years ago

lbguilherme commented 2 years ago

This is intended as an umbrella issue to organize high-level WebAssembly progress and goals.

Why?

WebAssembly is a new standard for a compilation target that is quickly growing in popularity, not only on the Web. It offers portability to run anywhere with near-native speed (web browsers, cloud servers, embedded devices, plugins, blockchains, etc.), it allows different languages to interoperate in a convenient format, and it is secure and verifiable before execution. A WebAssembly module also starts faster than a Docker container. For more details please refer to this excellent article: Pay attention to WebAssembly by Harshal Sheth.

How?

I believe Crystal should aim to support WebAssembly as a first-class platform both for writing complete applications and for writing plugins for existing applications. By complete application, I mean a Crystal app that interacts primarily through IO operations on sockets or files and has a clear lifetime from start to finalization, like an ordinary process. This would primarily happen with the WASI library interface. By plugin, I mean a Crystal module that imports and exports some functions that can be called by the loader application, interacting with it. Unlike a shared library on native targets (which Crystal isn't good at targeting because the GC and stdlib like to have full control over the process's IO and memory), a WebAssembly module is isolated from the application that loads it, with independent memory and system operations.

Some languages that target WebAssembly very well are:

Many other languages (like Python, Java, or .NET) support WebAssembly as well, some through interpreters. Here is an up-to-date list: https://github.com/appcypher/awesome-wasm-langs.

There are three common targets for WebAssembly:

Additionally, Crystal's stdlib depends on some libraries to be able to fully run: libc, libevent, libgc, libpcre, libgmp, libxml, etc. Those need to be compiled to WebAssembly and then linked to the Crystal app during the build process. It is still unclear if all of them support WebAssembly.

What are the challenges ahead?

The first step was released with Crystal 1.4, a few days ago: targeting WebAssembly and offering a WASI target with a subset of the stdlib working. It is still experimental and needs work to be finalized. See https://github.com/crystal-lang/crystal/pull/10870.

Fibers and concurrency

In WebAssembly the stack is protected and cannot be read or manipulated in any way. In fact, LLVM creates a shadow stack in the main memory to store stack variables whose pointer is taken at some point. For concurrency with Fibers, channels, and non-blocking IO, Crystal needs to keep multiple stacks and switch between them. This simply isn't possible in plain WebAssembly.

Here we can take advantage of the Binaryen Asyncify code transformation pass. It does static code transformations on a compiled WebAssembly file to allow the stack to be manipulated and swapped, at the cost of some performance and code size. The comments at the top of the source file have some nice explanations of how it works.

The Fiber.swapcontext method would be implemented by storing the next Fiber in a global variable and then beginning the unwinding process with Asyncify. The program entry point (fun main) would need to catch this unwinding, stop it, and then start the rewinding of the next Fiber (stored in that global variable). Finally, Fiber.swapcontext of the new Fiber would catch and stop the rewinding process.

The downside is that every exported function will need to be wrapped in some kind of Scheduler.enable do ... end block to allow Fiber swaps to happen inside it.
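To make the control flow concrete, here is a small Python model of the unwind/rewind dance described above. It is only a sketch of the idea, not Asyncify's actual API and not Crystal code: the names Runtime, Fiber, worker and main are made up for illustration, the "instrumentation" that Asyncify would generate is written by hand, and the main function plays the role of the wrapper that every exported function would need.

```python
NORMAL, UNWINDING, REWINDING = "normal", "unwinding", "rewinding"


class Fiber:
    def __init__(self, name, body):
        self.name = name
        self.body = body
        self.buffer = []      # stands in for the Asyncify memory buffer
        self.started = False
        self.done = False
        self.peer = None      # fiber this one switches to (demo only)


class Runtime:
    def __init__(self):
        self.state = NORMAL
        self.next_fiber = None  # the "global variable" holding the next Fiber

    def swapcontext(self, target):
        if self.state == REWINDING:
            # The rewind reached the call site it left from: stop it and
            # let the fiber carry on normally.
            self.state = NORMAL
            return
        # Record where to go, then start unwinding the current stack.
        self.next_fiber = target
        self.state = UNWINDING


def worker(rt, fiber, log):
    # Hand-written "instrumented" prologue: restore locals when rewinding.
    resuming = rt.state == REWINDING
    i = fiber.buffer.pop() if resuming else 0
    while i < 2:
        if not resuming:
            log.append(f"{fiber.name}:{i}")
        resuming = False
        rt.swapcontext(fiber.peer)
        if rt.state == UNWINDING:
            fiber.buffer.append(i)  # spill the local and return to main
            return
        i += 1


def main(rt, fibers, log):
    current = fibers[0]
    while current is not None:
        current.started = True
        current.body(rt, current, log)
        if rt.state == UNWINDING:
            nxt = rt.next_fiber  # unwinding caught here: switch fibers
        else:
            current.done = True  # the body returned normally
            nxt = next((f for f in fibers if not f.done), None)
        if nxt is not None:
            # A fresh fiber starts normally; a suspended one is rewound.
            rt.state = REWINDING if nxt.started else NORMAL
        current = nxt


def demo():
    log = []
    a, b = Fiber("A", worker), Fiber("B", worker)
    a.peer, b.peer = b, a
    main(Runtime(), [a, b], log)
    return log
```

Calling demo() interleaves the two workers, each one resuming exactly where it called swapcontext. The cost mentioned above is visible here too: every call site needs a state check and every live local has to be spilled and restored.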

Garbage Collector

LibGC (bdwgc) can be compiled into WebAssembly, but it requires inspecting the stack to work properly. As we saw before, the stack can't be inspected or manipulated directly. Asyncify's unwind and rewind operations work by storing every stack variable in a memory buffer. Thus all we need to do is to unwind the current Fiber whenever the GC needs to run and then rewind the same Fiber back again. We again need to ensure every exported function is wrapped in a block to handle the unwinding and rewinding of Fibers.
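A rough Python model of why the unwind helps: once the locals have been spilled into the buffer, a conservative collector can treat every word in that buffer as a potential root. The heap layout, the collect function and the asyncify_buffer argument below are all assumptions made for this illustration; bdwgc's real interface looks nothing like this.

```python
def mark(heap, roots):
    """Return every heap address reachable from the candidate roots."""
    marked = set()
    # Conservative scan: any value that happens to be a heap address counts.
    pending = [r for r in roots if r in heap]
    while pending:
        addr = pending.pop()
        if addr in marked:
            continue
        marked.add(addr)
        pending.extend(ref for ref in heap[addr] if ref in heap)
    return marked


def collect(heap, asyncify_buffer):
    # Assumes the current fiber was just unwound, so every live stack
    # variable now sits in asyncify_buffer and can be scanned.
    live = mark(heap, asyncify_buffer)
    for addr in list(heap):
        if addr not in live:
            del heap[addr]  # sweep unreachable objects
    return live


def demo():
    # Hypothetical heap: address -> addresses it references.
    heap = {1: [2], 2: [], 3: [4], 4: [3]}
    spilled_locals = [1, 99]  # 99 is just an integer, not a heap address
    collect(heap, spilled_locals)
    return sorted(heap)
```

In this model demo() keeps objects 1 and 2 and frees the unreachable cycle 3 and 4. After the collection the fiber would be rewound from the same buffer and continue as if nothing had happened.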

There is a proposal to implement a WebAssembly native GC in the works, but I don't expect it to be supported everywhere anytime soon.

Also, a hard-coded maximum memory size needs to be defined, as there is no way to figure out the maximum memory available.

Exceptions

There is an ongoing proposal for WebAssembly Exceptions. It is already supported by the Chromium-based browsers, by Firefox behind a flag in the nightly version and by LLVM's codegen. Unfortunately, it is still not widely used or well documented.

We can:

  1. Exit on exceptions without the ability to catch them (current behavior). Not ideal.
  2. Implement the native wasm exception targeting behind a flag, but the final module will only run on runtimes that support it. Ideal for the future.
  3. Implement exceptions on top of Asyncify again. begin ... rescue can be implemented by unwinding and rewinding the stack while keeping a copy of the memory buffer. Raising can be implemented by unwinding, discarding the current state, and rewinding into the previously saved one, similar to how setjmp / longjmp work. This is very taxing on performance but works everywhere.
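Option 3 can be sketched in the same hand-instrumented Python style. This is a toy model under assumptions, not how Asyncify or Crystal would literally do it: the setjmp and longjmp methods, the string labels in the buffer and the main driver are all hypothetical names chosen for the illustration.

```python
NORMAL, UNWINDING, REWINDING = "normal", "unwinding", "rewinding"


class Runtime:
    def __init__(self):
        self.state = NORMAL
        self.buffer = []        # stands in for the Asyncify memory buffer
        self.saved = None       # copy of the buffer taken at `begin`
        self.action = None      # why we are unwinding: "save" or "restore"
        self.jump_value = None  # the raised value, delivered at `rescue`

    def setjmp(self):
        if self.state == REWINDING:
            # Rewind reached `begin`: resume, reporting any raised value.
            self.state = NORMAL
            value, self.jump_value = self.jump_value, None
            return value
        self.state, self.action = UNWINDING, "save"
        return None

    def longjmp(self, value):
        self.state, self.action = UNWINDING, "restore"
        self.jump_value = value


def risky(rt, log):
    log.append("try")
    rt.longjmp("boom")                     # models `raise`
    if rt.state == UNWINDING:
        rt.buffer.append("risky@longjmp")  # spilled, later discarded
        return
    log.append("unreachable")


def program(rt, log):
    if rt.state == REWINDING:
        rt.buffer.pop()                    # always "program@setjmp" here
    err = rt.setjmp()                      # models `begin`
    if rt.state == UNWINDING:
        rt.buffer.append("program@setjmp")
        return
    if err is None:
        risky(rt, log)
        if rt.state == UNWINDING:
            rt.buffer.append("program@risky")
            return
    else:
        log.append(f"rescued {err}")       # models `rescue`


def main(rt, log):
    program(rt, log)
    while rt.state == UNWINDING:
        if rt.action == "save":
            rt.saved = list(rt.buffer)     # keep a copy of the stack state
        else:
            rt.buffer = list(rt.saved)     # discard and restore the copy
        rt.state = REWINDING
        program(rt, log)


def demo():
    log = []
    main(Runtime(), log)
    return log
```

Even in this tiny model a single begin ... rescue costs one full unwind and rewind before anything is raised, which gives a feel for why this option is so taxing on performance.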

CallStack

We might be able to raise and rescue exceptions, but we are still unable to obtain a nice stack trace from them, as the stack can't be inspected. Currently, the only way to support this is by invoking JavaScript.

Threads

WebAssembly modules can run in multiple threads and support shared memory and atomic operations. But WASI doesn't provide a way to start a thread. Creating threads can only be performed from JavaScript currently.

Signals and Processes

Those aren't supported and don't make much sense with WebAssembly. They can be somewhat emulated with JavaScript if this other process is also a wasm module.

EventLoop

It is likely that libevent2 can be compiled to WASI since wasi-libc implements the poll function. But if that's not the case, then the event loop can be implemented on top of WASI's poll_oneoff function. It supports subscribing for clock events or for a file descriptor to be ready for reads/writes.
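As a sketch of the second approach, here is a minimal event loop in Python built around a single blocking call that accepts clock and file descriptor subscriptions. The poll_oneoff function below is only a stand-in written with select.select on a POSIX host; it mimics the shape of the WASI call (subscriptions in, events out) but not its real signature. Subscription and EventLoop are hypothetical names.

```python
import os
import select
import time


class Subscription:
    def __init__(self, kind, callback, fd=None, deadline=None):
        self.kind = kind          # "clock" or "fd_read"
        self.callback = callback
        self.fd = fd
        self.deadline = deadline  # absolute time.monotonic() value


def poll_oneoff(subs):
    """Stand-in for WASI's poll_oneoff: block until something is ready."""
    reads = [s.fd for s in subs if s.kind == "fd_read"]
    deadlines = [s.deadline for s in subs if s.kind == "clock"]
    timeout = max(0.0, min(deadlines) - time.monotonic()) if deadlines else None
    ready, _, _ = select.select(reads, [], [], timeout)
    now = time.monotonic()
    return [s for s in subs
            if (s.kind == "fd_read" and s.fd in ready)
            or (s.kind == "clock" and s.deadline <= now)]


class EventLoop:
    def __init__(self):
        self.subs = []

    def sleep(self, seconds, callback):
        deadline = time.monotonic() + seconds
        self.subs.append(Subscription("clock", callback, deadline=deadline))

    def wait_readable(self, fd, callback):
        self.subs.append(Subscription("fd_read", callback, fd=fd))

    def run(self):
        # One blocking poll per iteration, then run whatever became ready.
        while self.subs:
            for event in poll_oneoff(self.subs):
                self.subs.remove(event)
                event.callback()


def demo():
    log = []
    r, w = os.pipe()
    loop = EventLoop()
    loop.wait_readable(r, lambda: log.append(os.read(r, 16)))
    loop.sleep(0.01, lambda: os.write(w, b"ping"))
    loop.run()
    os.close(r)
    os.close(w)
    return log
```

In demo() the timer fires first and writes to the pipe, which in turn wakes up the read subscription. In Crystal the callbacks would be resuming Fibers instead.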

Ecosystem interop

Existing shards written in pure Crystal should work unchanged unless they depend on some unimplemented part of the stdlib. Shards that depend on native libraries should work as long as the underlying library can be compiled to WebAssembly as well. New shards that use WebAssembly-specific functionality to interoperate with other languages will likely arise.

Run the standard library spec as a CI step

Spec can already run (depending on mocking Fiber.yield) and it should shed light on which parts of the standard library work and which don't. Running it on the CI will help prevent regressions in what already works once WebAssembly graduates as a supported target.

Fryguy commented 2 years ago

Typo?

-In WebAssembly the stack of protected and can be read or manipulated in any way.
+In WebAssembly the stack is protected and cannot be read or manipulated in any way.

lbguilherme commented 2 years ago

Fixed 😅

zw963 commented 2 years ago

Adding the link to the forum's WASM discussion here for convenience.

https://forum.crystal-lang.org/t/trying-out-wasm-support/4508/1

straight-shoota commented 1 year ago

WASIX is a newly announced superset of WASI with some additional features that we've been missing (e.g. threads and async IO): https://wasix.org/ Will need to investigate how we could utilize it in Crystal.

maxfierke commented 1 year ago

As exciting as WASIX seems (and it looks really cool!), it does appear to be a set of vendor extensions rather than a true standard, so it might be best explored as a shard rather than within the compiler or stdlib itself (at least until parts of it become part of a WASI preview or something, or it's implemented in runtimes other than Wasmer).

Relatedly, I noticed yesterday that Mitchell Hashimoto has started work on an event-loop library that includes support for WASI: https://github.com/mitchellh/libxev