njsmith / posy


WASM support? #17

Open njsmith opened 1 year ago

njsmith commented 1 year ago

This is more of a long-term question, but WASM-cpython is becoming more of a thing, and it's interesting to think about how it would work to support it.

CC @ethanhs @brettcannon @tiran in case you're interested, feel free to hit unsubscribe :-)

Here's a cheatsheet on how to get it running at all: https://wasmlabs.dev/articles/python-wasm32-wasi/

Building pybis

In general, platform tags like linux_arm64 have two parts: the "OS" and the "ISA". WASM itself is an ISA, like arm64; then there are different environments, that are like the host OS. Currently, CPython is targeting two environments: "emscripten", for browsers, and WASI, which is like a regular desktop/server OS but with more sandboxing. So I guess we could have pybis like cpython-3.11.0-emscripten.pybi and cpython-3.11.0-wasi.pybi (or emscripten_wasm and wasi_wasm but that seems redundant since these always use the WASM ISA).

There's also apparently a WASI competitor called WasmEdge, which has better socket support, and the wasmlabs.dev folks above are also building CPython targeting it? So maybe that's yet another platform?

For WASI + WasmEdge, it's normal to expose the host filesystem to the wasm binary, so the distribution layout can be similar to native cpython distributions. So we can probably build regular pybis for these, and for posy's purposes, we can probably treat these as Just Another Platform that is "native" everywhere.

For emscripten, there's some kind of virtual in-memory filesystem and I have no idea how it works.

Running pybis

To run a wasm binary you need a wasm runtime. The one I see people mention most often for WASI is wasmtime, which is written in rust, so it's easy to grab a binary (or even link into posy itself if we're feeling spicy). For WasmEdge it's... WasmEdge. So to some extent, once you have a python.wasm, you just do <runtime> <path to python.wasm> to run it, and posy just needs to know to add that extra step when launching a wasm binary. But! To make life more interesting, there are complications: you also have to tell the runtime what system resources to expose inside the wasm sandbox, so e.g. path mappings, stuff like that. Maybe posy would need special environment configuration keys for wasm environments to control this stuff?
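The "extra step" when launching could look roughly like the sketch below. It only builds the command line; it doesn't run anything. The function name `wasm_launch_argv` and its parameters are made up for illustration and are not posy API. The `run` subcommand and `--dir` flag follow the wasmtime CLI as I understand it; other runtimes (and other wasmtime versions) may spell these differently, so treat the flags as an assumption to check against the runtime's docs.

```python
from __future__ import annotations

import os
from typing import Sequence


def wasm_launch_argv(
    runtime: str,
    wasm_binary: str | os.PathLike[str],
    exposed_dirs: Sequence[str | os.PathLike[str]] = (),
    python_args: Sequence[str] = (),
) -> list[str]:
    """Build `<runtime> run [sandbox flags] <python.wasm> [args]`.

    Hypothetical helper: the flag spelling assumes a wasmtime-style CLI.
    """
    argv = [runtime, "run"]
    # Each host directory must be granted explicitly, otherwise the
    # sandboxed interpreter cannot see its stdlib or installed packages.
    for directory in exposed_dirs:
        argv += ["--dir", os.fspath(directory)]
    argv.append(os.fspath(wasm_binary))
    # Everything after the module path is passed through to python.wasm.
    argv += list(python_args)
    return argv


if __name__ == "__main__":
    print(wasm_launch_argv("wasmtime", "env/python.wasm", ["env/lib"], ["-c", "print(1)"]))
```

The list of exposed directories is exactly the kind of thing a per-environment configuration key would have to feed in.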

For emscripten, it's... a browser I guess? Or maybe node? I don't know how you run an emscripten binary from the CLI, or if emscripten users even want to.

Installing py3-none-any wheels

Probably not a big deal? We'd need to make sure that the unpacked wheel contents are accessible to the wasm environment, which will require passing some appropriate settings to the runtime, but I assume that's more-or-less straightforward.

Installing binary wheels

I have no idea how this would work. Can you even dlopen from wasm? I know pyodide has builds of numpy etc., so there's some way to support extensions, but are they built and distributed separately or do they have to be compiled into the interpreter itself?

Building wheels from sdists (PEP 517)

lol I have even less idea how this would work. You're always cross-compiling, so maybe you need a new interface to the build backend to tell it "hey I want a build for this platform"? (unless you want to drop a copy of clang inside a WASI environment? that sounds silly but maybe it's not? though we'd at least need a way to spawn subprocesses from WASI...)

brettcannon commented 1 year ago

> So I guess we could have pybis like cpython-3.11.0-emscripten.pybi and cpython-3.11.0-wasi.pybi (or emscripten_wasm and wasi_wasm but that seems redundant since these always use the WASM ISA).

This is already handled via a platform triple by compilers, e.g. wasm32-wasi and wasm32-emscripten. See https://github.com/python/cpython/tree/main/Tools/wasm for details.
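As a rough illustration of how those two-part triples line up with the "OS"/"ISA" split discussed above, here is a small sketch that splits a triple into its ISA and environment. The names `WasmTarget` and `parse_wasm_triple` are hypothetical, and nothing here is an agreed-upon pybi or wheel tag format.

```python
from typing import NamedTuple


class WasmTarget(NamedTuple):
    isa: str          # e.g. "wasm32"
    environment: str  # e.g. "wasi" or "emscripten"


def parse_wasm_triple(triple: str) -> WasmTarget:
    """Split a compiler target such as "wasm32-wasi" into ISA and environment.

    Hypothetical helper for illustration; it only handles the
    `<isa>-<environment>` shape used in this thread.
    """
    isa, sep, environment = triple.partition("-")
    if not sep or not isa.startswith("wasm") or not environment:
        raise ValueError(f"not a wasm target triple: {triple!r}")
    return WasmTarget(isa=isa, environment=environment)


if __name__ == "__main__":
    for triple in ("wasm32-wasi", "wasm32-emscripten"):
        print(triple, "->", parse_wasm_triple(triple))
```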

> There's also apparently a WASI competitor called WasmEdge

It's not a competitor to WASI, but a WASI runtime. They just happen to provide experimental/custom support for some things beyond WASI.

And there are no plans in CPython itself to support anything outside of WASI by default, as that's speculative and not a standard. If people want custom builds like what VMware is doing, that's their call, but they're probably doing it for some edge compute/hosting reason.

> For emscripten, there's some kind of virtual in-memory filesystem and I have no idea how it works.

I wouldn't worry about Emscripten as that's a browser target.

> For emscripten, it's... a browser I guess? Or maybe node? I don't know how you run an emscripten binary from the CLI

Typically you use Node.

> We'd need to make sure that the unpacked wheel contents are accessible to the wasm environment, which will require passing some appropriate settings to the runtime, but I assume that's more-or-less straightforward.

https://snarky.ca/testing-a-project-using-the-wasi-build-of-cpython-with-pytest/

> Can you even dlopen from wasm?

Emscripten has a hack; WASI has no support.

> Building wheels from sdists (PEP 517)

You don't. At least for WASI, you're just looking for pure Python wheels. Emscripten requires a bunch of work, and that will require talking to the Pyodide folks, since Emscripten has no version compatibility guarantees (hence why there's an emscripten-forge).
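For what it's worth, "just looking for pure Python wheels" can be approximated from the wheel filename alone. The sketch below checks for the `none` ABI tag and the `any` platform tag, following the standard wheel filename layout. The function name is made up, and a real installer would more likely use a proper tag-matching library than string splitting.

```python
def is_pure_python_wheel(filename: str) -> bool:
    """Return True for wheels tagged `<python>-none-any`.

    Illustrative helper only. Wheel filenames have the shape
    name-version[-build]-python-abi-platform.whl
    """
    suffix = ".whl"
    if not filename.endswith(suffix):
        return False
    parts = filename[: -len(suffix)].split("-")
    # 5 fields without a build tag, 6 with one.
    if len(parts) not in (5, 6):
        return False
    _python_tag, abi_tag, platform_tag = parts[-3:]
    return abi_tag == "none" and platform_tag == "any"


if __name__ == "__main__":
    candidates = [
        "requests-2.31.0-py3-none-any.whl",
        "numpy-1.26.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
    ]
    print([name for name in candidates if is_pure_python_wheel(name)])
```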