bytecodealliance / wasmtime-py

Python WebAssembly runtime powered by Wasmtime
https://bytecodealliance.github.io/wasmtime-py/
Apache License 2.0

Add examples that call a complex pre-compiled library #67

Open simonw opened 3 years ago

simonw commented 3 years ago

I'm interested in using wasmtime to call out to complex originally-built-in-C libraries from Python - things like ffmpeg or the various image compression libraries used by https://squoosh.app/ (e.g. https://squoosh.app/c/mozjpeg_enc-f6bf569c.wasm)

It would be really cool if the documentation or examples for wasmtime included an example of that kind of usage!

alexcrichton commented 3 years ago

Thanks for the report! Unfortunately though this won't be a super clear answer and I'm not sure if there's a ton we can add to documentation, but that's not to say that nothing can be done.

The general way that this would be done today is application-specific. Similar to the web you'd probably invent a coupling between the host and the wasm module itself, and this coupling would be some bespoke protocol of some form that the embedder of the wasm file follows as well as the wasm file itself. This means the source-compiled-to-wasm typically has per-embedding code (e.g. JS vs Wasmtime) or something like that. Similarly the embedding (Python in this case, or JS for the web) is in on the scheme and knows how to precisely interact with the module in question.

At this time there are general conventions that wasm modules tend to follow, but there's no hard-and-fast rule about how wasm modules are expected to interact with the host. This means that off-the-shelf precompiled libraries tend to be difficult to move between embeddings because they're so tied to one runtime or another. Libraries also change the way they interact with the host over time as they continue to evolve.

The good news though is that we're working on a standard way for libraries to work with the host with a richer type system than numbers plus linear memory. The interface types proposal effectively gives modules things like string types, record types, etc. Memory management is all handled internally as an implementation detail so embedders and wasm authors have far fewer headaches about interoperating.

To help develop that proposal we have the witx-bindgen project, which is a way of sort of using interface types today with polyfills. This "compiles down" to what would otherwise be there with interface types, and gives a good preview of what the developer experience will generally look like. There's a little online demo where you can preview what the wasmtime-py bindings would look like to consume a particular wasm module with a certain interface. This same witx-bindgen would be used both on the host (with Python) and the wasm module itself (e.g. via the C headers or the Rust generation). This still requires source-code modifications, but the vision of interface types is that precompiled libraries using interface types will be widely available in the future, so you will be able to pull one of those off the shelf and use it in your app.

In any case that's a bit of a long answer and probably isn't as great as you were expecting. That being said much of this isn't written down in documentation just yet (partly because it's still somewhat in flux) but if you've got ideas of how we could encode this into the docs or improve the situation I'd love to hear them!

Additionally if you've got some more details about your specific use case I can try to help out as well with something that may be more actionable depending on what you're doing.

b4stien commented 1 year ago

As a starter we could add an example that calls a "simple" pre-compiled library. All the examples in the repo are about loading a "wat" text file then compiling this file and loading it afterwards. Are people really doing that?

I currently have a pretty textbook example of why someone would use WebAssembly (I think): to have a single library of functions written in a compilable-to-WebAssembly language (e.g. Rust) available in different execution contexts (front-end in JS, back-end in Python, etc). To a beginner's eye the state of the documentation (at large) is pretty rough right now.

And the roughest part is on Python runtimes for WebAssembly:

I'd be glad to help build and maintain this documentation, if I could just get a grasp of the ecosystem beforehand.

alexcrichton commented 1 year ago

I think unfortunately the ecosystem isn't really at the point yet where this documentation can be written and maintained. This library provides access to low-level wasm details, which is why examples and such are written in *.wat. It's assumed that users know how to connect a source language, via a WebAssembly module, to the concepts in this library. I realize, though, that this assumption is not true for most, nor should it have to be true for most. I don't think anyone is using *.wat "seriously" other than for tests/debugging/etc.

The documentation I think you're looking for is likely a "level up" from what the library currently supports. This lifting in abstraction layer is the goal of the component model, evolved from the interface types specification I mentioned above. Initial preview support is available as python -m wasmtime.bindgen as mentioned in the README. This is demo-level support and not done yet. Not only are the Python bits not done yet but the component model itself is not done yet. We're working on it.

Once the component model itself is more finalized and the support here is more finalized, that would be a great time to add documentation for how to put everything together. Before then, though, I fear that a lot of energy would be spent on maintaining documentation.

b4stien commented 1 year ago

Understood, thank you for this detailed answer. And many thanks for your work on those projects, looking forward to the day when all of this feels natural.