Open · ajalt opened this issue 2 months ago
Hi @ajalt, thank you for raising the issue. When I implemented the WASM bindings a few years ago, I did look into supporting WASI as well, but it was never clear to me how to connect two independent WASM modules together within wasmtime. Since that is essential for working with Pdfium, and since binding two modules together in the browser is (relatively) straightforward using wasm_bindgen, I focussed on the browser-based implementation.
I would be very happy to revisit this, as I would like to support wasmtime as well. But I would need some pointers on how to get two independently compiled WASM modules to talk to each other in wasmtime.
Emscripten has some docs on dynamic linking. One solution they mention is to include the library in the wasm filesystem, then link to it using regular dlopen.
Another less elegant option if that doesn't work might be to link manually by loading pdfium and pdfium-render in separate wasm instances, and importing all the pdfium exports explicitly when setting up the wasm engine:
let pdfium_render_import = imports! {
    "env" => {
        "PDFium_Init" => pdfium_library_exports.get_function("PDFium_Init")?,
        "FPDF_InitLibraryWithConfig" => pdfium_library_exports.get_function("FPDF_InitLibraryWithConfig")?,
        // etc...
    }
};
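To make the manual approach concrete, here is a toy model in plain Rust with no wasm runtime involved. It resolves a list of required import names against a table of exports and fails on the first missing symbol, which is roughly the check an engine has to perform before a module with imports can be instantiated. The `Export` type, the stub functions, and `link_imports` are all hypothetical stand-ins: real Pdfium functions have varied signatures, and a real engine would hand back its own function handles rather than plain function pointers.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a function handle exported by the Pdfium module.
/// Real exports have many different signatures; one is used here for brevity.
type Export = fn() -> i32;

/// Resolve every import the consumer module declares against the provider's
/// export table, returning an error that names the first missing symbol.
fn link_imports(
    required: &[&str],
    exports: &HashMap<&str, Export>,
) -> Result<HashMap<String, Export>, String> {
    let mut resolved = HashMap::new();
    for name in required {
        match exports.get(name) {
            Some(func) => {
                resolved.insert(name.to_string(), *func);
            }
            None => return Err(format!("unresolved import: {}", name)),
        }
    }
    Ok(resolved)
}

// Stub bodies standing in for real Pdfium exports (assumption: illustrative only).
fn init_library() -> i32 {
    0
}

fn get_last_error() -> i32 {
    0
}

/// Build the export table that the Pdfium instance would provide.
fn pdfium_exports() -> HashMap<&'static str, Export> {
    let mut exports: HashMap<&'static str, Export> = HashMap::new();
    exports.insert("FPDF_InitLibrary", init_library);
    exports.insert("FPDF_GetLastError", get_last_error);
    exports
}

fn main() {
    let exports = pdfium_exports();
    match link_imports(&["FPDF_InitLibrary", "FPDF_GetLastError"], &exports) {
        Ok(linked) => println!("linked {} imports", linked.len()),
        Err(message) => eprintln!("{}", message),
    }
}
```

The point of the sketch is the failure mode: if the Pdfium build does not export a symbol that the consumer imports, linking fails by name before anything runs, so the export list of the Pdfium build and the import list on the pdfium-render side have to be kept in sync.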
If dlopen is indeed available, then that suggests that pdfium-render's default dynamic bindings should be workable. But how are you proposing to build Pdfium? Those Emscripten docs suggest that each module to be linked needs to be built with specific command-line parameters. Are you planning to build Pdfium yourself?
How were you envisioning this would work at runtime? Were you planning on having a Rust wrapper that would perform the module linking for you, a la https://docs.wasmtime.dev/examples-rust-linking.html, or did you intend to load the modules directly into wasmtime from the command line? How would that work?
I think I would need to see a minimal example of two modules linked together (it doesn't need to have anything to do with Pdfium, just any two demo modules where module1::main() calls a test function exported by module2) in order to proceed with any large-scale Pdfium bindings implementation.
Yes, I compiled pdfium against WASI myself. It just requires some extra command line flags vs wasm-js.
If you clone the pdfium-lib project, this patch should get it to compile against WASI (just follow their regular wasm build instructions after patching):
diff --git a/modules/wasm.py b/modules/wasm.py
--- a/modules/wasm.py
+++ b/modules/wasm.py
@@ -657,6 +657,11 @@ def run_task_generate():
"ASSERTIONS=1",
"-s",
"ALLOW_MEMORY_GROWTH=1",
+ "-s",
+ "STANDALONE_WASM=1",
+ "-sWASM_ASYNC_COMPILATION=0",
+ "-sWARN_ON_UNDEFINED_SYMBOLS=1",
+ "-sERROR_ON_UNDEFINED_SYMBOLS=0",
"-sMODULARIZE",
"-sEXPORT_NAME=PDFiumModule",
"-std=c++11",
Then you should be able to make a Rust cdylib crate, compiled for the wasm32-wasip1 target, that imports those functions, for example:
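extern "C" {
    // Imported from the WASI build of Pdfium; left unresolved until the host links the modules.
    fn FPDF_InitLibrary();
}

#[no_mangle]
pub extern "C" fn main() {
    unsafe {
        FPDF_InitLibrary();
    }
}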
Then you should be able to use the Linker from the wasmtime docs you posted to run them together.
I imagine Emscripten's dlopen support involves baking the loading into the binary, so that's probably not available from regular Rust. It doesn't look like the libloading crate supports wasm, for example.
Many thanks. And does that Linker example from the wasmtime docs itself get compiled to wasm, or is it built as a standard Rust executable? I feel like I need an ELI5 of what the end-to-end flow is meant to be for this so I can understand (a) the motivation and (b) how it could work with pdfium-render.
The linker example is compiled to a regular rust binary: wasmtime is the engine/runtime for the wasm code you've compiled.
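Basically, you would compile three things:
1. Pdfium, compiled to a standalone WASI module (as with the patch above)
2. Your Rust code that imports the Pdfium functions, compiled with --target wasm32-wasip1
3. A host program that uses wasmtime to link the two modules and run them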
Step 3 doesn't need to be in rust; you could use any other wasmtime bindings like python or go.
Alright, that broadly makes sense to me.
If you would like me to lead the work on this, then you should prepare to wait; this would be a backlog item that would not start until after 0.9.0.
If you want to make a start on it yourself, I would be very happy to support you. Once it's obvious to me how to link the modules together, and what the calling mechanism would be on the pdfium-render side, I can probably flesh out all the bindings fairly easily. But a worked example from someone who is actually motivated to drive the feature would be very helpful.
Hi, thanks for making this great library!
I'm trying to use this library on the wasm32-wasip1 target to run on a standalone Wasm runtime like Wasmtime. I can compile against that target, but it looks like all the code in src/wasm.js depends on the JS shims generated by wasm_bindgen, which aren't available on WASI.
Assuming I already have a WASI-compiled pdfium library, do you have an idea of what it would take to link pdfium-render to it? Is that even possible, or would it require changes in pdfium-render?