Linking/Sharing story

martinfouilleul commented 11 months ago

For now each app is compiled to a single module. As we extend the core library that is available to application developers (things like the canvas API, UI system, etc., that run in "userland"), and as applications made for Orca get more ambitious, the need will arise for modularizing functionality and sharing resources between modules. There is no real linking story for wasm right now. The most likely candidate seems to be the Component Model proposal: https://github.com/WebAssembly/component-model. However, it seems it won't be adopted for some time, is still subject to change, and doesn't seem to address the same problems or satisfy the same constraints as Orca. So imo we need to come up with a linking / sharing story of our own rather than wait for the component model proposal to solve our needs.

There are several problems to solve for multi-module Orca applications. Here are some thoughts to start with:

Statically linking wasm objects into a single module

We don't have much control over this one since there isn't really a specification for a wasm "object file", and currently it really depends on the toolchain(s) that produced these objects. However, since so many languages use LLVM as a backend, there might not be such great diversity after all? So we can at least hope to distribute some of the core library as relocatable object files produced by clang, and have it be reasonably compatible with applications written in another language (at least we know it works for C++ and Zig).

Dynamically loading/Instantiating modules and calling into their exported functions

We could probably expose some runtime API for loading/instantiating modules and getting external function refs from them. It's unclear if we want to go down the whole subtyping road though. It's also tied to the question of sharing data (see below), e.g. if we want to pass more elaborate types than primitive types:

How do we copy or share aggregate value arguments from one module to another when calling an extern function?
How do we ensure that value is interpreted the same way on both sides?
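
To make the first question concrete, here is a minimal sketch of the "copy" option, assuming a toy model where each module's linear memory is a `bytearray` and the callee exposes a bump allocator. `Module`, `alloc` and `copy_argument` are hypothetical names for illustration, not an existing Orca API.

```python
# Toy model: each module's linear memory is a bytearray.
# All names here are illustrative assumptions, not part of Orca.

class Module:
    def __init__(self, size):
        self.memory = bytearray(size)
        self.heap_top = 16  # keep low addresses unused so 0 can act as null

    def alloc(self, size, align=8):
        # Bump allocator standing in for whatever allocator the guest exports.
        addr = (self.heap_top + align - 1) & ~(align - 1)
        if addr + size > len(self.memory):
            raise MemoryError("out of linear memory")
        self.heap_top = addr + size
        return addr


def copy_argument(src, src_addr, size, dst):
    """Copy `size` bytes from src's memory into freshly allocated space in dst."""
    if src_addr < 0 or src_addr + size > len(src.memory):
        raise IndexError("source range is out of bounds")
    dst_addr = dst.alloc(size)
    dst.memory[dst_addr:dst_addr + size] = src.memory[src_addr:src_addr + size]
    return dst_addr


caller = Module(1024)
callee = Module(1024)
arg_addr = caller.alloc(8)
caller.memory[arg_addr:arg_addr + 8] = bytes(range(8))
callee_addr = copy_argument(caller, arg_addr, 8, callee)
```

The copy only moves bytes. Whether the callee reads them the same way the caller wrote them is the second question, which needs an agreed layout.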

Sharing data between different modules.

Ideally we want a virtual memory model where we can map the same physical pages into two different wasm modules. Each module can place them in its own address space wherever it makes sense.
Modules written in different languages (or in the same language using different toolchains) are going to need some form of marshalling to and from a canonical layout. In an ideal world, the "Orca ABI" declares the layout of types that can be passed across module boundaries and across the host/guest boundary, and compilers deal with that, but that's not going to happen anytime soon. The same effect is somewhat achievable by providing an IDL that's then used to generate marshalling code, but trying to accommodate all kinds of wacky type systems is basically a world of pain...

Imo, for now we shouldn't strive to solve for language interop. Instead, we can define and document our ABI, and language bindings are responsible for marshalling their representation of data into ours when they want to talk to Orca APIs.
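
As a rough illustration of what "bindings marshal into our layout" could look like, here is a sketch using Python's `struct` module. The rectangle layout (four little-endian f32 fields) and the function names are assumptions made up for this example, not Orca's actual ABI.

```python
import struct

# Hypothetical canonical layout for a rectangle: four little-endian f32
# fields (x, y, w, h), 16 bytes total. Not Orca's actual ABI.
RECT_LAYOUT = struct.Struct("<4f")


def marshal_rect(memory, addr, rect):
    # Write the binding's native representation (a dict) in the canonical layout.
    RECT_LAYOUT.pack_into(memory, addr, rect["x"], rect["y"], rect["w"], rect["h"])


def unmarshal_rect(memory, addr):
    # Read the canonical layout back into the binding's native representation.
    x, y, w, h = RECT_LAYOUT.unpack_from(memory, addr)
    return {"x": x, "y": y, "w": w, "h": h}


memory = bytearray(64)  # stands in for a module's linear memory
marshal_rect(memory, 16, {"x": 1.0, "y": 2.0, "w": 640.0, "h": 480.0})
```

The point of the design is that only the documented byte layout is shared. Each language binding keeps whatever native representation suits it and converts at the boundary.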

How does dynamic linking play out with the permission system?

Presumably permissions would be inherited, and the instantiating module could drop some permission for the child module to restrict its access if needed.
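
A minimal sketch of that inherit-then-drop rule, assuming permissions can be modelled as a bit set. The permission names and `child_permissions` are hypothetical and only serve the illustration.

```python
from enum import Flag, auto


class Permission(Flag):
    # Hypothetical permission names, for illustration only.
    FILE_READ = auto()
    FILE_WRITE = auto()
    NETWORK = auto()
    CLIPBOARD = auto()


def child_permissions(parent, dropped=Permission(0)):
    """A child starts from its parent's permissions minus whatever is dropped."""
    return parent & ~dropped


app = Permission.FILE_READ | Permission.FILE_WRITE | Permission.NETWORK
plugin = child_permissions(app, dropped=Permission.FILE_WRITE | Permission.NETWORK)
```

Because the child set is computed by masking the parent set, a child can never end up with a permission its parent lacks.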

rdunnington commented 11 months ago

re: static linking. If we run into problems in the future with different language toolchains dealing with relocatable object files, I think it should be possible to do this after an object file has been compiled to wasm, because the format is well-defined and independent of compilers. To combine two wasm files, you basically need to remap the first module's primitives (types, functions, globals, tables, data segments, imports, exports, etc.) to indices/memory addresses that don't conflict with any in the second module. Then you'd need to go through the bytecode in all the functions and update the immediates to point to the remapped values. It would be a lot of work, but it sounds possible on its face.
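
The remapping idea can be sketched on a toy model, assuming modules are plain dicts of function bodies and exports rather than real wasm binaries. It only handles function indices and `call` immediates, remaps the second module instead of the first (either direction works), and ignores imports, which in a real module occupy the lowest function indices.

```python
# Toy model of merging two modules by remapping function indices.
# Instructions are (opcode, immediate) tuples; this is not a real wasm parser.

def merge_modules(a, b):
    offset = len(a["functions"])  # b's functions are appended after a's

    def remap(instr):
        op, imm = instr
        # Only `call` carries a function index in this simplified model.
        return (op, imm + offset) if op == "call" else instr

    functions = [list(body) for body in a["functions"]]
    functions += [[remap(instr) for instr in body] for body in b["functions"]]

    exports = dict(a["exports"])
    for name, index in b["exports"].items():
        if name in exports:
            raise ValueError(f"duplicate export: {name}")
        exports[name] = index + offset
    return {"functions": functions, "exports": exports}


mod_a = {"functions": [[("i32.const", 1)], [("call", 0)]], "exports": {"a_main": 1}}
mod_b = {"functions": [[("i32.const", 2)], [("call", 0)]], "exports": {"b_main": 1}}
merged = merge_modules(mod_a, mod_b)
```

A real merge would repeat the same offset-and-rewrite step for every index space (types, globals, tables, data segments) and for memory addresses.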

martinfouilleul commented 11 months ago

It wouldn't be super hard to do some linking of wasm code ourselves. The part I'm more worried about is carrying over and combining all the debug/symbol/type information when linking modules (esp. since this information is not really specified at the moment).

rdunnington commented 11 months ago

Ah yeah, debug info does make it more complicated. If we did some preprocessing to build our own debug format and insert it into a custom section, I think it could be designed to be mergeable.
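
One possible shape for this, as a sketch only: keep debug info as a table keyed by function index, so merging is the same index offset used for the code, then wrap the result in a wasm custom section (section id 0, LEB128 size, then a length-prefixed name and the payload). The section name `orca.debug` and the JSON payload are assumptions for illustration, not an existing format.

```python
import json


def uleb128(value):
    # Unsigned LEB128, the variable-length integer encoding used by wasm.
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def custom_section(name, payload):
    # Custom sections have id 0; their content is a length-prefixed name
    # followed by arbitrary bytes.
    name_bytes = name.encode("utf-8")
    content = uleb128(len(name_bytes)) + name_bytes + payload
    return b"\x00" + uleb128(len(content)) + content


def merge_debug(first, second, func_offset):
    # Shift the second table's function indices by the same offset as the code.
    merged = dict(first)
    for index, symbol in second.items():
        merged[index + func_offset] = symbol
    return merged


debug_a = {0: "a_helper", 1: "a_main"}
debug_b = {0: "b_helper", 1: "b_main"}
merged_debug = merge_debug(debug_a, debug_b, func_offset=2)
payload = json.dumps(merged_debug, sort_keys=True).encode("utf-8")
section = custom_section("orca.debug", payload)  # hypothetical section name
```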

orca-app / orca

Linking/Sharing story #33