The Linker Problem

To fully implement a working wasm interpreter we must be able to resolve imports. The idea of a Linker comes to mind. There are multiple ways to design it.

Design 1: Monolithic Runtime Instance

[Figure: linker_problem_monolith (drawio diagram)]

This design collects the validation info of every module in the Linker, which then produces a single "merged" validation info that can be instantiated as a sole RuntimeInstance. Imports are resolved once, at link time: a call to an imported function becomes a regular call within the merged validation info.

✅ Pros:

- Import resolution is solved at link time. No performance overhead.
- The architecture of the code will remain mostly unchanged.
- The Linker will live for a very short amount of time (long enough to process the validation infos).

❌ Cons:

- Code sections will (likely) need to be modified because the index namespaces are merged. A call to function "12" is trivial with a single module, but what if we merge two modules? Module 1 imports a function as index 1, and Module 2 exports this function with index 8. All function indices will then have to be reshuffled, and the affected call instructions will need their operands changed.
- Same as above, but for globals, tables, memories, etc.
- Import resolution is solved at link time. This would prevent any sort of design where equivalent modules could be swapped at runtime.
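
To make the index reshuffling concrete, here is a minimal sketch of how a merged function index space could be computed. Everything in it is an assumption for illustration: `ModuleFuncs` and `build_remap` are hypothetical names, not part of the interpreter, and re-exported imports are simply ignored. It relies on the fact that a module's function index space lists imported functions first, followed by the functions defined in the module.

```rust
use std::collections::HashMap;

/// Hypothetical per-module summary; the names are illustrative only.
struct ModuleFuncs {
    name: &'static str,
    /// Imported functions as (module, field), in import order. They occupy
    /// the first indices of the module's function index space.
    imports: Vec<(&'static str, &'static str)>,
    /// Number of functions defined in the module itself.
    defined: u32,
    /// Exported functions as (field, local function index).
    exports: Vec<(&'static str, u32)>,
}

/// For every module, maps each local function index to its index in the
/// merged module. Returns `None` if an import cannot be resolved.
fn build_remap(modules: &[ModuleFuncs]) -> Option<Vec<Vec<u32>>> {
    // Pass 1: give each module's defined functions a contiguous range.
    let mut base = Vec::new();
    let mut next = 0u32;
    for m in modules {
        base.push(next);
        next += m.defined;
    }

    // Pass 2: record the merged index of every exported, defined function.
    let mut exported: HashMap<(&str, &str), u32> = HashMap::new();
    for (m, &b) in modules.iter().zip(&base) {
        let n_imports = m.imports.len() as u32;
        for &(field, local) in &m.exports {
            // Re-exported imports are not handled in this sketch.
            if local >= n_imports {
                exported.insert((m.name, field), b + (local - n_imports));
            }
        }
    }

    // Pass 3: imports point at the exporter's merged index, defined
    // functions at their slot in the module's own range.
    modules
        .iter()
        .zip(&base)
        .map(|(m, &b)| {
            let mut table = Vec::new();
            for key in &m.imports {
                table.push(*exported.get(key)?);
            }
            table.extend((0..m.defined).map(|i| b + i));
            Some(table)
        })
        .collect()
}
```

For example, if a module `m1` imports `m2.f` as its local index 0 and defines two functions, while `m2` defines nine functions and exports `f` at local index 8, the table for `m1` is `[10, 0, 1]`. Every `call 0` in `m1` would then have to be rewritten to `call 10`, and the same kind of table would be needed for globals, tables and memories.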

Design 2: Each validation info has its own runtime instance (Swarm)

[Figure: linker_problem_swarm (drawio diagram)]

As the name and image suggest, each module gets its own runtime instance. The magic actually lies inside the Linker, which this time is an entity that lives as long as the runtimes. When a module needs to call an imported function, it does so via the Linker.

✅ Pros:

- Validation Info & Runtime creation is simple.
- Closer in style to the WASM Component Model (TODO: Check specification).
- Allows for the ability to "hot swap" modules at runtime. Let's say the infotainment system is decoupled (intentionally or not). Then the Linker could be ordered to replace the module responsible for communicating with it, with a dummy module that fails instantly instead of waiting for a timeout.
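
The runtime resolution and the hot swap could be sketched roughly as follows. This is an illustration under heavy assumptions, not the proposed design: `Module`, `Linker`, `register`, `call_import` and `module_with` are hypothetical names, and a module is reduced to a table of host closures instead of a real runtime instance.

```rust
use std::collections::HashMap;

/// Stand-in for an exported wasm function (hypothetical simplification).
type Export = Box<dyn Fn(i32) -> Result<i32, String>>;

/// Stand-in for a per-module runtime instance.
#[derive(Default)]
struct Module {
    exports: HashMap<String, Export>,
}

/// Long-lived linker that owns every module of the swarm.
#[derive(Default)]
struct Linker {
    modules: HashMap<String, Module>,
}

impl Linker {
    /// Registers a module under `name`. If a module with that name already
    /// exists it is replaced and returned: this is the "hot swap".
    fn register(&mut self, name: &str, module: Module) -> Option<Module> {
        self.modules.insert(name.to_string(), module)
    }

    /// Resolves the import on every call. These two lookups are the
    /// runtime overhead listed among the cons of this design.
    fn call_import(&self, module: &str, field: &str, arg: i32) -> Result<i32, String> {
        let m = self
            .modules
            .get(module)
            .ok_or_else(|| format!("unknown module {module}"))?;
        let f = m
            .exports
            .get(field)
            .ok_or_else(|| format!("unknown export {module}.{field}"))?;
        f(arg)
    }
}

/// Helper that builds a module with a single export.
fn module_with(field: &str, f: impl Fn(i32) -> Result<i32, String> + 'static) -> Module {
    let mut m = Module::default();
    m.exports.insert(field.to_string(), Box::new(f));
    m
}
```

Registering a second module under the name `"infotainment"` whose only export returns `Err(...)` immediately would redirect every later `call_import("infotainment", ...)` to the dummy, without touching the calling modules.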

❌ Cons:

- Performance overhead due to import resolution at runtime, especially if we tackle the threading proposal.
- The fuel mechanic is non-trivial.
- Lots of architectural changes.
- The Linker is long-running.
- A very specific problem with resumability (continue reading).

With this type of linker, there is an architectural problem we need to solve to maintain resumability. If Module 1 calls an imported function in Module 2, and Module 2 then calls an imported function from Module 1, at the end of this chain Module 1 must be able to resume execution correctly.

Here is an example of how an import call could work:

[Figure: linker_problem_rel1 (drawio diagram)]

Now, how it would work in the described scenario:

[Figure: linker_problem_rel2 (drawio diagram)]

Notice that the second Store PC would overwrite the previously stored program counter. An intuitive solution would be to make it a stack, but that feels like it would create more problems than it solves. An alternative solution would have the call instruction create a callframe not only on the caller module, but also on the called module. Or something like that. There are solutions, but I do not know which is the correct one.
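
To make the trade-off easier to discuss, here is a minimal sketch of the stack variant mentioned above. It is not a recommendation, and all names (`ReturnPoint`, `enter_import`, `leave_import`) are hypothetical. The single "stored PC" slot is replaced by a stack owned by the Linker, with one entry per cross-module call that has not returned yet.

```rust
/// Where execution continues once an imported call returns.
/// Hypothetical type; module ids and program counters are plain indices here.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ReturnPoint {
    module: usize,
    pc: usize,
}

#[derive(Default)]
struct Linker {
    /// One entry per cross-module call that is still in flight.
    return_stack: Vec<ReturnPoint>,
}

impl Linker {
    /// Called when `caller` executes a call to an imported function:
    /// remember where the caller has to resume.
    fn enter_import(&mut self, caller: usize, resume_pc: usize) {
        self.return_stack.push(ReturnPoint { module: caller, pc: resume_pc });
    }

    /// Called when an imported function returns: yields the most recent
    /// return point, or `None` if no cross-module call is in flight.
    fn leave_import(&mut self) -> Option<ReturnPoint> {
        self.return_stack.pop()
    }
}
```

In the scenario above, Module 1 calling into Module 2 pushes Module 1's return point, and Module 2 calling back into Module 1 pushes a second entry instead of overwriting the first. The returns then pop them in reverse order. One open question with this variant is that the stack becomes part of the state that has to be preserved whenever execution is paused and resumed.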

I'd like to continue this discussion. I would like to hear your opinions on which approach to go with. I personally believe the second "swarm" approach is more appropriate, but that is based more on vibes.

Proposed API changes:

- Single module example:
```rust
// .-----------------------.
// | Single module example |
// '-----------------------'

const ADD_ONE: &'static str = r#"
(module
    (func (export "add_one") (param $x i32) (result i32)
        local.get $x
        i32.const 1
        i32.add
    )
)"#;

use wasm::{validate, RuntimeInstance, DEFAULT_MODULE};

fn main() {
    let wasm_bytes = wat::parse_str(ADD_ONE).unwrap();
    let validation_info = validate(&wasm_bytes).unwrap();
    let mut instance = RuntimeInstance::new(&validation_info).unwrap();

    // `get_fn` will verify that the function "add_one" exists for module <DEFAULT_MODULE>.
    // On success: return the identifier tuple (module_name, function_name, module_id, function_id)
    // On failure: RuntimeError -- couldn't find the function
    let add_one = instance.get_fn(DEFAULT_MODULE, "add_one").unwrap();

    // Also, to maintain compatibility with index-based accessing (which can be useful in some edge cases, and for us it
    // is useful for integration tests):
    let add_one = instance.get_fn_idx(/* module_idx: */ 0, /* function_idx: */ 0).unwrap();
    // On success: return the identifier tuple (module_name, function_name, module_id, function_id)
    // On failure: RuntimeError -- couldn't find the function

    // `invoke` will verify that the function identifier is still valid (i.e. it wasn't created with one instance and
    // then run on another). That is why we also store the module_name and function_name.
    assert_eq!(12, instance.invoke(&add_one, 11).unwrap());
    // Or should we do it this way? Or both?
    assert_eq!(12, add_one.invoke(&instance, 11).unwrap());
}
```


- Multiple modules example:
```rust
// .--------------------------.
// | Multiple modules example |
// '--------------------------'

const ADD_ONE: &'static str = /* as above */;
const ADD_TWO: &'static str = r#"
(module
    (import "add_one_module" "add_one" (func $add_one (param i32) (result i32)))
    (func (export "add_two") (param $x i32) (result i32)
        local.get $x
        call $add_one
        call $add_one
    )
)"#;

use wasm::{validate, RuntimeInstance};

fn main() {
    let wasm_bytes = wat::parse_str(ADD_ONE).unwrap();
    let validation_info = validate(&wasm_bytes).unwrap();
    let mut instance = RuntimeInstance::new_named("add_one_module", &validation_info).unwrap();

    let wasm_bytes = wat::parse_str(ADD_TWO).unwrap();
    let validation_info = validate(&wasm_bytes).unwrap();
    instance.add_module("add_two_module", &validation_info).unwrap();

    let add_two = instance.get_fn("add_two_module", "add_two").unwrap();
    // Alternative:
    let add_two = instance.get_fn_idx(/* module_idx: */ 1, /* function_idx: */ 0).unwrap();

    assert_eq!(13, instance.invoke(&add_two, 11).unwrap());
    // Or should we do it this way? Or both?
    assert_eq!(13, add_two.invoke(&instance, 11).unwrap());
}
```

DLR-FT / wasm-interpreter

The never-ending Linker problem #83
