Open george-cosma opened 2 months ago
I've made some dummy benchmarks to see how much slower approach 2 ("Swarm") would be: https://github.com/george-cosma/indirection_bench
Proposed API changes:
```rust
// .-----------------------.
// | Single module example |
// '-----------------------'
const ADD_ONE: &'static str = r#"
(module
    (func (export "add_one") (param $x i32) (result i32)
        local.get $x
        i32.const 1
        i32.add
    )
)"#;

use wasm::{validate, RuntimeInstance, DEFAULT_MODULE};

fn main() {
    let wasm_bytes = wat::parse_str(ADD_ONE).unwrap();
    let validation_info = validate(&wasm_bytes).unwrap();
    let mut instance = RuntimeInstance::new(&validation_info).unwrap();

    // `get_fn` will verify that the function "add_one" exists for module <DEFAULT_MODULE>.
    // On success: return the function identifier (module_name, function_name, module_id, function_id)
    // On failure: RuntimeError -- couldn't find the function
    let add_one = instance.get_fn(DEFAULT_MODULE, "add_one").unwrap();

    // Also, to maintain compatibility with index-based access (which can be useful in some edge cases, and for us it
    // is useful for integration tests):
    let add_one = instance.get_fn_idx(/* module_idx: */ 0, /* function_idx: */ 0).unwrap();
    // On success: return the function identifier (module_name, function_name, module_id, function_id)
    // On failure: RuntimeError -- couldn't find the function

    // `invoke` will verify that the function identifier is still valid (i.e. it wasn't created by one instance and
    // then run on another). That is why we also store the module_name and function_name.
    assert_eq!(12, instance.invoke(&add_one, 11).unwrap());

    // Or should we do it this way? Or both?
    assert_eq!(12, add_one.invoke(&instance, 11).unwrap());
}
```
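To make the validity check in `invoke` more concrete, here is a minimal sketch of how a stored `(module_name, function_name, module_id, function_id)` identifier could be re-checked against an instance. Everything in it (`FunctionRef`, `ToyInstance`, `check`, the `RuntimeError` variants) is a hypothetical stand-in for illustration, not the crate's actual API.

```rust
use std::collections::HashMap;

/// Hypothetical handle returned by `get_fn`; the field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct FunctionRef {
    pub module_name: String,
    pub function_name: String,
    pub module_idx: usize,
    pub function_idx: usize,
}

/// Hypothetical error variants, for illustration only.
#[derive(Debug, PartialEq)]
pub enum RuntimeError {
    FunctionNotFound,
    StaleFunctionRef,
}

/// Toy stand-in for a runtime instance: it only tracks exported function names.
#[derive(Default)]
pub struct ToyInstance {
    /// module name -> (module index, exported function names in index order)
    modules: HashMap<String, (usize, Vec<String>)>,
}

impl ToyInstance {
    /// Registers a module; its index is the number of modules added so far.
    pub fn add_module(&mut self, name: &str, exports: &[&str]) {
        let idx = self.modules.len();
        let exports = exports.iter().map(|e| e.to_string()).collect();
        self.modules.insert(name.to_string(), (idx, exports));
    }

    /// Resolves names to indices and stores both in the handle.
    pub fn get_fn(&self, module: &str, function: &str) -> Result<FunctionRef, RuntimeError> {
        let (module_idx, exports) = self
            .modules
            .get(module)
            .ok_or(RuntimeError::FunctionNotFound)?;
        let function_idx = exports
            .iter()
            .position(|e| e.as_str() == function)
            .ok_or(RuntimeError::FunctionNotFound)?;
        Ok(FunctionRef {
            module_name: module.to_string(),
            function_name: function.to_string(),
            module_idx: *module_idx,
            function_idx,
        })
    }

    /// Re-resolves the names and rejects the handle if the indices disagree.
    pub fn check(&self, f: &FunctionRef) -> Result<(), RuntimeError> {
        match self.get_fn(&f.module_name, &f.function_name) {
            Ok(fresh) if fresh == *f => Ok(()),
            _ => Err(RuntimeError::StaleFunctionRef),
        }
    }
}
```

Note that this check cannot tell apart two instances with identical module layouts; a real implementation would probably need something extra, such as a per-instance id, if that case matters.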
- Multiple modules example:
```rust
// .--------------------------.
// | Multiple modules example |
// '--------------------------'
const ADD_ONE: &'static str = /* as above */;
const ADD_TWO: &'static str = r#"
(module
    (import "add_one_module" "add_one" (func $add_one (param i32) (result i32)))
    (func (export "add_two") (param $x i32) (result i32)
        local.get $x
        call $add_one
        call $add_one
    )
)"#;

use wasm::{validate, RuntimeInstance};

fn main() {
    let wasm_bytes = wat::parse_str(ADD_ONE).unwrap();
    let validation_info = validate(&wasm_bytes).unwrap();
    let mut instance = RuntimeInstance::new_named("add_one_module", &validation_info).unwrap();

    let wasm_bytes = wat::parse_str(ADD_TWO).unwrap();
    let validation_info = validate(&wasm_bytes).unwrap();
    instance.add_module("add_two_module", &validation_info).unwrap();

    let add_two = instance.get_fn("add_two_module", "add_two").unwrap();
    // Alternative:
    let add_two = instance.get_fn_idx(1, 0).unwrap();

    assert_eq!(13, instance.invoke(&add_two, 11).unwrap());

    // Or should we do it this way? Or both?
    assert_eq!(13, add_two.invoke(&instance, 11).unwrap());
}
```
The Linker Problem
To fully implement a working wasm interpreter we must be able to resolve imports. The idea of a `Linker` comes to mind. There are multiple ways to design it.
Design 1: Monolithic Runtime Instance
This design would entail collecting the validation info of every module into the `Linker`, which would then produce a "merged" validation info that can be instantiated as a single `RuntimeInstance`. Imports would then be resolved once, for everything, since a call to an imported function would become a regular call within the merged validation info.
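As a rough illustration of what such a merge implies for function indices, here is a minimal sketch. It relies on the fact that in a wasm module's function index space the imported functions come first, followed by the module's own functions. The `ToyModule` type and `remap_tables` function are hypothetical, and the sketch assumes every import resolves directly to a function defined in the providing module (no re-exported imports).

```rust
/// Hypothetical, heavily simplified view of one module's functions.
pub struct ToyModule {
    /// For each imported function: (index of the providing module,
    /// index of the function among that module's own defined functions).
    pub imports: Vec<(usize, usize)>,
    /// Number of functions defined in this module.
    pub defined: usize,
}

/// For each module, builds a table mapping its own function indices
/// (imports first, then defined functions) to indices in the merged module.
pub fn remap_tables(modules: &[ToyModule]) -> Vec<Vec<usize>> {
    // Where each module's defined functions start in the merged index space.
    let mut offsets = Vec::with_capacity(modules.len());
    let mut next = 0;
    for m in modules {
        offsets.push(next);
        next += m.defined;
    }

    modules
        .iter()
        .enumerate()
        .map(|(i, m)| {
            let mut table = Vec::with_capacity(m.imports.len() + m.defined);
            // An imported function maps to the provider's merged index.
            for &(provider, local) in &m.imports {
                table.push(offsets[provider] + local);
            }
            // A defined function maps to this module's own merged range.
            for k in 0..m.defined {
                table.push(offsets[i] + k);
            }
            table
        })
        .collect()
}
```

For the `add_one` / `add_two` pair from the API example, `add_two`'s table would be `[0, 1]`: its `call` to the imported `add_one` keeps operand 0 only by coincidence, and in general the operand of every `call` has to be looked up in such a table and rewritten.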
✅ Pros:
❌ Cons:
`call` instructions will need their operands changed.
Design 2: Each validation info has its own runtime instance (Swarm)
As the name and image suggest, each module gets its own runtime instance. The magic actually lies inside the `Linker`, which this time is an entity that lives as long as the runtime instances. When a module needs to call an imported function, it does so via the `Linker`.
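A minimal sketch of this dispatch-through-the-linker idea is shown below. All names (`Linker`, `ToyInstance`, `Body`) are hypothetical, and function bodies are reduced to two toy variants. It deliberately uses host recursion for imported calls, so it says nothing about resumability.

```rust
use std::collections::HashMap;

/// Toy function body, standing in for real wasm code.
pub enum Body {
    /// Returns `arg + constant`.
    AddConst(i32),
    /// Calls the imported function `module.name` `times` times in a row.
    CallImport { module: &'static str, name: &'static str, times: usize },
}

/// Toy per-module runtime instance: just its exported functions.
pub struct ToyInstance {
    pub exports: HashMap<&'static str, Body>,
}

/// The linker owns every instance and mediates all cross-module calls.
#[derive(Default)]
pub struct Linker {
    pub instances: HashMap<&'static str, ToyInstance>,
}

impl Linker {
    pub fn add_instance(&mut self, name: &'static str, exports: Vec<(&'static str, Body)>) {
        let exports = exports.into_iter().collect();
        self.instances.insert(name, ToyInstance { exports });
    }

    /// Looks up `module.name` and runs it; returns `None` if it does not exist.
    pub fn invoke(&self, module: &str, name: &str, arg: i32) -> Option<i32> {
        let body = self.instances.get(module)?.exports.get(name)?;
        match body {
            Body::AddConst(c) => Some(arg + *c),
            Body::CallImport { module, name, times } => {
                let mut value = arg;
                for _ in 0..*times {
                    // Every imported call pays for one extra lookup via the linker.
                    value = self.invoke(module, name, value)?;
                }
                Some(value)
            }
        }
    }
}
```

With an `add_one_module` exporting `add_one` as `Body::AddConst(1)` and an `add_two_module` exporting `add_two` as a `Body::CallImport` that calls `add_one` twice, `linker.invoke("add_two_module", "add_two", 11)` yields `Some(13)`, mirroring the multiple modules example.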
✅ Pros:
❌ Cons:
With this type of linker, there is an architectural problem we need to solve to maintain resumability. If Module 1 calls an imported function in Module 2, and Module 2 then calls an imported function from Module 1, then at the end of this chain Module 1 must be able to resume execution properly.
Here is an example of how an import call could work:
Now, how it would work in the described scenario:
Notice that the second `Store PC` would overwrite the previously stored program counter. An intuitive solution would be to make it a stack, but that feels like it would create more problems than it solves. An alternative solution would have the `call` instruction create a call frame not only on the caller module, but also on the called module, or something along those lines. There are solutions, but I do not know which is the correct one.
I'd like to continue this discussion. I want to know your opinions regarding which approach to go with. I personally believe the second "swarm" approach is more appropriate, but that is based more on vibes.
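To illustrate the overwritten program counter described above, here is a small sketch comparing a single saved-PC slot with a stack of return addresses. It is only meant to make the problem concrete, not to argue for the stack approach. `ReturnAddr`, `SingleSlotLinker`, and `StackLinker` are hypothetical names.

```rust
/// Where to resume after an imported call returns.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ReturnAddr {
    pub module: usize,
    pub pc: usize,
}

/// Variant A: the linker remembers only one return address.
#[derive(Default)]
pub struct SingleSlotLinker {
    saved: Option<ReturnAddr>,
}

impl SingleSlotLinker {
    /// "Store PC": overwrites whatever was stored before.
    pub fn call_import(&mut self, from: ReturnAddr) {
        self.saved = Some(from);
    }

    /// Returns the stored address, if any, and clears the slot.
    pub fn ret(&mut self) -> Option<ReturnAddr> {
        self.saved.take()
    }
}

/// Variant B: the linker keeps a stack of return addresses.
#[derive(Default)]
pub struct StackLinker {
    saved: Vec<ReturnAddr>,
}

impl StackLinker {
    /// "Store PC": pushes, so earlier return addresses survive.
    pub fn call_import(&mut self, from: ReturnAddr) {
        self.saved.push(from);
    }

    /// Pops the most recently stored return address.
    pub fn ret(&mut self) -> Option<ReturnAddr> {
        self.saved.pop()
    }
}
```

In the Module 1 → Module 2 → Module 1 scenario, both variants resume Module 2 correctly on the first return, but only the stack variant still knows where Module 1 was when the second return happens; the single slot has nothing left to hand back.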