MarkMcCaskey commented 1 year ago

I saw some interest in adding WebAssembly (Wasm) as a scripting language for mods, so I wrote up a doc covering what that would entail, to start the discussion of designing that system if it is in fact a good fit for this project. The doc is still a work in progress, so please excuse the sloppiness.

Soh WebAssembly Scripting

This doc is an overview of the process of adding WebAssembly (Wasm) scripting to soh and some of the trade offs and challenges.

Motivation

WebAssembly is an appealing choice for scripting support because it is an existing technology that is portable and that many languages can compile to.

Alternatives

JavaScript, Lua, etc

Pros:

Easy. Embeddable interpreters are stable and can be added relatively easily
Simpler workflow than compiling to Wasm: you just write scripts and put them in the right place
Can be sandboxed

Cons:

Limited to the languages the chosen interpreter supports

Native Code

Pros:

Best possible performance (CPU, memory access)
Allows any language to be used, but with more effort

Cons:

Not portable
Difficult to build
Lack of sandboxing or security

Overview

WebAssembly is a bytecode format for a stack-based virtual machine with structured control flow. It is completely sandboxed by design. WebAssembly on its own can only do pure computations; "imports" must be provided by the host environment so that the Wasm code can call back into the host to make anything happen.

WebAssembly programs have their own Memory(s) and do not have access to any host memory. Therefore, we must take care when designing the ABI between host and guest to avoid excessive memory copying for performance-sensitive tasks.
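
To make the memory model concrete, here is a minimal sketch of how a host might treat a guest's memory: as a flat byte array where every guest "pointer" is just an offset that must be bounds-checked. `GuestMemory` and its methods are hypothetical names for illustration, not any runtime's API; real runtimes expose their own accessors for linear memory.

```rust
/// Hypothetical stand-in for one Wasm linear memory: a flat byte array owned by the runtime.
struct GuestMemory {
    bytes: Vec<u8>,
}

impl GuestMemory {
    fn new(size: usize) -> Self {
        GuestMemory { bytes: vec![0; size] }
    }

    /// Copy host data into guest memory at `ptr`, rejecting out-of-bounds ranges.
    fn write(&mut self, ptr: u32, data: &[u8]) -> Result<(), String> {
        let start = ptr as usize;
        let end = start.checked_add(data.len()).ok_or("length overflow")?;
        let dst = self.bytes.get_mut(start..end).ok_or("write out of bounds")?;
        dst.copy_from_slice(data);
        Ok(())
    }

    /// Borrow `len` bytes of guest memory starting at `ptr`, rejecting out-of-bounds ranges.
    fn read(&self, ptr: u32, len: u32) -> Result<&[u8], String> {
        let start = ptr as usize;
        let end = start.checked_add(len as usize).ok_or("length overflow")?;
        self.bytes
            .get(start..end)
            .ok_or_else(|| "read out of bounds".to_string())
    }
}
```

Every `write` is a copy, which is why the ABI should keep large or frequent transfers to a minimum.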

There are many embeddable WebAssembly implementations, and there is a universal C API so that any runtime can be slotted in. This is a lowest-common-denominator API, though, and is likely significantly less ergonomic and powerful than any given runtime's own API. By supporting it, we could ship a Wasm interpreter for portability and let users opt in to heavier JIT/AOT-compiled Wasm runtimes. That said, the limitations of the universal API may not be tenable depending on the computation model the mods will use. Concretely, the universal C API does not appear to have a mechanism to interrupt execution, so mods may spin forever (TODO: look into this more).

Runtime Trade Offs

This is a very high level overview of the trade offs of the categories of Wasm runtimes. Each individual Wasm runtime has its own trade offs.

Wasm Interpreter

Pros:

Less code, simpler code
- fewer bugs
- more portable
- faster build times

Cons:

Performance: e.g. running JS inside QuickJS.wasm as a mechanism to allow Wasm scripting would pay the overhead of interpretation twice (a Wasm interpreter running a JS interpreter)

Wasm JIT/AOT

Pros:

"Near native" performance under proper conditions

Cons:

More code, more complex
- More platform-specific support required
- More room for bugs and security issues
- Longer build times and potentially much heavier binary depending on the library/configuration chosen

Designing Mod Scope

The first task when designing a scripting system is to define the scope of what it will be used for. These choices will inform which trade offs we should make.

Questions that must be answered:

Can multiple mods run at the same time?
- If so, can they communicate with each other? And to what extent?
- If so, how are conflicts resolved? (e.g. 2 mods modifying the same value at the same time)
How much computation time do we predict that mods will need? Are mods mostly just if statements and get/sets?
Do mods need access to the file system or the network?

Script System Designs

The first part of the scope is the computational model by which we'll run the scripts. Here are some options:

Async Heavy-Weight, client-server-like

One possible design is to have the mod manage everything itself. It would operate fully asynchronously, like another thread that gets and sets data on the host through imports.

This is very generic and allows the mod to do arbitrary computation on its own. If capabilities like network access are granted, it could support arbitrarily complex and computationally demanding mods. The downside is that it is heavy-weight, and asynchronous code can get messy and complicated.
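
As a rough illustration of this model, the sketch below runs a "mod" on its own thread and has the host service its get/set requests over a channel. The `Request` enum, the `player_health` key, and `run_async_mod` are all made up for the example; a real implementation would route these through Wasm imports rather than Rust channels.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical requests a mod thread can send to the host (stand-ins for Wasm imports).
enum Request {
    Get { key: &'static str, reply: mpsc::Sender<i32> },
    Set { key: &'static str, value: i32 },
}

/// Runs a "mod" on its own thread and services its requests on the host side.
/// Returns the final value of `player_health`.
fn run_async_mod(initial_health: i32) -> i32 {
    let (tx, rx) = mpsc::channel::<Request>();

    // The mod: reads a value, computes independently, writes a result back.
    let mod_thread = thread::spawn(move || {
        let (reply_tx, reply_rx) = mpsc::channel();
        tx.send(Request::Get { key: "player_health", reply: reply_tx }).unwrap();
        let hp = reply_rx.recv().unwrap();
        tx.send(Request::Set { key: "player_health", value: hp + 10 }).unwrap();
        // `tx` is dropped when this closure returns, which ends the host loop below.
    });

    // The host: owns the game state and is the only side that touches it.
    let mut player_health = initial_health;
    for request in rx {
        match request {
            Request::Get { key, reply } if key == "player_health" => {
                reply.send(player_health).unwrap();
            }
            Request::Set { key, value } if key == "player_health" => {
                player_health = value;
            }
            _ => {} // unknown keys are ignored in this sketch
        }
    }
    mod_thread.join().unwrap();
    player_health
}
```

Even in this tiny example the host has to decide when to drain the request queue relative to its own frame loop, which hints at where the complexity comes from.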

Event-based system

Another possible design is to use an event-based system where mods can subscribe to certain events, including things like:

specific value changed
next frame
timer expired (delayed function calls)
custom events triggered by other mods
user input
user actions (game paused, reloaded, etc.)
game specific events like map loaded, item get/use, etc.

When an event fires, the host calls a function in every script subscribed to it. The mod can then either call back into the host through imports or return a value telling the host what actions to take.
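
A minimal host-side sketch of this subscribe/dispatch flow might look like the following. `EventKind`, `Action`, and `EventBus` are hypothetical names, and a boxed Rust closure stands in for the call into a mod's exported handler.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
enum EventKind {
    NextFrame,
    LevelChanged,
}

/// What a handler asks the host to do after seeing an event (hypothetical).
#[derive(Debug, PartialEq)]
enum Action {
    None,
    SetHealth(i32),
}

/// Stand-in for a call into a mod's exported handler.
type Handler = Box<dyn Fn(u32) -> Action>;

#[derive(Default)]
struct EventBus {
    subscribers: HashMap<EventKind, Vec<Handler>>,
}

impl EventBus {
    fn subscribe(&mut self, kind: EventKind, handler: Handler) {
        self.subscribers.entry(kind).or_default().push(handler);
    }

    /// Synchronously call every subscriber and collect the requested actions.
    fn dispatch(&self, kind: EventKind, payload: u32) -> Vec<Action> {
        self.subscribers
            .get(&kind)
            .map(|handlers| handlers.iter().map(|h| h(payload)).collect::<Vec<_>>())
            .unwrap_or_default()
    }
}
```

Returning actions instead of mutating state directly gives the host one place to resolve conflicts when several mods react to the same event.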

This model is very simple, especially if done synchronously. However, care must be taken to ensure that mods don't block execution indefinitely. Some Wasm runtimes provide tools to assist with this.
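
One common form such a tool takes is an instruction budget, often called fuel: the runtime charges for each step and aborts the call when the budget runs out. The sketch below shows the idea on a toy stack machine. It is not Wasm (real Wasm has no arbitrary `Jump`) and not any runtime's API; `Op`, `Outcome`, and `run_with_fuel` are invented for the example.

```rust
#[derive(Clone, Copy)]
enum Op {
    Push(i64),
    Add,
    /// Jump to an absolute instruction index (enough to express an infinite loop).
    Jump(usize),
}

#[derive(Debug, PartialEq)]
enum Outcome {
    Finished(Option<i64>),
    OutOfFuel,
    Trap,
}

/// Runs `program`, charging one unit of fuel per instruction.
fn run_with_fuel(program: &[Op], mut fuel: u64) -> Outcome {
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0;
    while pc < program.len() {
        if fuel == 0 {
            return Outcome::OutOfFuel;
        }
        fuel -= 1;
        match program[pc] {
            Op::Push(v) => stack.push(v),
            Op::Add => match (stack.pop(), stack.pop()) {
                (Some(a), Some(b)) => stack.push(a.wrapping_add(b)),
                _ => return Outcome::Trap, // stack underflow
            },
            Op::Jump(target) => {
                pc = target;
                continue;
            }
        }
        pc += 1;
    }
    Outcome::Finished(stack.pop())
}
```

With a budget like this, a handler that spins forever costs the host a bounded amount of time per event instead of freezing the game.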

ABI Design

The ABI (Application Binary Interface) is the system by which the host and guest (Wasm) will communicate with each other. Wasm on its own only supports basic number types like i32, i64, f32, and f64. Therefore, passing strings or other structured data between the host and guest is non-trivial, and we must decide on an ABI to make it happen.

Because the host and the guest do not share memory, we must also consider memory management. One option is to have the scripts provide their own malloc and free functions so that the host can manage memory inside of the guests when it needs to pass data to the guest. Another option is to design APIs that avoid memory management. This pattern can be seen with some system calls where functions must be called multiple times with a fixed buffer size to get all the data.
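
The second option can be sketched as follows: the guest owns one small fixed buffer and calls the host repeatedly until everything has been read, so the host never needs to allocate inside the guest. `host_read_chunk` and `guest_read_all` are hypothetical names, and plain Rust slices stand in for the two sides of the boundary.

```rust
/// Hypothetical host import: copies up to `buf.len()` bytes of `value`, starting at
/// `offset`, into the caller's buffer and returns how many bytes were written.
/// A return of 0 means there is nothing left to read.
fn host_read_chunk(value: &[u8], offset: usize, buf: &mut [u8]) -> usize {
    let remaining = value.get(offset..).unwrap_or(&[]);
    let n = remaining.len().min(buf.len());
    buf[..n].copy_from_slice(&remaining[..n]);
    n
}

/// Guest-side loop: drains the value through a small fixed buffer.
fn guest_read_all(value: &[u8]) -> Vec<u8> {
    let mut buf = [0u8; 8];
    let mut out = Vec::new();
    let mut offset = 0;
    loop {
        let n = host_read_chunk(value, offset, &mut buf);
        if n == 0 {
            break;
        }
        out.extend_from_slice(&buf[..n]);
        offset += n;
    }
    out
}
```

The trade-off is more boundary crossings per value in exchange for a simpler ABI with no exported allocator.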

Putting it all together, the system will look something like this:

Host
<--> Host implementation of imports / translation of ABI into meaningful operations
<-(our ABI)-> Language-specific library providing idiomatic bindings to our ABI
<--> Wasm script written by a user

To demonstrate the concepts and what this looks like with specifics, see the following strawman example:

Wasm exports:

(func $malloc (param i32) (result i32))
(func $free (param i32))
(func $event_level_changed (param i32))
(func $init_mod)

Wasm imports:

(func $get_variable (param i32) (result i32))

Shared ABI header:

struct LevelInfo {
    u32 level_id;
    // pointer to the name, nul-terminated in this case.
    u32 level_name;
    ...
};

enum Variable {
    PlayerHealth = 0,
    ...
};

Host logic:

// set up
WasmModule wasm_module = wasm("mod.wasm");
Instance wasm_instance = wasm_module.instantiate(&imports);
// the export is named "init_mod" in the Wasm exports above
WasmFunc wasm_init = wasm_instance.get("init_mod");
wasm_init();

// when level changes
WasmFunc event_level_changed = wasm_instance.get("event_level_changed");
WasmFunc wasm_malloc = wasm_instance.get("malloc");
WasmFunc wasm_free = wasm_instance.get("free");

string level_name = get_level_name();
u32 level_id = get_level_id();
...

// +1 for the nul terminator that the guest's C string handling expects
u32 wasm_level_name = wasm_malloc(level_name.len() + 1);
u32 wasm_level_data = wasm_malloc(sizeof(LevelInfo));

// c_str() here means "the nul-terminated bytes of the string"
wasm_instance.wasm_write(wasm_level_name, level_name.c_str(), level_name.len() + 1);
LevelInfo info = LevelInfo {
    level_id,
    // not a real char*, it's a u32 that is a pointer to Wasm memory
    level_name: wasm_level_name,
};
wasm_instance.wasm_write(wasm_level_data, &info, sizeof(LevelInfo));
...

event_level_changed(wasm_level_data);

wasm_free(wasm_level_name);
wasm_free(wasm_level_data);

Example of a Rust wrapper around the ABI on the guest side:

...

// this comes from the imports when a module is instantiated
extern "C" {
    fn get_variable(var: Variable) -> i32;
}

#[repr(C)]
struct LevelInfo {
    level_id: u32,
    // raw C str (c_char so that it can be passed to CStr::from_ptr on any target)
    level_name: *const std::ffi::c_char,
}

#[derive(Debug, Clone, ...)]
struct UserFriendlyLevelInfo {
    level_id: u32,
    level_name: String,
}

#[repr(u32)]
enum Variable {
    PlayerHealth,
}

#[no_mangle]
// The name must match the `event_level_changed` export that the host looks up.
unsafe extern "C" fn event_level_changed(info: &LevelInfo) {
    let c_str = std::ffi::CStr::from_ptr(info.level_name);
    let rust_str = c_str.to_str().expect("utf-8 string");

    let rust_info = UserFriendlyLevelInfo {
        level_id: info.level_id,
        level_name: rust_str.to_string(),
    };

    // call user function here, could be done by initing a global table of user callbacks or
    // this wrapper code could be provided to users directly to be embedded into their own code.

    ...
    let hp = get_variable(Variable::PlayerHealth);

    ...
}

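/// Bytes reserved in front of each allocation to remember its total size.
const HEADER: usize = 8;

/// Naive malloc: `free` only receives a pointer, so the total size is stored in a
/// small header placed just before the memory handed to the host.
#[no_mangle]
extern "C" fn malloc(size: u32) -> *mut u8 {
    if size == 0 { return std::ptr::null_mut(); }
    let layout = match (size as usize)
        .checked_add(HEADER)
        .and_then(|total| std::alloc::Layout::from_size_align(total, HEADER).ok())
    {
        Some(layout) => layout,
        None => return std::ptr::null_mut(),
    };
    unsafe {
        let base = std::alloc::alloc(layout);
        if base.is_null() { return base; }
        (base as *mut usize).write(layout.size());
        base.add(HEADER)
    }
}

/// Naive free: reads the size back from the header and releases the whole allocation.
#[no_mangle]
unsafe extern "C" fn free(mem: *mut u8) {
    if mem.is_null() { return; }
    let base = mem.sub(HEADER);
    let total = (base as *const usize).read();
    std::alloc::dealloc(base, std::alloc::Layout::from_size_align_unchecked(total, HEADER));
}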

/// Most languages need some setup code to run before their functions can be called directly.
/// This could instead be the main function: calling main (even an empty one) would run constructors etc.
#[no_mangle]
extern "C" fn init_mod() {}

From this example we can see how the host and guest can call each other and pass data.
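
To spell out what actually crosses the boundary in the strawman, the sketch below encodes `LevelInfo` the way wasm32 would lay it out (two little-endian u32 fields, 8 bytes total) and decodes it again from a byte slice standing in for guest memory. The helper names are hypothetical, and the layout assumes no extra fields beyond the two shown in the header.

```rust
/// Size in bytes of the strawman `LevelInfo` as seen from wasm32: two u32 fields.
const LEVEL_INFO_SIZE: usize = 8;

/// Host side: encode `LevelInfo` as little-endian bytes (Wasm memory is little-endian).
fn encode_level_info(level_id: u32, level_name_ptr: u32) -> [u8; LEVEL_INFO_SIZE] {
    let mut out = [0u8; LEVEL_INFO_SIZE];
    out[0..4].copy_from_slice(&level_id.to_le_bytes());
    out[4..8].copy_from_slice(&level_name_ptr.to_le_bytes());
    out
}

/// Read a little-endian u32 out of (simulated) guest memory.
fn read_u32(memory: &[u8], ptr: u32) -> Option<u32> {
    let start = ptr as usize;
    let b = memory.get(start..start.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Read a nul-terminated UTF-8 string starting at `ptr`.
fn read_c_str(memory: &[u8], ptr: u32) -> Option<&str> {
    let tail = memory.get(ptr as usize..)?;
    let len = tail.iter().position(|&b| b == 0)?;
    std::str::from_utf8(&tail[..len]).ok()
}

/// Decode a `LevelInfo` at `ptr`, following the embedded name pointer.
fn decode_level_info(memory: &[u8], ptr: u32) -> Option<(u32, &str)> {
    let level_id = read_u32(memory, ptr)?;
    let name_ptr = read_u32(memory, ptr.checked_add(4)?)?;
    Some((level_id, read_c_str(memory, name_ptr)?))
}
```

Note that the name "pointer" inside the struct is only meaningful relative to the guest's memory, which is why the host header declares it as a u32 rather than a char*.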

Designing an appropriate ABI is best done iteratively as requirements of scripting are made clear.

However, shortcuts can also be taken to avoid spending time on this step. For example, rather than carefully designing an ABI, we could simply pass JSON bidirectionally as an allocated string.
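
A sketch of what the JSON shortcut could look like is below. The JSON is hand-formatted only to keep the example dependency-free (it does no string escaping; a real implementation would use a JSON library), and packing the pointer and length into one u64 is one possible convention assumed here, not something the proposal specifies. All function names are hypothetical.

```rust
/// Pack a guest pointer and a length into one u64 so a single value can carry both.
fn pack(ptr: u32, len: u32) -> u64 {
    ((ptr as u64) << 32) | len as u64
}

fn unpack(packed: u64) -> (u32, u32) {
    ((packed >> 32) as u32, packed as u32)
}

/// Host side: build a tiny JSON event by hand (no escaping of `level_name`).
fn level_changed_json(level_id: u32, level_name: &str) -> String {
    format!(
        "{{\"event\":\"level_changed\",\"level_id\":{},\"level_name\":\"{}\"}}",
        level_id, level_name
    )
}

/// Host side: copy the JSON text into (simulated) guest memory at `ptr`.
fn write_json(memory: &mut [u8], ptr: u32, json: &str) -> Option<u64> {
    let start = ptr as usize;
    let end = start.checked_add(json.len())?;
    memory.get_mut(start..end)?.copy_from_slice(json.as_bytes());
    Some(pack(ptr, json.len() as u32))
}

/// Guest side: view the JSON text again from the packed pointer and length.
fn read_json(memory: &[u8], packed: u64) -> Option<&str> {
    let (ptr, len) = unpack(packed);
    let start = ptr as usize;
    let bytes = memory.get(start..start.checked_add(len as usize)?)?;
    std::str::from_utf8(bytes).ok()
}
```

This keeps the ABI down to "a string in, a string out" at the cost of serializing and parsing on every call.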

Note on ABIs: Some tools exist to automate parts of this process of wrapping and unwrapping data types at the boundaries. However, as far as I'm aware, they're still mostly experimental or language specific. For best results in the short term, we can just do it manually.

TODO: the rest of the doc

fossifousacid commented 9 months ago

Hi,

I've made a little proof-of-concept mod for scripting with Lua, although I absolutely believe that WASM is the right way to go in the future. I mainly wanted a way to quickly iterate on mods without needing to recompile each time. I can register the events I want to be able to handle in a script from C++ and use the FFI in LuaJIT to modify the game's state using the functions exposed through the C ABI.

I want to note that running mods asynchronously can make the barrier to development a lot higher, especially considering that most of the game's logic was designed to be run in a single-threaded context. Additionally, most of the benefits of asynchronous environments tend to come up in IO-bound operations. I suspect most of the heavy IO should be handled by the engine anyway.

Error handling and debugging capabilities are also important to consider. Debugging Lua or native code is easy enough, but I'm not sure how well developed the tools are for debugging WASM. https://rustwasm.github.io/docs/book/reference/debugging.html says that most of the available debugging tools are immature.

fossifousacid commented 9 months ago

Recap of some discussion in the discord scripting thread

Resource limiting is not a concern. If a mod OOMs, deadlocks or goes into an infinite loop... don't use that mod.
Be careful not to leak anything that can modify the user's file system or allow for arbitrary code execution.
SWIG would have been nice to use to generate APIs and ABIs directly from C++. It has significant bitrot now though.
Don't use C++ exceptions across the library boundary
- Backport std::expected?
  
Most mods should be able to operate in a single-threaded context. Most of the code mods that exist so far are not particularly demanding of the CPU. I'm not familiar with what some of the original ROM mods would have done. Most IO-bound operations are handled by the engine, such as networking, graphics and audio.
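
On the point above about keeping exceptions from crossing the library boundary, the usual alternative is to turn errors into plain values at the C ABI. A rough sketch of that pattern, written in Rust to match the guest example earlier in the thread, with a hypothetical `Status` enum and function names:

```rust
/// Hypothetical status codes shared by host and guest instead of exceptions.
#[repr(i32)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Status {
    Ok = 0,
    UnknownVariable = 1,
    NullPointer = 2,
}

/// Ordinary Rust logic using `Result` internally.
fn lookup_variable(id: u32) -> Result<i32, Status> {
    match id {
        0 => Ok(100), // pretend variable 0 is the player's health
        _ => Err(Status::UnknownVariable),
    }
}

/// C ABI wrapper: the error becomes a return code and the value goes through an out-pointer.
extern "C" fn get_variable_checked(id: u32, out: *mut i32) -> i32 {
    if out.is_null() {
        return Status::NullPointer as i32;
    }
    match lookup_variable(id) {
        Ok(value) => {
            // Safe to write: `out` was checked for null above; the caller must pass a valid pointer.
            unsafe { *out = value };
            Status::Ok as i32
        }
        Err(status) => status as i32,
    }
}
```

A backported std::expected on the C++ side would play the role that `Result` plays here, with the conversion to a status code happening only at the boundary.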

I think it's worthwhile to actually look at the available WASM VMs. I'm currently looking at WebAssembly Micro Runtime (WAMR).

As far as passing types between languages goes, https://flatbuffers.dev/ might be of interest. It has significantly less weight than protobuf.

HarbourMasters / Shipwright

WebAssembly scripting overview / proposal #3031
