gamercade-io / gamercade_console

A Neo-Retro Fantasy Console. Make WASM-powered, networked multiplayer games.
https://gamercade.io
Apache License 2.0

DATA PACKS: WASM Linear Memory, Rollback & Console Constraints #104

Closed: RobDavenport closed 2 months ago

RobDavenport commented 3 months ago

Hey all,

I started looking at the core runtime and execution of the console, specifically how we are handling rollback, linear memory, and how WASM executes in general. I've come to a bit of a crossroads on how to proceed, given these constraints:

  1. The console should be simple and easy to develop for, but still allow for deep optimization for those looking for it.
  2. The console needs to enforce limitations on the games, such as file size and maximum memory usage.
  3. WASM code can only access data within its own linear memory (and rightfully so) to prevent security issues.

Assets are already "separated" from the game, in the sense that they don't take up any space in the VM's linear memory; they exist in the host code. In fact, the actual data of the assets, frame buffer, sound effects, etc. isn't accessible from the WASM module at all.

The real problem arises from the following: any code or binary assets are stored in the VM's linear memory. This is a single chunk of data used for the actual compiled source code, the execution stack, and dynamic memory (heap, RAM). This means that a large .wasm file, whether it is just complex code or code with lots of embedded data, ends up taking up a lot of this memory. These bytes are then copied each time during a rollback.

Other WASM fantasy consoles (namely microw8 and WASM-4) seem to follow the approach of "everything in linear memory," where the host application pushes bits and bytes around for things like controller input and frame buffers, and reads those bytes to draw to the screen. This also more closely follows actual hardware, where specific regions of memory are reserved for certain things. The downside of this approach is that it requires a specific compiler/linking setup: setting flags for max stack size, where to point the __heap_base, using imported memory, etc. This is the most "ideal" since it enforces rules 2 and 3 above, but it significantly increases the difficulty for the developer, who needs to understand all of those flags. In my tech demo 3d_next, I started running into rollbacks taking ~16ms or more when my memory usage was around 35 MB. I'm not sure if this is a relevant performance issue or not, since this memory will be constrained in the future anyway.

Currently, the ROM file is limited to 16 MB after compression. This isn't a hard rule enforced by the console itself, but it will be enforced on the platform side when it's released.

So what is this all about, anyway?

Data Packs would provide an additional binary blob in the ROM for the game code to use, but they shouldn't take up space in linear memory. They should be read-only and relatively high performance. This would allow developers to include whatever additional data they want along with their project. Examples include bundles of assets like text, art, or levels which need to be accessed through code, similar to .wad files from DOOM. Because these are just bytes, they can represent anything the developer needs. An entire asset library could exist as a single datapack, or the game engine could exist as WASM code while the levels are separate datapacks.

My solution currently is as follows:

103 has my first implementation of this.

Due to rule 3 above, it's impossible for the wasm module to ever access data outside of its own linear memory. Therefore, the datapack must be copied into linear memory. This is done via a datapack(len: i32) function, which is called before init(), and this function would return a pointer to a chunk of memory of length len bytes for the host application to copy those bytes into.
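To make the handshake above concrete, here is a minimal guest-side sketch of what such an export might look like in Rust. Only the datapack(len: i32) shape comes from the description above; the statics, the data_pack() accessor, and the choice to leak the allocation are assumptions made for illustration, and a real module would also need to export the function by name (for example with a no_mangle attribute).

```rust
use std::sync::atomic::{AtomicPtr, AtomicUsize, Ordering};

// Where the host-written data pack lives once `datapack` has been called.
// (Hypothetical bookkeeping, not part of the console API.)
static PACK_PTR: AtomicPtr<u8> = AtomicPtr::new(std::ptr::null_mut());
static PACK_LEN: AtomicUsize = AtomicUsize::new(0);

/// Called by the host before `init()`. Reserves `len` bytes inside linear
/// memory and returns a pointer the host can copy the data pack into.
/// In a real module this would be exported under the name `datapack`.
pub extern "C" fn datapack(len: i32) -> *mut u8 {
    // Leak the allocation on purpose: the pack must stay valid for the
    // whole run, and the module never frees it.
    let buf: &'static mut [u8] =
        Box::leak(vec![0u8; len.max(0) as usize].into_boxed_slice());
    let ptr = buf.as_mut_ptr();
    PACK_LEN.store(buf.len(), Ordering::Relaxed);
    PACK_PTR.store(ptr, Ordering::Relaxed);
    ptr
}

/// Hypothetical accessor: a read-only view of the pack for game code.
/// Returns an empty slice if the host never called `datapack`.
pub fn data_pack() -> &'static [u8] {
    let ptr = PACK_PTR.load(Ordering::Relaxed);
    if ptr.is_null() {
        return &[];
    }
    // SAFETY: ptr and len describe the leaked allocation above, which is
    // never freed or resized.
    unsafe { std::slice::from_raw_parts(ptr, PACK_LEN.load(Ordering::Relaxed)) }
}
```

Handing game code only a shared slice keeps accidental writes out of safe Rust, but nothing stops unsafe code or another language from writing to the region.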

Rule 2, about limitations, is still somewhat enforced. The maximum size available to the game can be a function of the actual linear memory chunk, plus a few extra pages as needed to store the data pack. This memory needs to be handled correctly and safely by the module (usually by just keeping a pointer around or leaking the data). And since this memory is initialized and written by the host application, the host is able to skip this region during rollback.
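As a rough illustration of skipping the data pack region during rollback, the host-side save and restore could look something like the sketch below. This is not the console's actual rollback code: linear memory is modelled as a plain byte slice, and Snapshot, save, and restore are hypothetical names.

```rust
use std::ops::Range;

/// Rollback state that omits one host-owned region (the data pack).
pub struct Snapshot {
    before: Vec<u8>,    // bytes in [0, skip.start)
    after: Vec<u8>,     // bytes in [skip.end, memory.len())
    skip: Range<usize>, // region written once by the host, never saved
}

/// Copy everything except `skip` out of linear memory.
pub fn save(memory: &[u8], skip: Range<usize>) -> Snapshot {
    Snapshot {
        before: memory[..skip.start].to_vec(),
        after: memory[skip.end..].to_vec(),
        skip,
    }
}

/// Write the saved bytes back, leaving the skipped region untouched.
/// Assumes the memory has the same length as when `save` was called;
/// `copy_from_slice` panics otherwise.
pub fn restore(memory: &mut [u8], snap: &Snapshot) {
    memory[..snap.skip.start].copy_from_slice(&snap.before);
    memory[snap.skip.end..].copy_from_slice(&snap.after);
}
```

The saving is simply the size of the skipped range per snapshot, which is also why any write the game makes into that range survives a rollback.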

And finally, rule 1. Data packs are specifically opt-in. The datapack function isn't required, and I don't see it as a necessity for the majority of projects. It would be an alternative for those who prefer working with code-based assets or are trying to squeeze as much as possible out of the system. Theoretically, people could use this datapack region of memory to hack around the memory limits (if they are not a function of console limits + datapack size), or manipulate those datapack bytes to hold differing states (because that region isn't rolled back).

However, this does open up the possibility that this memory could be corrupted, since there's no way for the WASM module or host app to make it read-only. It could also create desync issues if the region is written to, since it is never supposed to change.

But I would love to hear any other ideas or solutions for this. It would be great if we could pass a read-only buffer between WASM and the host program, but it doesn't seem like that is possible at the moment. And speaking of limitations, do we have a consensus on what a good maximum memory size would be for Gamercade? As mentioned, I ran into performance issues when trying to roll back ~35 MB at once. The demo sits at a comfortable ~6.5 MB now and works great. I'm thinking of limiting the system to either 8 MB, 16 MB, or 32 MB of memory to work with.
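For reference, WASM linear memory is sized in 64 KiB pages, so the candidate limits translate into page counts as in the sketch below. It assumes the "mb" figures mean MiB and that the cap would be "console limit plus whole pages for the data pack", as suggested earlier; pages_for and max_pages are hypothetical helpers.

```rust
/// Size of one WebAssembly linear memory page (64 KiB).
pub const WASM_PAGE_SIZE: usize = 64 * 1024;

/// Number of whole pages needed to hold `bytes`, rounding up.
pub const fn pages_for(bytes: usize) -> usize {
    (bytes + WASM_PAGE_SIZE - 1) / WASM_PAGE_SIZE
}

/// Hypothetical cap: the console limit plus extra pages for the data pack.
pub const fn max_pages(console_limit: usize, data_pack_len: usize) -> usize {
    pages_for(console_limit) + pages_for(data_pack_len)
}
```

Under those assumptions, 8, 16, and 32 MiB come out to 128, 256, and 512 pages respectively.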

I wonder if it's also worth exploring something like a memory region API, where the module can declare specific regions of memory which shouldn't be rolled back. This is already possible with "larger than needed" datapacks, but declaring specific regions may make things simpler. Use cases for this could be scratch space which doesn't need to persist between frames (like a large per-frame bump allocator, or a manually managed frame buffer). This could really improve performance, at the cost of higher complexity for those who want it.
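One way such a region API might be tracked on the host side is sketched below. Nothing like this exists in the console as far as this thread shows: NoRollback, declare, and tracked are hypothetical names, and the sketch only computes which byte ranges would still need saving.

```rust
use std::ops::Range;

/// Hypothetical registry of regions the module declared as "do not roll back".
#[derive(Default)]
pub struct NoRollback {
    regions: Vec<Range<usize>>, // kept sorted by start offset
}

impl NoRollback {
    /// Declare a region of linear memory that rollback should ignore.
    /// Empty ranges are dropped.
    pub fn declare(&mut self, region: Range<usize>) {
        if region.start < region.end {
            self.regions.push(region);
            self.regions.sort_by_key(|r| r.start);
        }
    }

    /// Ranges of `0..mem_len` that still need to be saved: the gaps
    /// between declared regions, with overlapping regions merged.
    pub fn tracked(&self, mem_len: usize) -> Vec<Range<usize>> {
        let mut out = Vec::new();
        let mut cursor = 0;
        for r in &self.regions {
            let start = r.start.min(mem_len);
            let end = r.end.min(mem_len);
            if start > cursor {
                out.push(cursor..start);
            }
            cursor = cursor.max(end);
        }
        if cursor < mem_len {
            out.push(cursor..mem_len);
        }
        out
    }
}
```

A per-frame scratch buffer declared this way would simply drop out of the tracked ranges, at the cost of the module having to guarantee it never relies on that memory across frames.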

Alternatively, we could write our own implementation of a minimal WASM VM, giving us full control over these features. This may allow us to separate the code from the mutable state, and in that way pass read-only memory to the VM. Of course, this would open up a whole host of issues when it comes to performance, security, and maintenance.

RobDavenport commented 3 months ago

So I took another look at wasmtime's modules, and actually found a way to identify the size of the code module (in bytes) prior to instantiation! This means we can ignore that region of memory during rollback, in addition to skipping the datapack region. Using Module::image_range, we can learn the size of the module before it is instantiated.

This is a huge win because it opens up the possibility of embedding assets into the binary itself, while also minimizing the amount of state rolled back per frame.

RobDavenport commented 2 months ago

Closing this as #98 adds functionality for the datapack api.