zksecurity / wasmati

Write low-level WebAssembly, from JavaScript
MIT License

Generating WASM for TIC-80 #7

Open ion1 opened 3 months ago

ion1 commented 3 months ago

Thanks for the project!

I'm trying to generate WASM for TIC-80. I have encountered the following issues while trying to generate a hello world using wasmati:

Apologies if I have missed how to do these things correctly.

A hello world in WAT

```wat
(module
  (import "env" "memory" (memory 4 4))
  (import "env" "cls" (func $cls (param i32)))
  (import "env" "print"
    (func $print (param i32 i32 i32 i32 i32 i32 i32) (result i32)))
  (export "TIC" (func $TIC))
  (data (i32.const 0x18000) "Hello world!\00")
  (func $TIC
    (call $cls (i32.const 13))
    (drop
      (call $print
        (i32.const 0x18000)
        (i32.const 84)
        (i32.const 61)
        (i32.const 15)
        (i32.const 0)
        (i32.const 1)
        (i32.const 0)))
  )
)
```

My closest attempt at implementing it using wasmati

```ts
import {
  call,
  drop,
  func,
  i32,
  importFunc,
  importMemory,
  Module,
} from "wasmati";
import { writeFileSync } from "fs";

function main() {
  const memory = importMemory({ min: 4, max: 4 });

  const cls = importFunc({ in: [i32], out: [] }, () => {});

  const print = importFunc(
    { in: [i32, i32, i32, i32, i32, i32, i32], out: [i32] },
    () => {}
  );

  const TIC = func({ in: [], out: [], locals: [] }, ([], [], _ctx) => {
    call(cls, [13]);
    call(print, [0, 84, 61, 15, 0, 1, 0]);
    drop();
  });

  const module = Module({
    exports: { TIC },
    memory,
  });

  const wasm = module.toBytes();
  writeFileSync("dist/cart.wasm", wasm);
}

main();
```

The output from the above

```wat
(module
 (type $0 (func (param i32)))
 (type $1 (func (param i32 i32 i32 i32 i32 i32 i32) (result i32)))
 (type $2 (func))
 (import "" "m0" (memory $mimport$0 4 4))
 (import "" "f0" (func $fimport$0 (param i32)))
 (import "" "f1" (func $fimport$1 (param i32 i32 i32 i32 i32 i32 i32) (result i32)))
 (export "TIC" (func $0))
 (func $0
  (call $fimport$0
   (i32.const 13)
  )
  (drop
   (call $fimport$1
    (i32.const 0)
    (i32.const 84)
    (i32.const 61)
    (i32.const 15)
    (i32.const 0)
    (i32.const 1)
    (i32.const 0)
   )
  )
 )
)
```

mitschabaude commented 3 months ago

Hey @ion1! Happy to help you :)

Declaring a data section

For declaring a data section, wasmati has the data() constructor. Here's how you can use it to do the same as in your WAT hello world:

import { data, Const } from "wasmati";

let enc = new TextEncoder();

data({ memory, offset: Const.i32(0x18000) }, enc.encode("Hello world!\x00"));

Note: the memory argument is needed to connect the data section to this particular memory and implicitly make it a dependency of the module (wasmati wouldn't add it to the module if it's not a dependency)

Note to self: It's a bit annoying that we have to use Const.i32 instead of just a number for the offset, I'm going to look into changing that. Also, wondering if I could optionally allow a string instead of bytes as content, and utf8-encode it internally.
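
Until something like that exists, the encoding step can be wrapped in a small helper. The following is only a sketch: `cString` is a hypothetical name and not part of wasmati, and it relies only on the standard `TextEncoder`. It UTF-8 encodes a string and appends the NUL terminator, so the result can be passed as the bytes argument to data().

```typescript
// Hypothetical helper (not part of wasmati): UTF-8 encode a string and
// append a NUL terminator, as expected by C-style string consumers.
function cString(text: string): Uint8Array {
  const encoded = new TextEncoder().encode(text);
  // A new Uint8Array is zero-filled, so the extra last byte is already NUL.
  const bytes = new Uint8Array(encoded.length + 1);
  bytes.set(encoded, 0);
  return bytes;
}

const helloBytes = cString("Hello world!");
console.log(helloBytes.length); // 12 text bytes plus the terminator
```

Non-ASCII characters take more than one byte in UTF-8, so the byte length should be taken from the returned array rather than from the string length.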

Custom import specifiers

The ask about specifying the import names is interesting; I didn't foresee that this would ever be needed.

Note that if you're writing the entire Wasm module with wasmati, wasmati will allow you to call all importFunctions in other functions, and it will correctly resolve the import.

I thought that only this "internal consistency" was needed, so I'd be curious to learn more about the TIC-80 requirements you mention. How is that Wasm module going to be integrated or combined with other Wasm code, such that the import name makes a difference? Or maybe it's just about what the import object looks like when instantiating the Wasm?

Anyway, I did model import paths as part of the type of any imported object in wasmati. I didn't expose a direct way of setting paths in the constructor, but it's possible right now to set your own import paths manually as follows:

import { Dependency, importMemory, importFunc, i32 } from "wasmati";

// helper function to set import paths
function setImportPath(
  input: Dependency.AnyImport,
  module: string,
  path: string
) {
  input.module = module;
  input.string = path;
}

const memory = importMemory({ min: 4, max: 4 });
setImportPath(memory, "env", "memory");

const cls = importFunc({ in: [i32], out: [] }, () => {});
setImportPath(cls, "env", "cls");

const print = importFunc(
  { in: [i32, i32, i32, i32, i32, i32, i32], out: [i32] },
  () => {}
);
setImportPath(print, "env", "print");

Conclusion

With those two changes I get a Wasm module that looks equivalent to your hello world example. Let me know if you have more questions!

(module
  (type (;0;) (func (param i32)))
  (type (;1;) (func (param i32 i32 i32 i32 i32 i32 i32) (result i32)))
  (type (;2;) (func))
  (import "env" "cls" (func (;0;) (type 0)))
  (import "env" "print" (func (;1;) (type 1)))
  (import "env" "memory" (memory (;0;) 4 4))
  (func (;2;) (type 2)
    i32.const 13
    call 0
    i32.const 0
    i32.const 84
    i32.const 61
    i32.const 15
    i32.const 0
    i32.const 1
    i32.const 0
    call 1
    drop)
  (export "TIC" (func 2))
  (data (;0;) (i32.const 98304) "Hello world!\00"))
ion1 commented 3 months ago

Thank you! I got it to work.

I found two ways to add the data value as a dependency to the function which uses it:

const foo = func(/*...*/); foo.deps.push(bar);
const foo = func({ /*...*/ }, ([], [], ctx) => { ctx.deps.push(bar); });

Is one of them preferred over the other?

The ask about specifying the import names is interesting and I didn't foresee that this would be ever needed.

Note that if you're writing the entire Wasm module with wasmati, wasmati will allow you to call all importFunctions in other functions, and it will correctly resolve the import.

I thought that only this "internal consistency" is needed, so I'd be curious to learn more about the requirements of TIC-80 you talk about. How is that wasm module going to be integrated/combined with other wasm code, such that the import name makes a difference? Or maybe it's just about how the import object looks like when instantiating the wasm?

TIC-80 only ever loads a single Wasm module at a time. The named imports are just for TIC-80 to provide the RAM and a number of system functions (such as print) for the Wasm code to call.
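
To illustrate why the names matter to a host, here is a minimal sketch using only the standard WebAssembly JavaScript API, not TIC-80's actual loader. The bytes are a hand-assembled module equivalent to `(module (import "env" "memory" (memory 4 4)))`, and `canInstantiate` is a hypothetical helper. Instantiation only succeeds when the import object provides the value under the exact module and field names the module declares.

```typescript
// Hand-assembled module: (module (import "env" "memory" (memory 4 4)))
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
  0x01, 0x00, 0x00, 0x00, // version 1
  0x02, 0x10, // import section, 16 bytes of content
  0x01, // one import
  0x03, 0x65, 0x6e, 0x76, // module name "env"
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, // field name "memory"
  0x02, // import kind: memory
  0x01, 0x04, 0x04, // limits: has max, min 4 pages, max 4 pages
]);

const wasmModule = new WebAssembly.Module(wasmBytes);

// Lists the names the host has to provide.
const declaredImports = WebAssembly.Module.imports(wasmModule);
console.log(declaredImports);

// Hypothetical helper: reports whether instantiation succeeds.
function canInstantiate(importObject: WebAssembly.Imports): boolean {
  try {
    new WebAssembly.Instance(wasmModule, importObject);
    return true;
  } catch {
    return false;
  }
}

const hostMemory = new WebAssembly.Memory({ initial: 4, maximum: 4 });
const okWithEnvNames = canInstantiate({ env: { memory: hostMemory } });
const okWithDefaultNames = canInstantiate({ "": { m0: hostMemory } });
console.log(okWithEnvNames, okWithDefaultNames);
```

A host that builds its import object under fixed names, as described above for TIC-80, would be in the position of the first call, which is why the default `""` / `m0` names do not link.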

Here's TIC-80's definition of the print function to be provided, and here's it being linked to the Wasm runtime.

PS. This is a small thing, but it would be nice to be able to do drop(call(/*...*/)); instead of call(/*...*/); drop(/*...*/);.

I'll just dump my current code in here in case someone reading this later finds it helpful.

```ts
import {
  call,
  Const,
  data,
  Dependency,
  drop,
  func,
  i32,
  importFunc,
  importMemory,
  Module,
} from "wasmati";
import binaryen from "binaryen";
import { writeFileSync } from "fs";

function main() {
  const memory = importMemory({ min: 4, max: 4 });
  setImportPath(memory, "env", "memory");

  const enc = new TextEncoder();

  const helloWorldOffset = 0x18000;
  const helloWorld = data(
    { memory, offset: Const.i32(helloWorldOffset) },
    enc.encode("Hello world!\x00")
  );

  const endOfRamText = enc.encode("The end of RAM\x00");
  const endOfRamOffset = 0x40000 - endOfRamText.length;
  const endOfRam = data(
    { memory, offset: Const.i32(endOfRamOffset) },
    endOfRamText
  );

  const cls = importFunc({ in: [i32], out: [] }, () => {});
  setImportPath(cls, "env", "cls");

  const print = importFunc(
    { in: [i32, i32, i32, i32, i32, i32, i32], out: [i32] },
    () => {}
  );
  setImportPath(print, "env", "print");

  const TIC = func({ in: [], out: [], locals: [] }, ([], [], _ctx) => {
    call(cls, [13]);
    call(print, [helloWorldOffset, 84, 61, 15, 0, 1, 0]);
    drop();
    call(print, [endOfRamOffset, 79, 68, 14, 0, 1, 0]);
    drop();
  });
  TIC.deps.push(helloWorld);
  TIC.deps.push(endOfRam);

  const module = Module({
    exports: { TIC },
    memory,
  });

  const wasmUnopt = module.toBytes();

  const moduleB = binaryen.readBinary(module.toBytes());
  moduleB.optimize();
  const wasm = moduleB.emitBinary();

  console.info(`Unoptimized: ${wasmUnopt.length}`);
  console.info(`Optimized: ${wasm.length}`);

  writeFileSync("build/cart-unopt.wasm", wasmUnopt);
  writeFileSync("build/cart.wasm", wasm);
}

function setImportPath(
  input: Dependency.AnyImport,
  module: string,
  path: string
) {
  input.module = module;
  input.string = path;
}

main();
```

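
The `0x40000 - endOfRamText.length` arithmetic in the code above can be made explicit with a small helper. This is a sketch only: `offsetAtEndOfMemory` is a hypothetical name, and it assumes the standard 64 KiB WebAssembly page size together with the 4-page memory declared above.

```typescript
const WASM_PAGE_SIZE = 65536; // bytes per WebAssembly page (64 KiB)

// Hypothetical helper: offset at which a blob of `byteLength` bytes must
// start so that it ends exactly at the last byte of a `pages`-page memory.
function offsetAtEndOfMemory(pages: number, byteLength: number): number {
  const memoryBytes = pages * WASM_PAGE_SIZE;
  if (byteLength > memoryBytes) {
    throw new RangeError("data does not fit into memory");
  }
  return memoryBytes - byteLength;
}

const endText = new TextEncoder().encode("The end of RAM\x00");
const endOffset = offsetAtEndOfMemory(4, endText.length);
console.log(endOffset.toString(16));
```

With 4 pages the memory spans 0x40000 bytes, so the last byte of the string (its NUL terminator) lands on address 0x3FFFF.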
mitschabaude commented 3 months ago

I found two ways to add the data value as a dependency to the function which uses it:

const foo = func(/*...*/); foo.deps.push(bar);
const foo = func({ /*...*/ }, ([], [], ctx) => { ctx.deps.push(bar); });

Is one of them preferred over the other?

Out of these I would clearly prefer the first one! The callback is supposed to contain just the function body.

I just want to mention, though, that you don't usually need explicit dependency pushing 😅 When you call one function in another, it becomes a dependency automatically. Data is a special case because the dependency on it is not encoded in any instruction (i.e., we read from memory and implicitly expect that the data is there). That's why I decided to have data declare itself as a dependency of a memory if you use the data() constructor with a memory argument. This should also be sufficient in your example, so I believe you can remove the TIC.deps.push() calls.

PS. This is a small thing, but it would be nice to be able to do drop(call(/*...*/));

Good feedback! There are a couple of places where "stack arguments as input" is not implemented yet, and it's always nice to be able to do that.

ion1 commented 3 months ago

importFunc doesn't seem to take module/name parameters, and it takes a JavaScript function parameter which I can't provide.

On second thought, taking that JavaScript function might actually be really useful: I could mock the TIC-80 env functions my Wasm code uses and run a test suite outside TIC-80.

Is there a way to provide a custom env.memory to module.instantiate() so tests can inspect it? The mocked functions would also need to access it.

Perhaps it could look something like:

const mockMemory = new WebAssembly.Memory({ initial: 4, maximum: 4 });
const memory = importMemory(
  { module: "env", path: "memory", min: 4, max: 4 },
  mockMemory,
);
mitschabaude commented 3 months ago

Is there a way to provide a custom env.memory to module.instantiate() so tests can inspect it? The mocked functions would also need to access it.

Perhaps it could look something like:

That's already what importMemory does :D

mitschabaude commented 3 months ago

In general, wasmati is already optimized for the flow where you instantiate the Wasm and use it directly in the same JS process. That's why all import objects have their JS import attached right away.

ion1 commented 3 months ago

That's already what importMemory does :D

Ah, how silly of me. :-D As I didn't use that code path earlier, I failed to consider that it may already exist.

I see that I can let importMemory construct the WebAssembly.Memory for me and then access it through memory.value.
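
For the test-suite idea, a mocked print could read its text argument straight out of that memory. The sketch below uses only the standard `WebAssembly.Memory`, `TextEncoder` and `TextDecoder` APIs. `mockPrint` and `printed` are hypothetical names, the return value is a placeholder rather than what TIC-80 returns, and the bytes are written by hand here where a real test would let the module's data section do it.

```typescript
// Stand-in for the memory that importMemory would create or receive.
const mockMemory = new WebAssembly.Memory({ initial: 4, maximum: 4 });

// Collects everything the mocked print was asked to draw.
const printed: string[] = [];

// Hypothetical mock of the "print" import: reads a NUL-terminated UTF-8
// string starting at textPtr and records it. The other arguments are ignored.
function mockPrint(textPtr: number, ..._rest: number[]): number {
  const ram = new Uint8Array(mockMemory.buffer);
  let end = textPtr;
  while (end < ram.length && ram[end] !== 0) end++;
  printed.push(new TextDecoder().decode(ram.subarray(textPtr, end)));
  return 0; // placeholder return value
}

// Simulate the data section by writing the string at 0x18000.
new Uint8Array(mockMemory.buffer).set(
  new TextEncoder().encode("Hello world!\x00"),
  0x18000
);

mockPrint(0x18000, 84, 61, 15, 0, 1, 0);
console.log(printed);
```

Because the mock closes over the same memory object the module would import, a test can inspect both the recorded calls and the raw bytes after running the exported function.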

It would be nice if the function parameter to importFunc were optional, in the same way the memory parameter to importMemory is, for cases where I only want to generate Wasm for something else to run.