RyanLamansky / dotnet-webassembly

Create, read, modify, write and execute WebAssembly (WASM) files from .NET-based applications.
Apache License 2.0

Merger with WebAssembly.NET #3

Closed by Bablakeluke 7 years ago

Bablakeluke commented 7 years ago

Hey Ryan - WebAssembly.NET is a mostly complete (but lots of testing is required!) implementation of what this project is attempting to achieve. It uses unmanaged memory for its heap and has all opcodes except for grow_memory. In short, that's because it's designed to trigger a range of CLR optimisations such that heap accesses end up being direct addressing into the initial block of memory; that makes it fast, but at the expense of being tied to a specific block of memory. Interested to see if we can do something together to avoid duplication!

RyanLamansky commented 7 years ago

Hi Luke 🙂

There are now three .NET WebAssembly implementations I'm aware of (mine, yours, and cs-wasm by @jonathanvdc). And I'm amazed at the different architectures we used!

I think we can learn from each other, but I don't think our projects are good candidates for a merger: my project targets .NET Standard, which doesn't support the Save part of AssemblyBuilderAccess.RunAndSave. I'm building around AssemblyBuilderAccess.RunAndCollect, which allows in-memory use only (the loaded WASM's exports can be accessed via the C# dynamic keyword or an abstract class which it will implement). I've considered restructuring the compiler so that this choice becomes accessible to a .NET Classic build of my code, though.

Regarding the opcodes, the only challenging ones I have left are call and call_indirect, which will require me to re-work how I had planned to implement calls: I was using "this" as the instance reference, but I will have to move it to the last parameter and implement callable functions as static methods. I'm planning to deal with that this weekend. Everything else is either easy or done.

Not implementing grow_memory definitely allows for some interesting optimizations (such as treating the base address as a compile-time constant), and I may do something similar in the future, but my initial goal is 100% compliance with the standard. In your implementation, you could fake it by allocating max memory at the start and treating grow_memory as a no-op 🤔
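As a rough illustration of the "allocate max memory at the start" idea, here is a minimal Python sketch (not code from either project). The class name `PreallocatedMemory` and its fields are hypothetical. It assumes the module declares a trustworthy maximum, and it tracks a page counter so that grow_memory can still return the previous size, which makes it close to a no-op rather than a literal one.

```python
PAGE_SIZE = 65536  # WebAssembly page size in bytes


class PreallocatedMemory:
    """Hypothetical linear memory that reserves its declared maximum up front."""

    def __init__(self, initial_pages, maximum_pages):
        if initial_pages > maximum_pages:
            raise ValueError("initial size exceeds maximum")
        # The backing buffer is allocated once and never resized or moved,
        # so its base address could be treated as a constant by a compiler.
        self.buffer = bytearray(maximum_pages * PAGE_SIZE)
        self.maximum_pages = maximum_pages
        self.current_pages = initial_pages

    def grow(self, delta_pages):
        """Return the previous size in pages, or -1 if the request cannot be met."""
        previous = self.current_pages
        if delta_pages < 0 or previous + delta_pages > self.maximum_pages:
            return -1
        # No allocation happens here: only the bookkeeping changes.
        self.current_pages = previous + delta_pages
        return previous
```

The cost of this approach is that the full maximum is reserved even if the module never grows, which matters when a module declares a very large maximum.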

I have dozens of tests, including at least one for each implemented opcode. I expect to have between 200 and 300 by the v1.0 milestone. Maybe more if I write tests for all the potential validation errors, too.

Bablakeluke commented 7 years ago

Ah that's interesting! WebAssembly.NET is primarily targeted at Mono which supports AssemblyBuilderAccess.RunAndSave on Unix too - I'm working on a separate .NET Core project but I doubt I'll use it by default until it becomes more feature complete; very brave!

As for call and calli, the route I took fortunately involved static methods from the start - for shared global values within an assembly (such as the current module, references to memory, etc.), it just generates a set of static fields and holds them there. It's built as part of a larger JavaScript engine where that same pattern occurs - i.e. lots of JavaScript functions require references to the script engine they're running in, but due to delegation of things like addEventListener they, in short, have to be static - so I essentially just lifted that same pattern and got a little lucky!

The biggest nightmare for me was dealing with the stack order for the set opcodes, specifically the offset immediate - the very first iteration of WebAssembly.NET didn't actually generate an AST - it was just a straight opcode to .NET opcode transpiler - as the offset needs to be added to the address, that caused a widespread rewrite. The offset immediate is the immediate that could, so if you haven't handled it just yet, watch out for that one!

My current plan with grow_memory is to essentially turn off the optimisation if the grow_memory opcode is present (with an optional warning to state that happened) - I have encountered wasm files which set the max value ridiculously high so unfortunately those min/max values are unreliable!

I need to write lots of tests at some point; my goal is to use the combination of WebAssembly.NET and the JavaScript engine to run webkit.js - currently the asm to wasm toolchain itself fails with that one (it is, after all, an enormous lump of asm.js with heavy imports) - if it can run that, it can run anything!

RyanLamansky commented 7 years ago

I'm not currently generating an AST, just straight WASM-to-IL mapping. Store instructions are indeed complicated by the offset parameter, but I'm planning to work around this by converting them to dynamically-generated function calls where the arguments can be re-ordered and the constant offset added. A function call was already necessary to perform the range check + potential exception throw (doing this inline bloated the generated IL a lot), so this is a minor extension of that concept. The CLR should inline this call, given that it will be tiny and simple, so performance shouldn't be impacted...
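To make the helper-call idea concrete, here is a small Python sketch rather than generated IL. The names `store_i32` and `MemoryAccessError` are hypothetical, and the argument order is an assumption: the address and value are already on the stack in that order, so the constant offset is passed last and the helper does the addition and the range check.

```python
import struct


class MemoryAccessError(Exception):
    """Hypothetical stand-in for the exception thrown on an out-of-range access."""


def store_i32(memory, address, value, offset):
    """Write a 32-bit little-endian value at address + offset after a range check."""
    effective = address + offset
    # Range check kept inside the helper so it is not repeated inline at every store.
    if address < 0 or offset < 0 or effective + 4 > len(memory):
        raise MemoryAccessError(f"4-byte store at {effective} is out of range")
    # Mask to 32 bits so negative values are stored in two's-complement form.
    struct.pack_into("<I", memory, effective, value & 0xFFFFFFFF)
```

A compiler following this pattern would emit the operands as usual, push the offset immediate as a constant, and then call the helper, so no operand reordering is needed at the call site.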

If I can make it to v1.0 without an AST, I should be able to deliver low-latency streaming compilation soon after. With a fast CPU (or slow I/O), compilation should be done only a few milliseconds after it finishes reading the WASM. The architecture is already leaning toward this goal in significant ways, such as the function body parser being an iterator instead of a buffer.
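The iterator-versus-buffer distinction can be sketched in a few lines of Python (an illustration only, not the project's parser). The hypothetical generator `iter_uleb128` decodes unsigned LEB128 integers, the variable-length encoding used throughout the WASM binary format, and yields each value as soon as its last byte arrives instead of waiting for the whole input.

```python
def iter_uleb128(byte_source):
    """Lazily yield unsigned LEB128 values from any iterable of byte values."""
    result = 0
    shift = 0
    pending = False  # True while in the middle of a multi-byte value
    for byte in byte_source:
        result |= (byte & 0x7F) << shift
        if byte & 0x80:
            # Continuation bit set: more bytes belong to this value.
            shift += 7
            pending = True
        else:
            # Final byte of the value: hand it to the consumer immediately.
            yield result
            result, shift, pending = 0, 0, False
    if pending:
        raise EOFError("input ended inside a LEB128 value")
```

Because the source can be any iterable, the same decoder works over a network stream or a file read in chunks, which is what lets downstream compilation start before the input has finished arriving.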

Running webkit.js is definitely an ambitious goal! I would caution that being able to run it doesn't assure you of being able to run anything... a very large WASM can still use a comparatively small set of opcodes; the old AngryBots13.wasm only used something like 96 of the 170+ opcodes, despite being a 10 MiB Unity WebGL game, and this gap is likely to grow even larger as things like threads, garbage collection, and SIMD are added to WebAssembly.

RyanLamansky commented 7 years ago

Feel free to open a new issue if you have any other questions or comments 🙂