AssemblyScript / assemblyscript

A TypeScript-like language for WebAssembly.
https://www.assemblyscript.org
Apache License 2.0

Cyclic import dependencies #1333

Open sunfishcode opened 4 years ago

sunfishcode commented 4 years ago

Several folks have been looking at running AssemblyScript programs in Wasmtime. Single-module programs work great, but multiple-module programs, and programs which call into host APIs that need to call back into the AssemblyScript runtime, currently depend on cyclic imports, which are not supported in the wasm spec.

There are two main techniques for avoiding such cycles:

One example of this is mentioned in this comment in an earlier issue.

dcodeIO commented 4 years ago

This now makes me wonder if the spec is rather missing an important edge case there, requiring such workarounds in the first place, especially since what our runtime does isn't fancy magic but comparable to exporting malloc and free from each module, allowing other modules to call these.

I remember that you mentioned that some engines can deal with this?

MaxGraey commented 4 years ago

I guess another strategy is to detect all cyclic per-function/class dependencies between modules A and B and move all of this cross-dependent code to a common module C, which is imported by both A and B: A -> C; B -> C. But for small modules with direct cyclic dependencies, it may be better to merge A and B into one module, if possible.

sunfishcode commented 4 years ago

It's not an oversight in the spec; I unfortunately can't find a link to any discussions offhand, but it has been discussed. Among other things, this is why the Module Linking proposal has a section on breaking cycles using call_indirect.

The Module Linking proposal's explanation of how dynamic linking works also has a libc module which, in addition to promoting code sharing/caching for the runtime code, avoids cyclic dependencies on malloc and free.

dcodeIO commented 4 years ago

But can't engines utilize the concepts above under the hood to make things work? For instance, the DAG is known at link time with wasmtime being the linker in this case, but the AS compiler doesn't really know where a module will be located in a DAG. Pretty much analogous to node modules being standalone. In fact, without linking support in Binaryen, we can't even do any linking on our end. One could go as far as to say that the AS compiler isn't a linker at all, and the next tool up the chain should be responsible.
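To make the link-time view concrete, here is a minimal sketch of what a linker-level tool could do with the import graph: compute an instantiation order when the graph is a DAG, and report the cycle otherwise. The `ImportGraph` type and `instantiationOrder` function are hypothetical names for illustration only; this is not how Wasmtime or any AssemblyScript tool is known to work.

```typescript
// Hypothetical module graph: each key imports from the modules listed in its value.
type ImportGraph = Record<string, string[]>;

// Returns an instantiation order (dependencies first), or throws naming a cycle.
function instantiationOrder(graph: ImportGraph): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  function visit(name: string, path: string[]): void {
    const s = state.get(name);
    if (s === "done") return;
    if (s === "visiting") {
      // We came back to a module that is still being visited: that is a cycle.
      const cycle = path.slice(path.indexOf(name)).concat(name);
      throw new Error("cyclic imports: " + cycle.join(" -> "));
    }
    state.set(name, "visiting");
    for (const dep of graph[name] ?? []) {
      visit(dep, path.concat(name));
    }
    state.set(name, "done");
    order.push(name);
  }

  for (const name of Object.keys(graph)) visit(name, []);
  return order;
}

// A and B both import a shared module C: a DAG, so an order exists.
const dagOrder = instantiationOrder({ A: ["C"], B: ["C"], C: [] });

// A and B import each other: no valid instantiation order.
let cycleMessage = "";
try {
  instantiationOrder({ A: ["B"], B: ["A"] });
} catch (e) {
  cycleMessage = (e as Error).message;
}
```

The first graph matches the "A -> C; B -> C" layout suggested earlier in the thread; the second is the kind of graph that cannot be instantiated without one of the workarounds discussed here.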

dcodeIO commented 4 years ago

Perhaps it makes sense to elaborate on the proposed solutions so far in perspective:

Splitting out the runtime or otherwise splitting modules would require support for shared memory and perhaps (not sure) synchronization primitives from the threads proposal, both not being widely supported at this point.

Identifying a DAG is something the AS compiler cannot do, both due to Binaryen limitations and conceptually, since it emits standalone modules. This seems more appropriate at the linker level.

Converting runtime exports to function indexes and calling them indirectly can be done, but requires exporting the table for use with the loader, so we either can't utilize Binaryen's directize optimizations anymore or need to do this as a post-processing step. This looks like something an engine can do a better job at, transparently. If there's a lot of resistance to doing this at the engine level, the most sensible thing we can do is make ALL runtime hooks function index exports and call them indirectly, not just some, because, as mentioned above, we don't know the DAG.
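The indirect-call option can be pictured with a plain TypeScript model. This is only a sketch under loud assumptions: the array `funcTable` stands in for a shared function table, and `ALLOC_SLOT`, `makeHost`, `makeGuest`, and `makeBuffer` are hypothetical names, not AssemblyScript runtime hooks or Wasmtime APIs. The point it illustrates is that the host side imports nothing from the guest, so the import graph stays acyclic, while the callback still reaches the guest's allocator through a table index.

```typescript
// Stand-in for a funcref table; real modules would share an actual wasm table.
type Func = (...args: number[]) => number;
const funcTable: Func[] = [];

// Hypothetical slot index agreed on by both sides.
const ALLOC_SLOT = 0;

// "Host" side: created first, imports nothing from the guest module.
// It reaches the guest's allocator only through the table (like call_indirect).
function makeHost(sharedTable: Func[]) {
  return {
    // Hypothetical host API that needs guest memory for its result.
    makeBuffer(size: number): number {
      const alloc = sharedTable[ALLOC_SLOT];
      if (alloc === undefined) throw new Error("allocator slot is empty");
      return alloc(size);
    },
  };
}

// "Guest" side: created second, imports the host, and fills the table slot.
function makeGuest(host: { makeBuffer(size: number): number }, sharedTable: Func[]) {
  let nextFree = 8; // toy bump allocator, for illustration only
  sharedTable[ALLOC_SLOT] = (size: number): number => {
    const ptr = nextFree;
    nextFree += size;
    return ptr;
  };
  return {
    run(): number[] {
      return [host.makeBuffer(16), host.makeBuffer(4)];
    },
  };
}

const hostApi = makeHost(funcTable);
const guestModule = makeGuest(hostApi, funcTable);
const pointers = guestModule.run();
```

Because the slot is only looked up at call time, the host can be instantiated before the guest exists, which is what removes the cycle.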

sunfishcode commented 4 years ago

Yeah; the call_indirect technique requires looking across the dependency graph, and it makes sense that that might be best handled in dedicated linking/bundling tools.

Would it be feasible for AssemblyScript to have an option to put its runtime library in a separate module? Naively, it seems like that should have other advantages as well: if you want to build AssemblyScript libraries that can be used within an application where the main module isn't written in AssemblyScript, having the runtime in its own module would make it straightforward for the AssemblyScript libraries to share the runtime.
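As a rough illustration of the separate-runtime idea, the following TypeScript model has one runtime instance that owns memory and allocation, with two independent libraries importing it. Everything here is an assumption made for the sketch: the `Runtime` interface, `makeRuntime`, `makeWriter`, and `makeReader` are hypothetical, and a `Uint8Array` stands in for a linear memory shared between modules.

```typescript
// Hypothetical runtime interface; the names are illustrative, not AssemblyScript's actual exports.
interface Runtime {
  memory: Uint8Array;
  alloc(size: number): number;
}

// The runtime is created once and owns the allocator state.
function makeRuntime(bytes: number): Runtime {
  const memory = new Uint8Array(bytes);
  let nextFree = 0;
  return {
    memory,
    alloc(size: number): number {
      if (nextFree + size > memory.length) throw new Error("out of memory");
      const ptr = nextFree;
      nextFree += size;
      return ptr;
    },
  };
}

// Two independent "libraries" import the same runtime instead of bundling their own.
function makeWriter(rt: Runtime) {
  return {
    store(values: number[]): number {
      const ptr = rt.alloc(values.length);
      rt.memory.set(values, ptr);
      return ptr;
    },
  };
}

function makeReader(rt: Runtime) {
  return {
    sum(ptr: number, length: number): number {
      let total = 0;
      for (let i = 0; i < length; i++) total += rt.memory[ptr + i];
      return total;
    },
  };
}

const runtime = makeRuntime(64);
const writer = makeWriter(runtime);
const reader = makeReader(runtime);
const first = writer.store([1, 2, 3]);
const second = writer.store([10, 20]);
const total = reader.sum(second, 2);
```

Both libraries depend on the runtime and not on each other, so the import graph is a DAG, and data written by one library is readable by the other because they share a single memory.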

dcodeIO commented 4 years ago

We have indeed considered splitting out the runtime, mostly in order to reuse it, but hit some roadblocks due to not-yet-well-supported Wasm features. My expectation is that this will either become an option eventually, or we'll be able to eliminate the runtime by switching to Wasm GC early instead. If we go that route, I don't expect a split-out runtime to be the default, however, because dealing with multiple modules hurts developer experience through the complexity it adds when all a user is interested in is a simple standalone module. The major concerns, however, are support for the necessary Wasm features such as shared memory (which, iirc, can't grow automatically to any size and requires a hard maximum?) and, potentially, synchronization.