golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

cmd/compile: add go:wasmexport directive #65199

Closed johanbrandhorst closed 3 weeks ago

johanbrandhorst commented 7 months ago

Background

#38248 defined a new compiler directive, go:wasmimport, for interfacing with host-defined functions. This allowed calling from Go code into host functions, but it’s still not possible to call from the WebAssembly (Wasm) host into Go code.

Some applications have adopted the practice of allowing them to be extended by calling into Wasm compiled code according to some well defined ABI. Examples include Envoy, Istio, VS Code and others. Go cannot support compiling code to these applications, as the only exported function in the module compiled by Go is _start, mapping to the main function in a main package.

Despite this, some users are designing custom plugin systems using this interface, utilizing standard in and standard out for communicating with the Wasm binary. This shows a desire for exporting Go functions in the community.

There have been historical discussions on implementing this before (including #42372, #25612 and #41715), but none of them have reached a consensus on a design and implementation. In particular, #42372 had a long discussion (and design doc) that never provided a satisfying answer for how to run executed functions in the Go runtime. Instead of reviving that discussion, this proposal will attempt to build on it and answer the questions posed. This proposal supersedes #42372.

Exporting functions to the wasm host is also a necessity for a hypothetical GOOS=wasip2 targeting preview 2 of the WASI specification. This could be implemented as a special case in the compiler but since this is a feature requested by users it could reuse that functionality (similar to go:wasmimport today).

Proposal

Repurpose the -buildmode build flag value c-shared for the wasip1 port. It now signals to the compiler to replace the _start function with an _initialize function, which performs runtime and package initialization.

Add a new compiler directive, go:wasmexport, which is used to signal to the compiler that a function should be exported using a Wasm export in the resulting Wasm binary. Using the compiler directive will result in a compilation failure unless the target GOOS is wasip1.

There is a single ~optional~ required parameter to the directive, defining the name of the exported function: (UPDATE: make the parameter required, consistent with the //export pragma and easier to implement).

//go:wasmexport name

The directive is only allowed on functions, not methods.

Discussion

Parallel with -buildmode=c-shared and CGO

The proposed implementation is inspired by the implementation of C references to Go functions. When an exported function is called, a new goroutine (G) is created, which executes on a single thread (M), since Wasm is a single threaded architecture. The runtime will wake up and resume scheduling goroutines as necessary, with the exported function being one of the goroutines available for scheduling. Any other goroutines started during package initialization or left over from previous exported function executions will also be available for scheduling.

Why a "-buildmode" option?

The wasi_snapshot_preview1 documentation states that a _start function and an _initialize function are mutually exclusive. Additionally, at the end of the current _start functions as compiled by Go, proc_exit is called. At this point, the module is considered done, and cannot be interacted with. Given these conditions, we need some way for a user to declare that they want to build a binary especially for exporting one or more functions and to include the _initialize function for package and runtime initialization.

We also considered using a GOWASM option instead, but this feels wrong since that environment variable is used to specify options relating to the architecture (existing options are satconv and signext), while this export option is dependent on the behavior of the "OS" (what functions to export, what initialization pattern to expect).

What happens to func main when exports are involved?

Go code compiled to a wasip1 Wasm binary can be either a "Command", which includes the _start function, or a "Reactor/Library", which includes the _initialize function.

When using -buildmode=c-shared, the resulting Wasm binary will not contain a _start function, and will only contain the _initialize function and any exported functions. The Go main function will not be exported to the host. The user can choose to export it like any other function using the //go:wasmexport directive. The _initialize function will not automatically call main. The main function will not initialize the runtime.

When the -buildmode flag is unset, the _start function and any exported functions will be exported to the host. Using //go:wasmexport on the main function in this mode will result in a compilation error. In this mode, only _start will initialize the runtime, and so must be the first export called from the host. Any other exported functions may only be called through calling into host functions that call other exports during the execution of the _start function. Once the _start function has returned, no other exports may be called on the same instance.
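As a rough illustration of the c-shared (reactor) case described above, the sketch below shows a package whose state is set up in init and then read by an exported function. The export name greeting_len, the variable greeting, and the output file name in the build comment are hypothetical, and the build command simply combines the flag and GOOS named in this proposal.

```go
// Build as a reactor, per this proposal (output name is an assumption):
//
//	GOOS=wasip1 GOARCH=wasm go build -buildmode=c-shared -o plugin.wasm .
package main

// greeting is set during package initialization, which the proposal
// says runs inside _initialize in c-shared mode.
var greeting string

func init() {
	greeting = "hello"
}

// greetingLen is a hypothetical export. A host would be expected to
// call _initialize first, then call greeting_len.
//
//go:wasmexport greeting_len
func greetingLen() int32 {
	return int32(len(greeting))
}

// main must exist in package main, but per the proposal _initialize
// does not call it in c-shared mode.
func main() {}
```

On targets other than wasip1 the function behaves like any ordinary Go function, which makes the logic easy to unit test outside of Wasm.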

Why not reuse //export?

//export is used to export Go functions to C when using buildmode=c-shared. Use of //export puts restrictions on the use of the file, namely that it cannot contain definitions, only declarations. It’s also something of an ugly duckling among compiler directives in that it doesn’t use the now established go: prefix. A new directive removes the need for users to define functions separately from the declaration, has a nice symmetry with go:wasmimport, and uses the well established go: prefix.

Handling Reentrant Calls and Panics

Reentrant calls happen when the Go application calls a host import, and that invocation calls back into an exported function. Reentrant calls are handled by creating a new goroutine. If a panic reaches the top-level of the go:wasmexport call, the program crashes because there are no mechanisms allowing the guest application to propagate the panic to the Wasm host.
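Because a panic cannot be propagated to the host, one defensive pattern is to recover inside the exported function and report failure through the return value. This is only a sketch: the export name checked_div and the sentinel divFailed are hypothetical, and a real ABI would probably use a separate status value, since the sentinel can collide with a legitimate result.

```go
package main

import "math"

// divFailed is a hypothetical sentinel returned when the division
// panics. It can collide with a real result (for example
// math.MinInt32 / 1), so treat it as illustrative only.
const divFailed int32 = math.MinInt32

// checkedDiv divides a by b and converts a runtime panic (such as
// integer division by zero) into the sentinel instead of letting the
// panic reach the top of the exported call.
//
//go:wasmexport checked_div
func checkedDiv(a, b int32) (result int32) {
	defer func() {
		if r := recover(); r != nil {
			result = divFailed
		}
	}()
	return a / b
}

func main() {}
```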

Naming exports

When the name of the Go function matches that of the desired Wasm export, the name parameter can be omitted.

For example:

//go:wasmexport add
func add(x, y int) int {
    return x + y
}

Is equivalent to

//go:wasmexport
func add(x, y int) int {
    return x + y
}

The names _start and _initialize are reserved and not available for user exported functions.

Third-party libraries

Third-party libraries will need to be able to define exports, as WASI functionality such as wasi-http requires calling into exported functions, which would be provided by the third party library in a user-friendly wrapper. Any exports defined in third party libraries are compiled to exported Wasm functions.

Module names

The current Wasm architecture doesn’t define a module name of the compiled module, and this proposal does not suggest adding one. Module names are useful to namespace different compiled Wasm binaries, but it can usually be configured by the runtime or using post-processing tools on the binaries. Future proposals may suggest some way to build this into the Go build system, but this proposal suggests not naming it for simplicity.

Conflicting exports

If the compiler detects multiple exports using the same name, a compile error will occur and warn the user that multiple definitions are in conflict. This may have to happen at link time. If this happens in third-party libraries the user has no recourse but to avoid using one of the libraries.

Supported Types

The go:wasmimport directive allows the declaration of host imports by naming the module and function that the application depends on. The directive applies restrictions on the types that can be used in the function signatures, limiting to fixed-size integers and floats, and unsafe.Pointer, which allows simple mapping rules between the Go and Wasm types. The go:wasmexport directive will use the same type restrictions. Any future relaxing of this restriction will be subject to a separate proposal.
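A minimal sketch of a signature that stays within these restrictions, using only a fixed-size integer and unsafe.Pointer. The export name sum_bytes and the helper sumOf are hypothetical; sumOf only stands in for a host that passes a pointer and length into the module's memory.

```go
package main

import "unsafe"

// sumBytes adds up n bytes starting at ptr. Its parameters and result
// use only unsafe.Pointer and uint32, matching the type restrictions
// described above.
//
//go:wasmexport sum_bytes
func sumBytes(ptr unsafe.Pointer, n uint32) uint32 {
	if ptr == nil || n == 0 {
		return 0
	}
	var total uint32
	for _, b := range unsafe.Slice((*byte)(ptr), n) {
		total += uint32(b)
	}
	return total
}

// sumOf is a hypothetical stand-in for the host side: it hands
// sumBytes a pointer and a length, as a host would after writing the
// buffer into linear memory.
func sumOf(buf []byte) uint32 {
	if len(buf) == 0 {
		return 0
	}
	return sumBytes(unsafe.Pointer(&buf[0]), uint32(len(buf)))
}

func main() {}
```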

Spawning Goroutines from go:wasmexport functions

The proposal considers scenarios where the go:wasmexport call spawns new goroutines. In the absence of threading or stack switching capability in Wasm, the simplest option is to document that all goroutines still running when the invocation of the go:wasmexport function returns will be paused until the control flow re-enters the Go application.

In the future, we anticipate that Wasm will gain the ability to either spawn threads or integrate with the event loop of the host runtime (e.g., via stack-switching) to drive background goroutines to completion after the invocation of a go:wasmexport function has returned.

Blocking in go:wasmexport functions

When the goroutine running the exported function blocks for any reason, the function will yield to the Go runtime. The Go runtime will schedule other goroutines as necessary. If there are no other goroutines, the application will crash with a deadlock, as there is no way to proceed, and Wasm code cannot block.

Authors

@johanbrandhorst, @achille-roussel, @Pryz, @dgryski, @evanphx, @neelance, @mdlayher

Acknowledgements

Thanks to all participants in the go:wasmexport discussion at the Go contributor summit at GopherCon 2023, without which this proposal would not have been possible.

CC @golang/wasm @cherrymui

ydnar commented 7 months ago

Thanks for putting this together—this is exciting.

Generating a module that can act as a reactor and a command sounds like a great idea. I noticed this might conflict with how Node interprets a module. If a module exports both _start and _initialize, it will throw an exception: https://nodejs.org/api/wasi.html

  1. One could argue this is undesired behavior, and Node could change.
  2. What happens if a host detects and calls both _initialize and _start?
  3. Exporting one or the other, but not both, implies some kind of configuration or detection.

ydnar commented 7 months ago

The directive is only allowed on functions, not methods.

Using //go:wasmimport on methods has been helpful for mapping Component Model resource methods in WASI Preview 2:

From https://github.com/ydnar/wasm-tools-go/blob/9b4707e054a8b528b27240cba6c05557c4e26a53/wasi/io/error/error.wit.go:


// ToDebugString represents the method "wasi:io/error.error#to-debug-string".
//
// Returns a string that is suitable to assist humans in debugging
// this error.
//
// WARNING: The returned string should not be consumed mechanically!
// It may change across platforms, hosts, or other implementation
// details. Parsing this string is a major platform-compatibility
// hazard.
func (self Error) ToDebugString() string {
    var ret string
    self.to_debug_string(&ret)
    return ret
}

//go:wasmimport wasi:io/error@0.2.0-rc-2023-11-10 [method]error.to-debug-string
func (self Error) to_debug_string(ret *string)

Subjectively, using methods seems better aligned with the Component Model semantics than the equivalent:

//go:wasmimport wasi:io/error@0.2.0-rc-2023-11-10 [method]error.to-debug-string
func error__to_debug_string(self Error, ret *string)

Given that resources are opaque i32 handles, the same could be true for implementing exported methods via //go:wasmexport.

johanbrandhorst commented 7 months ago

Thanks for putting this together—this is exciting.

Generating a module that can act as a reactor and a command sounds like a great idea. I noticed this might conflict with how Node interprets a module. If a module exports both _start and _initialize, it will throw an exception: https://nodejs.org/api/wasi.html

1. One could argue this is undesired behavior, and Node could change.

2. What happens if a host detects and calls both _initialize and _start?

3. Exporting one or the other, but not both, implies some kind of configuration or detection.

   * In this TinyGo PR I experimented with detecting lack of main.main as the trigger for "reactor" mode with _initialize as the entry point: [runtime, builder: WebAssembly reactor mode tinygo-org/tinygo#4082](https://github.com/tinygo-org/tinygo/pull/4082)

Thank you for the information about Node's behavior here, I wasn't aware. That is certainly troubling. I will try to see what if any other precedent there is for this behavior in the ecosystem to see whether we or Node are in the wrong.

If a host calls both _initialize and _start, it will run initialization once (initialization has to be protected with something like a sync.Once to be idempotent) and then run func main(). Just calling _start will accomplish the same thing.
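A minimal sketch of the idempotent initialization described above, using ordinary Go functions as stand-ins for the two entry points. The names initialize, start, runtimeInit and the counters are all hypothetical; this is not the runtime's actual code, only the sync.Once pattern it would rely on.

```go
package main

import "sync"

var (
	initOnce  sync.Once
	initCount int // how many times initialization actually ran
	mainCount int // how many times the modeled main body ran
)

// runtimeInit stands in for runtime and package initialization.
// sync.Once makes repeated calls harmless.
func runtimeInit() {
	initOnce.Do(func() { initCount++ })
}

// initialize models the _initialize entry point: initialization only.
func initialize() {
	runtimeInit()
}

// start models the _start entry point: initialization, then main.
func start() {
	runtimeInit()
	mainCount++
}

func main() {}
```

Calling initialize and then start runs initialization once and the modeled main body once, the same end state as calling start alone.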

Indeed, if we do need some way to allow users to choose whether to build a command (executing func main()) or library (just initializing and exporting functions), this proposal would need to add some way for users to turn that knob. I don't want to prejudice that discussion until we know if we need it.

johanbrandhorst commented 7 months ago

Given that resources are opaque i32 handles, the same could be true for implementing exported methods via //go:wasmexport.

This may be true, but I still think this proposal serves as an MVP that we can enhance with method support in a subsequent proposal once the initial hurdles have been overcome.

ydnar commented 7 months ago

Thank you for the information about Node's behavior here, I wasn't aware. That is certainly troubling. I will try to see what if any other precedent there is for this behavior in the ecosystem to see whether we or Node are in the wrong.

If a host calls both _initialize and _start, it will run initialization once (initialization has to be protected with something like a sync.Once to be idempotent) and then run func main(). Just calling _start will accomplish the same thing.

Maybe it’s a bigger question about what is defined behavior. Is having both _initialize and _start valid, or undefined? Having only one entry point is less ambiguous, e.g. the host can only call one, but not both (or choose, which could be contrary to the user’s expectation).

ydnar commented 7 months ago

Indeed, if we do need some way to allow users to choose whether to build a command (executing func main()) or library (just initializing and exporting functions), this proposal would need to add some way for users to turn that knob. I don't want to prejudice that discussion until we know if we need it.

We have had previous discussions about -buildmode=wasm-reactor to mirror -buildmode=c-shared.

johanbrandhorst commented 7 months ago

I created an issue to ask the NodeJS devs for the source of this design decision: https://github.com/nodejs/node/issues/51544

cjihrig commented 7 months ago

Hey. Node developer that implemented that design decision here. 👋

That change was nearly four years ago, and I have since forgotten the exact motivation. However, I was able to dig this up: https://github.com/WebAssembly/WASI/commit/d8b286c697364d8bc4daf1820b25a9159de364a3. At that point in time, WASI commands had a _start() function, and WASI reactors had an _initialize() function. Commands and reactors were mutually exclusive.

WASI has changed a good bit since then. I no longer work on WASI, so I don't know if that design decision is still valid or not. I would recommend checking with the folks in the WASI repos.

zetaab commented 7 months ago

https://github.com/WebAssembly/wasi-http/issues/95 contains discussion about using the _initialize func. So if that is not possible in Go, it would be difficult.

johanbrandhorst commented 7 months ago

Any user created func init() would be run in _initialize, is this not sufficient?

johanbrandhorst commented 7 months ago

Hey. Node developer that implemented that design decision here. 👋

That change was nearly four years ago, and I have since forgotten the exact motivation. However, I was able to dig this up: WebAssembly/WASI@d8b286c. At that point in time, WASI commands had a _start() function, and WASI reactors had an _initialize() function. Commands and reactors were mutually exclusive.

WASI has changed a good bit since then. I no longer work on WASI, so I don't know if that design decision is still valid or not. I would recommend checking with the folks in the WASI repos.

Thanks so much for providing your input and this reference. It seems this doc now lives at https://github.com/WebAssembly/WASI/blob/a7be582112b35e281058f1df7d8628bb30a69c3f/legacy/application-abi.md. I wonder, given that this is now under the legacy heading, whether this statement is still true:

These kinds are mutually exclusive; implementations should report an error if asked to instantiate a module containing exports which declare it to be of multiple kinds.

If so, this design would need to change to allow the user to choose whether to compile a Command or a Library (Reactor). @sunfishcode perhaps you could provide some guidance here?

sunfishcode commented 7 months ago

The _start and _initialize functions and legacy/application-abi.md file are all Preview 1 things. Many Preview 1 Wasm engines recognize _start for commands, and some recognize _initialize as an entrypoint for reactors.

Preview 2 is based on the Wasm component model.

Edit: I was mistaken about the component-model start function. It's not permitted to call imports, so it's not usable for arbitrary initialization code. There are ongoing discussions about this.

johanbrandhorst commented 7 months ago

Thanks for the explanation. This proposal targets our existing wasm implementations, js/wasm and wasip1/wasm. We'll have a think about the best way to go about this that doesn't paint us into a corner when it comes to adding support for wasip2 down the line.

ydnar commented 7 months ago

  • If the tooling you use to go from a core-wasm module to a component supports it, the core-wasm _initialize function may be automatically wired up to the component-model start section.

What’s an example of tooling that converts a module to a component that supports the component model start section?

Wasmtime seems to not support the start section? https://github.com/bytecodealliance/wasmtime/blob/e9d580776ee27f4ed59ba334765aacbcc22fa6e4/crates/environ/src/component/translate.rs#L623

johanbrandhorst commented 7 months ago

In light of the discussion around NodeJS's behavior and the documented separation between _initialize and _start in wasip1, we've updated the proposal to include a new -buildmode=wasip1-reactor, used to instruct the compiler to produce a Wasm binary with an _initialize function in place of the _start function. The use of go:wasmexport is limited to this new build mode, which is only available for GOOS=wasip1.

cherrymui commented 7 months ago

Thanks for the proposal! Looks good overall.

-buildmode=wasip1-reactor

Is there something similar for js/wasm? Or is the library/export mechanism very different? Also, will the mechanism be similar for later wasip2, or eventual wasi? If so, maybe we can choose a more general name like wasm-library, so we don't need to have a different build mode for each of them? (For a start, it is okay to only implement on wasip1, just like the c-shared build mode is not implemented on all platforms.)

_initialize

Is _initialize required to be called before any exported functions can be called? Or is _initialize called the first time the host calls into Go, if it has not been called already? Or does the Wasm execution engine always automatically call _initialize at module load time, so it is guaranteed to be called first?

In the absence of threading or stack switching capability in Wasm, the simplest option is to document that all goroutines still running when the invocation of the go:wasmexport function returns will be paused until the control flow re-enters the Go application.

So, it sounds like at the end of the exported function, the Go runtime will not try to schedule other goroutines to run but will directly return to Wasm? I assume this might be okay. But js.FuncOf seems to choose a different approach. This is also related to the discussion in #42372. Could you explain the reason for choosing this approach?

GODEBUG=wasmgoroutinemon=1

I'm not sure we want this debug mode. As you mentioned, it is probably not uncommon to have background goroutines. If one wants to ensure there is no goroutine at the time of exported function exiting, one probably can check it with runtime.NumGoroutine.

Thanks.

johanbrandhorst commented 7 months ago

-buildmode=wasip1-reactor

Is there something similar for js/wasm? Or is the library/export mechanism very different? Also, will the mechanism be similar for later wasip2, or eventual wasi? If so, maybe we can choose a more general name like wasm-library, so we don't need to have a different build mode for each of them? (For a start, it is okay to only implement on wasip1, just like the c-shared build mode is not implemented on all platforms.)

Any wasm module can declare exports, but we don't anticipate that exporting methods like this is generally useful to users of js/wasm - we have js.FuncOf today to make Go code callable from JS, and making it callable from Wasm doesn't seem nearly as useful for that platform.

For wasip2, as illustrated by Dan's reply above, it's not clear what the export mechanism would look like yet. The name wasip1-reactor is chosen to be deliberately specific to wasip1. The exact functionality in this proposal would be limited to wasip1 forever, and any hypothetical wasip2 proposal would likely have to explain how/if wasmexport will be available for that target initially.

_initialize

Is _initialize required to be called before any exported functions can be called? Or is _initialize called the first time the host calls into Go, if it has not been called already? Or does the Wasm execution engine always automatically call _initialize at module load time, so it is guaranteed to be called first?

The expectation within the greater wasip1 ecosystem seems to be that if _initialize is exported by a module, it will be called before any exported methods are called. Our implementation wouldn't automatically call _initialize if it hasn't been called, it would likely just crash horribly.

In the absence of threading or stack switching capability in Wasm, the simplest option is to document that all goroutines still running when the invocation of the go:wasmexport function returns will be paused until the control flow re-enters the Go application.

So, it sounds like at the end of the exported function, the Go runtime will not try to schedule other goroutines to run but will directly return to Wasm? I assume this might be okay. But js.FuncOf seems to choose a different approach. This is also related to the discussion in #42372. Could you explain the reason for choosing this approach?

Yes, once the exported function returns, we would not schedule other available goroutines but return to the host. The reason for this is that we believe it's what users would expect to happen, since the runtime and various standard libraries maintain their own goroutines that would make it hard to predict the behavior and runtime of exported functions. If you believe that to be an incorrect assumption we're happy to reconsider this. Note that this also includes goroutines started by the exported function itself.

GODEBUG=wasmgoroutinemon=1

I'm not sure we want this debug mode. As you mentioned, it is probably not uncommon to have background goroutines. If one wants to ensure there is no goroutine at the time of exported function exiting, one probably can check it with runtime.NumGoroutine.

This is a fair point, and we could certainly slim down the proposal by removing this and consider it as a future addition. Thanks!

cherrymui commented 7 months ago

Sounds good, thanks.

I guess it might be fine to return to the host when the exported function returns. I guess one question is when the "background" goroutines run. If the exported functions get called and return, but none of them explicitly wait for the background goroutines, the background goroutines will probably never run? Would that be a problem for, say, timers?

johanbrandhorst commented 7 months ago

The background goroutines could run again if the exported function gets called again. I think ideally users who want concurrent work in exported functions would utilize something like a sync.WaitGroup to ensure work is completed during the execution of the function. A future proposal might be able to tackle this by exposing something like _gosched to run all goroutines until asleep, but this proposal does not account for such a feature. Also, since Threads is stable in Wasm, we may be able to just spawn new threads in the near future, which could execute in parallel to the exported function.
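A minimal sketch of the sync.WaitGroup approach mentioned above: the exported function starts worker goroutines and waits for all of them before returning, so no work is left paused when control goes back to the host. The export name sum_squares is hypothetical.

```go
package main

import "sync"

// sumSquares computes 1*1 + 2*2 + ... + n*n using one goroutine per
// term. wg.Wait blocks the exported function, which lets the Go
// scheduler run the workers before the call returns to the host.
//
//go:wasmexport sum_squares
func sumSquares(n int32) int64 {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		total int64
	)
	for i := int32(1); i <= n; i++ {
		wg.Add(1)
		go func(v int64) {
			defer wg.Done()
			mu.Lock()
			total += v * v
			mu.Unlock()
		}(int64(i))
	}
	wg.Wait() // every worker has finished once this returns
	return total
}

func main() {}
```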

achille-roussel commented 7 months ago

The problem of having goroutines blocked after the export call returned isn't much different from what happens when invoking an import. When a WebAssembly module calls a host import, it yields control to the WebAssembly runtime; no goroutines can execute during that time.

The issue is amplified with exports because the WebAssembly runtime could keep the module paused for extended periods of time, and the expectation is that imports usually return shortly after they were invoked, but it isn't fundamentally different.

Despite the limitations, we can still deliver incremental value to Go developers by allowing them to declare exports.

inliquid commented 7 months ago

@johanbrandhorst when it comes to background goroutines, do you know if the proposed solution is different from TinyGo, which supports exported functions?

ydnar commented 7 months ago

@johanbrandhorst when it comes to background goroutines, do you know if the proposed solution is different from TinyGo, which supports exported functions?

We have a separate PR to TinyGo that prototypes the same model, suspending and resuming goroutines on an export call.

cherrymui commented 7 months ago

The background goroutines could run again if the exported function gets called again.

If the exported function (or another exported function) gets called again, and that function returns without explicitly synchronizing or rescheduling, the background goroutine may still not run? I think blocking for a little while is not a problem, but it might be a problem if it never gets to run (while the exported function gets called again and again)?

As you mentioned, once we have thread support, it may not be a problem.

johanbrandhorst commented 7 months ago

It's true that goroutines may never get to run if there's no point in the exported function to yield to the runtime. I think that's still what I would expect to happen if I wrote my exported function this way. All alternatives would be more confusing I think (waiting before returning or maybe running the scheduler before executing the exported function).

As you say, we can hopefully improve this with threads support in the future.

cherrymui commented 7 months ago

Okay. This is probably fine. We can change it later if there is any problem. Thanks.

johanbrandhorst commented 7 months ago

I've removed the GODEBUG option; we can add that as an enhancement later and suggest users use runtime.NumGoroutine() for their debugging needs for now.
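A minimal sketch of how runtime.NumGoroutine could be used for this kind of debugging: compare the goroutine count before and after an exported function runs. The names startBackground, goroutinesLeftBy and release are hypothetical, and the count is only a rough signal because unrelated goroutines may start or exit in between.

```go
package main

import "runtime"

// release unblocks the background goroutine started below.
var release = make(chan struct{})

// startBackground is a hypothetical export that returns while a
// goroutine it started is still blocked.
//
//go:wasmexport start_background
func startBackground() {
	go func() { <-release }()
}

// goroutinesLeftBy reports how many more goroutines exist after f
// returns than existed before it was called.
func goroutinesLeftBy(f func()) int {
	before := runtime.NumGoroutine()
	f()
	return runtime.NumGoroutine() - before
}

func main() {}
```

A positive result for an exported function suggests it left goroutines behind that will stay paused until the host calls into Go again.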

cherrymui commented 7 months ago

Thanks. Will you be working on a prototype or implementation?

When the -buildmode flag is unset, the _start function will remain, and any //go:wasmexport comments in the included files will result in a compilation failure.

Are exports useful to "commands"? Maybe it starts with the main function, calls an imported function to Wasm, which calls back into Go with an exported function? Like cgo export can be used for both libraries and executables.

If so, perhaps we can allow wasmexport in "exe" (default) build mode. And the wasip1-reactor/exe build mode only controls _initialize vs. _start.

Thanks.

johanbrandhorst commented 7 months ago

Allowing exports for Commands was tempting, but the issue is that our _start function (correctly) calls proc_exit before returning, which "terminates the program", at which point the host may do anything with the memory we've been allocated, so exports are not safe to be called. I don't know of any use cases today of Commands that also want exports, so keeping this behavior for existing Commands seems reasonable.

cherrymui commented 7 months ago

@johanbrandhorst the case I have in mind is not one where the main function returns, but one where it calls into Wasm using an imported function, which then could call back into Go using an exported function. It is similar to the cgo program below (it is an executable).

x.c

#include <stdio.h>

#include "_cgo_export.h"

void CF() {
    printf("call into C\n");
    GoF();
}

x.go

package main

// void CF();
import "C"

func main() {
    println("Go main")
    C.CF()
}

//export GoF
func GoF() {
    println("call back into Go")
}

Is there something similar for Wasm?

johanbrandhorst commented 7 months ago

I see, that's an interesting question! I went to look at the wasip1 ABI document again and it says:

_start is the default export which is called when the user doesn't select a specific function to call. Commands may also export additional functions, (similar to "multi-call" executables), which may be explicitly selected by the user to run instead. Except as noted below, commands shall not export any mutable globals, tables, or linear memories. Command instances may assume that they will be called from the environment at most once. Command instances may assume that none of their exports are accessed outside the duration of that call.

It seems that this is a use case we hadn't considered, which would indeed allow exports in normal executables too, and indeed only during the one call from the host (which may be reentrant). I think you are right that we should allow exports in this case, though there is still a question of where runtime initialization should happen. Perhaps this could be an enhancement to this functionality that we could introduce in the future instead of as part of this proposal? Nothing in this proposal excludes this functionality in the future (I think?).

Will you be working on a prototype or implementation?

We do intend to work on a prototype of this in the coming months.

cherrymui commented 7 months ago

Thanks. SGTM. Supporting exports for commands in a later step is probably fine.

I think you are right that we should allow exports in this case, though there is still a question of where runtime initialization should happen.

I think _start will initialize the runtime. This happens before the user code could call an imported Wasm function, therefore also before any exported functions could be called.

ydnar commented 7 months ago

Will main.main be called by _initialize in reactor mode?

Is main.main required to build?

Could the presence of main.main trigger command mode, and the omission trigger reactor mode?

cherrymui commented 7 months ago

For c-archive and c-shared build modes, main.main can be present, but not called by initialization code (unless the user code explicitly calls it). I'd suggest we do the same for consistency, and we set the build mode explicitly, instead of implicitly based on main.main.

ydnar commented 7 months ago

For c-archive and c-shared build modes, main.main can be present, but not called by initialization code (unless the user code explicitly calls it). I'd suggest we do the same for consistency, and we set the build mode explicitly, instead of implicitly based on main.main.

Makes sense. Should this be explicitly called out?

johanbrandhorst commented 7 months ago

In reactor mode I think we should only export main.main if the user chooses to (i.e., using //go:wasmexport). The default behavior would be not to include it. _initialize would not call it. I've added a note in the proposal 👍🏻.

ydnar commented 7 months ago

Related discussion: https://github.com/WebAssembly/component-model/pull/297

A toolchain that converts a WebAssembly module to a component can map _initialize to a component start function, enabling initialization of the runtime prior to the host calling Component Model exports.

johanbrandhorst commented 7 months ago

Note that we'll probably have a different export solution for a hypothetical wasip2/wasm32, but it's good to know that wasip1/wasm binaries can be "forward-compatible" in this way.

rsc commented 7 months ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

cherrymui commented 7 months ago

Given the similarity between the proposed build mode and the c-archive build mode on other platforms, could we just use the c-archive build mode to mean this on Wasm? The Go module is probably called from Wasm that is compiled from C. So c-archive makes some sense. Thanks.

johanbrandhorst commented 7 months ago

Given the similarity between the proposed build mode and the c-archive build mode on other platforms, could we just use the c-archive build mode to mean this on Wasm? The Go module is probably called from Wasm that is compiled from C. So c-archive makes some sense. Thanks.

I'm hesitant to overload the meaning of the c-archive build mode in this case, for three primary reasons:

  1. I don't know that the assertion that the Go module will be called from C compiled to Wasm is true. At least one primary use case I have in mind for this is to call Go-compiled Wasm from Go programs running Wazero, as a way to provide plugins for arbitrary Go applications. This is one of the most popular use cases for wasip1 as far as I can tell (see examples in the background of this proposal). It would be a disservice to the Wasm ecosystem, which is fundamentally origin-language-agnostic, to tie this functionality to the c-archive build mode.
  2. The behavior of the Wasm exported binary is different from that of a C archive. There is no shared memory space and there are no threads, so reusing the name might confuse users about what they can expect from the compiled binary.
  3. Using c-archive for wasip1 would set a precedent for future Wasm export implementations (e.g. a hypothetical wasip2) to continue using this build mode while they may differ significantly in behavior from that of wasip1.

I'm sympathetic to the concern of build mode bloat, especially as this build mode would not be reused for a hypothetical wasip2 port, but I do believe it to be in the best interest of the user.

ydnar commented 7 months ago

Further, a hypothetical GOOS=wasip2 would likely use something akin to -buildmode=wasm-component, which would emit a component, effectively a superset of a Wasm module.

Today, in our work to support WASI Preview 2 in TinyGo, the build process is 3-phase: 1) compile a Wasm module, 2) decorate the module with WIT metadata, and 3) convert the Wasm module to a component:

tinygo build -target=wasip2 -x -o main.wasm ./cmd/wasip2-test
wasm-tools component embed -w wasi:cli/command $(tinygo env TINYGOROOT)/lib/wasi-cli/wit/ main.wasm -o embedded.wasm
wasm-tools component new embedded.wasm -o component.wasm

Currently the second and third phases are implemented in Rust in the wasm-tools program. I suspect we’d like to implement that functionality directly in the Go toolchain so it can natively generate a component.

To color this bikeshed, I’d advocate for -buildmode=wasm-module or -buildmode=wasm-reactor, not tying it to a specific GOOS.

cherrymui commented 7 months ago

not tying it to a specific GOOS.

I'm also leaning towards this, even if we don't reuse c-archive. I think it is possible to use the same build mode on wasip1, wasip2, and possibly eventually wasi. The implementation can be slightly different. As long as they are not vastly different, it would be fine.

I don't know that the assertion that the Go module will be called from C compiled to Wasm is true.

c-archive doesn't have to be called from C. It could be called from code compiled from other languages as long as it uses C ABI.

johanbrandhorst commented 7 months ago

I'm worried that trying to name something now to reuse in future WASI ports is going to be a futile endeavor because there is still so much unknown about wasip2 and wasi and how they will relate to Go. To give some examples:

  1. A "wasm module" is defined by the Wasm spec. It has exports, funcs, even a start section, which is used to initialize the state of a module (similarly to the _initialize export in wasip1). But this is not used by wasip1 or WASI preview 2 to my knowledge.
  2. A "wasm component" is defined by WASI preview 2 (I'm struggling to find an exact definition of "component" in its documentation). It has a different ABI from a "wasm core module" (AKA "wasm module").
  3. A "wasm reactor" is a strictly WASI preview 1 concept and the terminology has been abandoned for future WASI versions.

So which name to choose that makes sense to users now and in the future? Decisions like "should we commit to wasm modules since it's part of the core spec or wasm components since it's part of the wasip2 component model?" are things I'd rather defer until a future proposal that has to consider wasip2 in its entirety, once the dust has settled on the new ABI. This is why I think it's going to be difficult to name this anything but a very wasip1-specific name. Our first implementation of wasip2 might not even support exports.

ydnar commented 7 months ago

I'm worried that trying to name something now to reuse in future WASI ports is going to be a futile endeavor because there is still so much unknown about wasip2 and wasi and how they will relate to Go.

Here is a working implementation of WASI Preview 2 in Go: https://github.com/ydnar/wasm-tools-go/tree/main/wasi

So which name to choose that makes sense to users now and in the future? Decisions like "should we commit to wasm modules since it's part of the core spec or wasm components since it's part of the wasip2 component model?" are things I'd rather defer until a future proposal that has to consider wasip2 in its entirety, once the dust has settled on the new ABI. This is why I think it's going to be difficult to name this anything but a very wasip1-specific name. Our first implementation of wasip2 might not even support exports.

In a sense, a Wasm component is just a "reactor" that conforms to a specific export contract. The wasi:cli/command world exports a single function run that could call main.main using the exact go:wasmexport machinery proposed here.

Currently the wasm-tools toolchain converts a Wasm module with imports and exports conforming to a specific contract into a component. While I don’t think it’s ideal in the long term for Go to depend on a third-party tool to emit a valid WASI Preview 2 program, it’s a bridge that works today.

ianlancetaylor commented 7 months ago

Build modes are scattered throughout all the tools, so I'm slightly reluctant to define a new WASM-specific build mode, especially given that I don't understand how it would differ from c-archive.

johanbrandhorst commented 7 months ago

If we were to reuse c-archive, would we also add a build tag that is set when this build mode is set? I think it would be important to let users write Wasm libraries that can be used both by normal programs running main (AKA "Commands") and by programs exporting specific functions to the host (AKA "Reactors"). Since the proposal suggests a compilation error when //go:wasmexport is used while compiling "Commands" (since we do not export functions to the host in this mode), library authors would need some way to exclude the files defining these exports, the easiest of which would be a build tag. But is introducing a build tag for an existing build mode going to cause any problems?

ianlancetaylor commented 7 months ago

I don't see any major difficulty in adding a build tag if necessary.

ydnar commented 7 months ago

Given Ian’s comments, maybe it’s worth exploring a mechanism other than build mode.

@johanbrandhorst: For GOOS=wasip1, setting aside the presence of any user //go:wasmexport directives…is the difference between command and reactor mode simply whether the program exports _start (initializes runtime, calls main.main) vs _initialize (which initializes runtime, but does not call main.main)?

johanbrandhorst commented 7 months ago

Essentially, yes. I'm happy to consider other mechanisms, but I do want it to be explicit, and there is something to be said for the parallel to the existing build mode c-shared.
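The distinction confirmed here can be modeled in a few lines. This is a toy sketch, not a real Wasm runtime or host API: the module type and its methods are hypothetical and only record which steps each entry point would perform.

```go
package main

import "fmt"

// module is a toy stand-in for a Go-compiled wasip1 module. It is not a real
// Wasm runtime API; it only records which steps an entry point performs.
type module struct {
	runtimeInitialized bool
	mainCalled         bool
}

// initRuntime models runtime and package initialization.
func (m *module) initRuntime() { m.runtimeInitialized = true }

// start models the _start export of a command: initialize, then run main.main.
func (m *module) start() {
	m.initRuntime()
	m.mainCalled = true
}

// initialize models the _initialize export of a reactor: initialize only.
func (m *module) initialize() {
	m.initRuntime()
}

func main() {
	command, reactor := &module{}, &module{}
	command.start()
	reactor.initialize()
	fmt.Printf("command: runtime=%t main=%t\n", command.runtimeInitialized, command.mainCalled)
	fmt.Printf("reactor: runtime=%t main=%t\n", reactor.runtimeInitialized, reactor.mainCalled)
}
```

In both modes the runtime is ready before the host can call anything else; only the command runs main.main as part of its entry point.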

achille-roussel commented 7 months ago

Would we use c-archive or c-shared for the name? c-archive was first suggested, but the proposal and Johan's last message mentioned c-shared.

C archives are usually used during compilation. WASM modules are closer to C shared libraries in concept since they are loaded and linked at runtime.

The c-archive and c-shared build modes also use //export directives to locate the symbols exported in the build artifact. We would also update the build mode documentation to mention that either //export or //go:wasmexport is used depending on the target architecture.