golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

cmd/compile: relax wasm/wasm32 function import signature type constraints #66984

Open johanbrandhorst opened 6 months ago

johanbrandhorst commented 6 months ago

Background

#59149 removed the package restrictions on the use of go:wasmimport, but established strict constraints on the types that can be used as input and result parameters. The motivation for this was that supporting rich types between the host and the client would require sophisticated and expensive runtime type conversions because of the mismatch between the 64 bit architecture of the client and the 32 bit architecture of the host.

With the upcoming 32 bit wasm port, this problem goes away, as both client and host will use 32 bit pointers. However, we can also support a limited set of types on 64 bit platforms, where no runtime conversion is necessary.

Proposal

Relax the constraints on types that can be used as input and result parameters with the go:wasmimport compiler directive. The exact allowed types would depend on whether wasm or wasm32 is used.

We define the "small integer types" group as the set of types described by [u]int[8|16]. The following types would be allowed as input parameters to any wasmimport/wasmexport function:

All input parameter types except string are also allowed as result parameter types.

The following types would remain disallowed as input and output parameter types:

The conventions established for use of pointers in CGO will be required when using pointers with wasmimport/wasmexport, e.g. the host can read Go memory, can write pointerless data (like the content of a byte buffer) but cannot write Go pointers to Go memory, and cannot hold on to Go pointers unless they are pinned.
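As a rough illustration of the pinning convention, the sketch below uses runtime.Pinner (Go 1.21+) to keep a buffer pinned while a simulated host holds its address. The names hostRetain, hostHolds and pinForHost are hypothetical; hostRetain stands in for a go:wasmimport function and is written in plain Go so the sketch runs on any platform.

```go
package main

import (
	"fmt"
	"runtime"
	"unsafe"
)

// retained simulates host-side state that outlives a single call.
var retained unsafe.Pointer

// hostRetain is a hypothetical stand-in for a go:wasmimport function whose
// host implementation keeps the pointer after the call returns.
func hostRetain(p unsafe.Pointer) { retained = p }

// hostHolds reports whether the simulated host still holds buf's backing array.
func hostHolds(buf []byte) bool {
	return retained != nil && retained == unsafe.Pointer(unsafe.SliceData(buf))
}

// pinForHost pins buf's backing array, hands its address to the host, and
// returns a release function to call once the host has dropped the pointer.
// buf must be non-empty in this sketch.
func pinForHost(buf []byte) (release func()) {
	pinner := new(runtime.Pinner)
	ptr := unsafe.SliceData(buf)
	pinner.Pin(ptr)
	hostRetain(unsafe.Pointer(ptr))
	return func() {
		retained = nil // the simulated host forgets the pointer
		pinner.Unpin()
	}
}

func main() {
	buf := make([]byte, 16) // pointerless data, so the host may also write to it
	release := pinForHost(buf)
	fmt.Println("host holds buffer:", hostHolds(buf))
	release()
	fmt.Println("host holds buffer:", hostHolds(buf))
}
```

If the host only uses the pointer for the duration of the call, no pinning should be needed under the cgo-style rules described above.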

Discussion

Compatibility guarantees

The Go spec does not specify the struct layout and leaves it up to implementations to decide. As such, we cannot provide a guaranteed ABI without having to change the spec or force future layout changes to provide runtime conversion of data. This proposal suggests making it clear to users through documentation that there are no guarantees of compatibility across versions of the Go compiler.

Type conversion rules

The following conversion rules would be automatically applied by the compiler for the respective parameter type:

Go Type | Parameter type (per Wasm spec)
bool | i32
int32, uint32 | i32, i32
int64, uint64 | i64, i64
float32, float64 | f32, f64
string | Assigned to two call parameters as a (i32, i32) tuple of (pointer, len). Only allowed for input parameters.
uintptr, unsafe.Pointer, *T, *struct, *[...]T | i32, i32, i32, i32, i32

Strings

Strings are not allowed as result parameters as Wasm practically does not allow more than 1 result parameter.

Supporting GOARCH=wasm

The wasm architecture uses 64 bit pointers and integer sizes. As the host uses 32 bit pointers, this makes it impossible to allow certain types without costly runtime conversions, such as *struct types containing pointer fields. Since string types are also pointer types, *struct types containing string fields are also disallowed.

Supporting [u]int, [u]int8, [u]int16 as concrete parameters

The [u]int types are problematic because their size is not precisely defined, which may cause confusion when they are used with strictly 32 bit or 64 bit integers. The [u]int8 and [u]int16 types are problematic because we would be forced to automatically convert them to/from the i32 wasm representation, with potential loss of precision or overflow. They are still allowed as pointer targets, array elements and struct fields.
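To make the overflow concern concrete, here is a small sketch of what widening to and narrowing from an i32 does to 8-bit values. The function names are made up for illustration, and int32 stands in for the Wasm i32 type.

```go
package main

import "fmt"

// toI32 models passing a uint8 argument: the value is zero-extended to an i32.
func toI32(v uint8) int32 { return int32(v) }

// fromI32Unsigned models reading a uint8 back from an i32: the upper 24 bits
// are silently dropped.
func fromI32Unsigned(v int32) uint8 { return uint8(v) }

// fromI32Signed models reading an int8 back from an i32: values above 127
// wrap around to negative numbers.
func fromI32Signed(v int32) int8 { return int8(v) }

func main() {
	fmt.Println(toI32(200))           // 200
	fmt.Println(fromI32Unsigned(300)) // 44, because 300 does not fit in 8 bits
	fmt.Println(fromI32Signed(200))   // -56, because 200 does not fit in int8
}
```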

Supporting slices, maps

Both slices and maps are disallowed because of the uncertainty around the memory underlying these types and interactions with struct and array rules. Users who wish to use slices can manually use (&slice, len(slice)) or unsafe.Pointer. There is no clear way to support passing or returning map data from the host other than by using unsafe.Pointer and making assumptions about the underlying data.
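A minimal sketch of the manual slice workaround, assuming that "(&slice, len(slice))" means a pointer to the first element plus the length. hostSum is a hypothetical stand-in for an imported function and is implemented in Go here so the example runs anywhere; unsafe.SliceData and unsafe.Slice require Go 1.20 or later.

```go
package main

import (
	"fmt"
	"unsafe"
)

// hostSum is a hypothetical stand-in for a go:wasmimport function that takes
// a (pointer, length) pair and sums the bytes it points to.
func hostSum(ptr unsafe.Pointer, n uint32) uint32 {
	var total uint32
	for _, b := range unsafe.Slice((*byte)(ptr), n) {
		total += uint32(b)
	}
	return total
}

// sumBytes lowers a []byte to the address and length of its backing array.
func sumBytes(data []byte) uint32 {
	return hostSum(unsafe.Pointer(unsafe.SliceData(data)), uint32(len(data)))
}

func main() {
	fmt.Println(sumBytes([]byte{1, 2, 3})) // 6
}
```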

Related proposals

structs.HostLayout

#66408 proposes a way for users to request that struct layout is host compatible. Our proposal depends on the definitions put forward in that proposal for struct parameters.

Future work

WASI Preview 2 (AKA WASI 0.2)

WASI Preview 2 defines its API in terms of the Component Model, with a rich type system and an IDL language, WIT. The Component Model also defines a Canonical ABI with a specification for lifting and lowering Component Model types into and out of linear memory. This proposal does not attempt to define the ABI for any hypothetical wasip2 target, and would leave such decisions for any future wasip2 proposal.

Supporting struct and [...]T by value

A previous version of this proposal included support for passing struct and [...]T types by value by expanding each field recursively into call parameters. This was removed in favor of a simpler initial implementation but could be re-added if users require it.

Contributors

@johanbrandhorst, @evanphx, @achille-roussel, @dgryski, @ydnar

CC @cherrymui @golang/wasm

dr2chase commented 6 months ago

"This follows the C struct value semantics" is just a hair vague; are 8-byte quantities (float64, int64, uint64) stored at a 4-byte or 8-byte alignment? It was my understanding (and the purpose of #66408) to specify a 4-byte alignment for fields of those types when they occur in structs passed to wasm32 (tagged structs.HostLayout).

(edited to note error, the host alignment for 8-byte integers and floats is 8 bytes).

ydnar commented 6 months ago

Ideally 8-byte values would always be 8-byte aligned in the wasm32 port.

evanphx commented 6 months ago

@dr2chase Looking at what clang does, it uses 8-byte alignment on 64bit quantities so we'd match that.

dr2chase commented 6 months ago

You are right, I got it backwards. But that is what you are expecting for anything that has pointers-to-it passed to the wasm host platform, yes?

cherrymui commented 6 months ago

Thanks for the proposal! A few questions:

Besides, for structs, arrays of structs, and pointer to structs, I would suggest we allow only structs with structs.HostLayout to be passed. The reason is that in the Go spec we don't require struct fields to be laid out in memory in source order, and it may well change in a future Go release. structs.HostLayout specifies a fixed layout. Structs without that marker can change. This gives a clear way to say which structs should have a fixed layout, which are okay to change.

Thanks.

dr2chase commented 6 months ago

Two other questions, first:

type w32thing struct {
    _ structs.HostLayout
    a uint8
    b uint16
}

Is this laid out a_bb or is it aaaabbbb? What sizes do I use for struct fields? I assume it is the smaller ones, but I wanted to verify this else it would be a problem.
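One way to check the in-memory layout being asked about is to measure it with unsafe.Offsetof and unsafe.Sizeof. The sketch below leaves out the structs.HostLayout marker field so that it builds on toolchains that predate the structs package. It reports what the gc toolchain does today, which is not a guarantee made by the Go spec.

```go
package main

import (
	"fmt"
	"unsafe"
)

// w32thing mirrors the struct in the question, minus the HostLayout marker.
type w32thing struct {
	a uint8
	b uint16
}

// layout reports the byte offsets of a and b and the total size of w32thing.
func layout() (offA, offB, size uintptr) {
	var t w32thing
	return unsafe.Offsetof(t.a), unsafe.Offsetof(t.b), unsafe.Sizeof(t)
}

func main() {
	offA, offB, size := layout()
	// With the gc toolchain this prints 0 2 4, i.e. the "a_bb" layout.
	fmt.Println(offA, offB, size)
}
```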

Second, passing pointers to 8-byte primitive types to the host will be tricky unless those references come from fields in structures tagged with HostLayout -- otherwise, they may not be aligned. So


type wx struct {
    _ structs.HostLayout
    x int64
}

func f(x int64, w wx) {
    someWasmFunc(&x)      // might not work, x might not be 8-byte aligned
    someWasmFunc(&w.x)    // this will work because w is a wx and its x field is 8-byte aligned
    someOtherWasmFunc(&w) // if it used *wx for its parameter type instead of *int64
}

johanbrandhorst commented 6 months ago

Thanks for the quick feedback! I've tried to answer each question:

structs and arrays. What is the ABI specification exactly? The C ABI on, say ELF AMD64, is pretty complex for passing structs and arrays. Small fields may be packed into one word. Large structs may be passed indirectly (stored on stack, passing a pointer to the callee). Do we have a specification for this?

The specification falls out of the table of transformations (I think?). The current plan isn't to introduce any sort of magic around large structs or field packing. Struct fields are added as call parameters, from the first field to the last, according to the conversion rules for the type of the field. Examples:

type foo struct {
    a int
    b string
    c [2]float32
}

With a function signature of

//go:wasmimport some_module some_function
func wasmFunc(in foo) int

Would roughly translate to (in WAT format)

;; $a is of type `i32` holding the value of `a`
;; $b_addr is of type `i32` and is a pointer to the start of the bytes for the Go string `b`
;; $b_len is of type `i32` and is the length in bytes to read from `$b_addr` to get the whole string
;; $c_0 is of type `f32` and is the value of `c[0]`
;; $c_1 is of type `f32` and is the value of `c[1]`
call $some_function (local.get $a) (local.get $b_addr) (local.get $b_len) (local.get $c_0) (local.get $c_1)

Struct fields would be expanded into call parameters before subsequent fields at the same level.
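The by-value expansion described here can be sketched with reflection. This is only an illustration of the rule as stated in this comment, not compiler code: flatten and flattenOf are hypothetical names, and int, uint and pointers are mapped to i32 on the assumption of a 32-bit target.

```go
package main

import (
	"fmt"
	"reflect"
)

// foo is the example struct from the comment above.
type foo struct {
	a int
	b string
	c [2]float32
}

// flatten returns the Wasm value types that t would expand to, visiting
// struct fields and array elements in order. It returns nil for kinds this
// sketch does not handle.
func flatten(t reflect.Type) []string {
	switch t.Kind() {
	case reflect.Bool, reflect.Int32, reflect.Uint32,
		reflect.Int, reflect.Uint, // assumption: 32-bit target
		reflect.Uintptr, reflect.Pointer, reflect.UnsafePointer:
		return []string{"i32"}
	case reflect.Int64, reflect.Uint64:
		return []string{"i64"}
	case reflect.Float32:
		return []string{"f32"}
	case reflect.Float64:
		return []string{"f64"}
	case reflect.String:
		return []string{"i32", "i32"} // (pointer, len)
	case reflect.Struct:
		var out []string
		for i := 0; i < t.NumField(); i++ {
			out = append(out, flatten(t.Field(i).Type)...)
		}
		return out
	case reflect.Array:
		var out []string
		elem := flatten(t.Elem())
		for i := 0; i < t.Len(); i++ {
			out = append(out, elem...)
		}
		return out
	}
	return nil
}

// flattenOf is a convenience wrapper that flattens the dynamic type of v.
func flattenOf(v any) []string { return flatten(reflect.TypeOf(v)) }

func main() {
	fmt.Println(flattenOf(foo{})) // [i32 i32 i32 f32 f32]
}
```

The five entries correspond to the five call arguments in the WAT example above.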

What does a string look like on Wasm/WASI side?

For wasip1, we will treat Go string parameters simply as a (*byte, int) tuple. There will be no encoding constraints, just as with regular Go strings. To the Wasm host, it will look identical to using struct { a *byte; b int } as a parameter. For wasip2, those constraints would have to be considered in a hypothetical future wasip2 proposal.
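A hedged sketch of the (*byte, int) decomposition, using unsafe.StringData (Go 1.20+). hostCount is a hypothetical stand-in for an imported function, written in Go so the example runs on any platform, and the length is narrowed to uint32 on the assumption of a 32-bit host.

```go
package main

import (
	"fmt"
	"unsafe"
)

// hostCount is a hypothetical stand-in for an imported function that receives
// a string as a (pointer, len) pair and counts occurrences of target.
func hostCount(ptr *byte, n uint32, target byte) uint32 {
	var count uint32
	for _, b := range unsafe.Slice(ptr, n) {
		if b == target {
			count++
		}
	}
	return count
}

// countByte splits s into its data pointer and byte length before the call.
// No encoding is checked or enforced; the bytes are passed as they are.
func countByte(s string, target byte) uint32 {
	return hostCount(unsafe.StringData(s), uint32(len(s)), target)
}

func main() {
	fmt.Println(countByte("banana", 'a')) // 3
}
```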

Making structs.HostLayout required for structs, arrays of structs and pointers to structs

This sounds like a great idea, and we should also extend it to pointers to 8 byte sized primitive types to guarantee alignment, as suggested by @dr2chase's last question. This would avoid any question around alignment issues for pointers. It hurts the ergonomics a little bit but that's a price worth paying, I think.

type w32thing struct { _ structs.HostLayout; a uint8; b uint16 }

Is this laid out a_bb or is it aaaabbbb? What sizes do I use for struct fields? I assume it is the smaller ones, but I wanted to verify this else it would be a problem.

I'm a little confused by the question to be honest. If this type was used as an input to a Wasm call, it would look like this:

;; $a is of type `i32`
;; $b is of type `i32`
call $some_function (local.get $a) (local.get $b)

I suppose that might mean the memory looks like this: a___bb__? We're not passing a pointer to the struct or the fields, so we'd need to copy the values into locals, which will be of type i32 (I think)? Admittedly my grasp of this exact part of the code is a bit weak so I appreciate corrections.

cherrymui commented 6 months ago

Thanks for the response!

Structs fields are added as call parameters, from the first field to the last, according to the conversion rules for the type of the field.

This sounds like a reasonable choice. Is this ABI specified anywhere in Wasm/WASI docs? Or the Wasm side has to define the function taking parameters element-wise?

For wasip1, we will treat Go string parameters simply as a (*byte, int) tuple. There will be no encoding constraints, just as with regular Go strings. To the Wasm host, it will look identical to using struct { a *byte; b int } as a parameter.

This sounds reasonable as well. Is it specified anywhere in Wasm/WASI docs?

Thanks.

johanbrandhorst commented 6 months ago

This sounds like a reasonable choice. Is this ABI specified anywhere in Wasm/WASI docs? Or the Wasm side has to define the function taking parameters element-wise?

I don't know about this being an official ABI so much as just a consequence of the Wasm spec around function calls and how we can apply Go semantics to it. We're limited to the i32, i64, f32 and f64 value types, and the call instruction takes a function index and arguments from the stack. In order to simulate pass-by-value for structs, we have to flatten each field to one of the allowed value types.

This sounds reasonable as well. Is it specified anywhere in Wasm/WASI docs?

Not sure there's a doc anywhere, but practically, definitions like path_create_directory, which take a string parameter, use this pattern: https://cs.opensource.google/go/go/+/refs/tags/go1.22.2:src/syscall/fs_wasip1.go;l=230.

dr2chase commented 6 months ago

I guess my question is whether a pointer-to-struct is ever passed from Go to the WASM platform, and therefore, what expectations the WASM side has about the layout of the fields of that structure. structs.HostLayout is intended to obtain those expectations, but (1) do we even need to do this? We thought we did, and (2) we need to know what the expectations are. I think it was just that 64-bit floats and ints get 64-bit alignment.

I don't think this is for specifying the layout that gets passed to a WASM call if the struct is passed by value.

cherrymui commented 6 months ago

I don't know about this being an official ABI so much as just a consequence of the Wasm spec around function calls and how we can apply Go semantics to it. We're limited to the i32, i64, f32 and f64 value types, and the call instruction takes a function index and arguments from the stack. In order to simulate pass-by-value for structs, we have to flatten each field to one of the allowed value types.

As the ABI doesn't have a way to pass struct by value, do we need to support it? If users on the other (non-Go) side have to define the function as taking arguments element-wise with primitive types and pointers, it would probably be better to define the same way on the Go side. Does any other language have a Wasm/WASI interface that allows passing struct by value?

(Same applies for arrays. Pointer to struct/array is fine.)

johanbrandhorst commented 6 months ago

I guess my question is whether a pointer-to-struct is ever passed from Go to the WASM platform, and therefore, what expectations the WASM side has about the layout of the fields of that structure.

I think the biggest concern around this is that all 64 bit values use 8 byte alignment, as you say. We definitely want this, so I think that on its own makes the case for structs.HostLayout. For other values, I think we want to just use "natural alignment" (4 byte for 4 byte values, etc). As far as we know, there is no strict enforcement of this in Wasm generally, but this is the approach taken by LLVM, so it probably makes sense for us to keep it the same.
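"Natural alignment" can be sketched as a small layout calculation in which every scalar field is aligned to its own size. The helper names below are hypothetical, and the sketch only covers scalar fields of size 1, 2, 4 or 8 bytes.

```go
package main

import "fmt"

// alignUp rounds offset up to the next multiple of align, which must be a
// power of two.
func alignUp(offset, align uint32) uint32 {
	return (offset + align - 1) &^ (align - 1)
}

// fieldOffsets lays out scalar fields of the given sizes using natural
// alignment (alignment equals size). It returns each field's offset and the
// total size, padded to the largest alignment seen.
func fieldOffsets(sizes []uint32) (offsets []uint32, total uint32) {
	maxAlign := uint32(1)
	for _, size := range sizes {
		off := alignUp(total, size)
		offsets = append(offsets, off)
		total = off + size
		if size > maxAlign {
			maxAlign = size
		}
	}
	return offsets, alignUp(total, maxAlign)
}

func main() {
	// A uint8, then an int64, then an int32.
	offsets, total := fieldOffsets([]uint32{1, 8, 4})
	fmt.Println(offsets, total) // [0 8 16] 24
}
```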

I also don't know that it's an important question for this proposal in particular, since the answer is pretty clear regarding what we should be passing in the call instruction when encountering a pointer (an i32). I'm happy to weigh in on #66408 if needed to have this discussion though.

As the ABI doesn't have a way to pass struct by value, do we need to support it?

It's fair to say that we can just not support structs and arrays by value, their use is likely to be limited (why not use a pointer?), and it would significantly simplify the implementation. We can come back to it if we need to later. I'll update the proposal.

aykevl commented 5 months ago

On the TinyGo side we're working on an implementation of this proposal, so here's my perspective on it from TinyGo:

Question: what fields would be allowed in these structs? I would assume a struct with a chan field would be disallowed, for example. This isn't part of the proposal yet though, so perhaps this can be added? Something like this:

Structs may not be passed by value, but pointers to structs are allowed. Every field in a struct must be one of the allowed parameter types, or be a struct (recursively).
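The suggested rule could be checked along these lines. This is a sketch using reflection rather than anything the compiler does, and allowedField and allowedStruct are hypothetical names. It does not inspect what pointer fields point to, it leaves string fields out because their status differs between ports, and it ignores the structs.HostLayout requirement.

```go
package main

import (
	"fmt"
	"reflect"
)

type point struct {
	x, y float64
}

// good contains only scalars, a nested struct and an array of scalars.
type good struct {
	tag   uint8
	pos   point
	probe [4]uint16
}

// bad contains a chan field, which the suggested rule would reject.
type bad struct {
	id    int32
	ready chan struct{}
}

// allowedField reports whether t may appear inside a struct that is passed
// by pointer: scalars and pointers are accepted, arrays and nested structs
// are checked recursively, everything else is rejected.
func allowedField(t reflect.Type) bool {
	switch t.Kind() {
	case reflect.Bool,
		reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64,
		reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64,
		reflect.Uintptr, reflect.Float32, reflect.Float64,
		reflect.Pointer, reflect.UnsafePointer:
		return true
	case reflect.Array:
		return allowedField(t.Elem())
	case reflect.Struct:
		for i := 0; i < t.NumField(); i++ {
			if !allowedField(t.Field(i).Type) {
				return false
			}
		}
		return true
	default:
		return false // chan, map, slice, func, interface, string, ...
	}
}

// allowedStruct checks the dynamic type of v, which must not be nil.
func allowedStruct(v any) bool { return allowedField(reflect.TypeOf(v)) }

func main() {
	fmt.Println(allowedStruct(good{})) // true
	fmt.Println(allowedStruct(bad{}))  // false
}
```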

cherrymui commented 5 months ago

TinyGo has always had a 32-bit wasm implementation (int, uintptr and pointers are 32-bit). Therefore, it would make sense to allow these values at all times.

I think this is fine. And we should allow them in Go gc toolchain for the "wasm32" port.

Structs may not be passed by value, but pointers to structs are allowed. Every field in a struct must be one of the allowed parameter types, or be a struct (recursively).

Yeah, something along this line makes sense. And also for arrays. I'd say a struct field or a struct pointed by a field should also have the HostLayout marker (because the marker is not recursive).

johanbrandhorst commented 5 months ago

Thanks for your thoughts Ayke, it's always appreciated.

TinyGo has always had a 32-bit wasm implementation (int, uintptr and pointers are 32-bit). Therefore, it would make sense to allow these values at all times. That's a possible compatibility concern, but in essence we're already incompatible so I'm not sure how much of an issue this is. Thoughts?

As Cherry says, these values will be allowed since this proposal is restricted to the wasm32 architecture. The wasm architecture will not have these new relaxations applied. I'm not sure I understand the incompatibility?

Structs may not be passed by value, but pointers to structs are allowed. Every field in a struct must be one of the allowed parameter types, or be a struct (recursively).

I'd say a struct field or a struct pointed by a field should also have the HostLayout marker (because the marker is not recursive).

I've added some clarifying words to the proposal, please take a look!

aykevl commented 5 months ago

The updated proposal looks good to me! (If I'm very pedantic, it doesn't explicitly say that a struct in a struct is allowed, though it clearly should be. Right now it says *struct is allowed but struct isn't).

However, I have to say that @ydnar has pointed out that the Canonical ABI also allows structs, and it would be nice to have them supported in //go:wasmimport. That said, if I'm reading the specs correctly, the Canonical ABI and the C ABI are incompatible when it comes to structs: the C ABI passes structs by value only when it contains only one field after flattening, while the Canonical ABI passes records (similar to structs) by value if the number of fields is 16 or less after flattening. So that means //go:wasmimport would have to choose between the C ABI and the Canonical ABI.

As Cherry says, these values will be allowed since this proposal is restricted to the wasm32 architecture. The wasm architecture will not have these new relaxations applied. I'm not sure I understand the incompatibility?

Never mind, TinyGo doesn't even support GOOS=js GOARCH=wasm tinygo ..., it just uses tinygo -target=wasm. So in essence tinygo -target=wasm ... is equivalent to GOOS=js GOARCH=wasm32 go .... Basically it has always been a GOARCH=wasm32 implementation and never supported what would be GOARCH=wasm (with 64-bit pointers).

I'd say a struct field or a struct pointed by a field should also have the HostLayout marker (because the marker is not recursive).

Seems like a good idea. It's easier to remove such a restriction in the future (if it turns out to be unnecessary) than it is to introduce it later. But I don't know Go internals well enough to say it is needed.

johanbrandhorst commented 5 months ago

To comment quickly on the Canonical ABI: it doesn't relate to this proposal directly as this proposal only targets the wasip1 port, and the Canonical ABI is a preview 2 document (as far as I know). A hypothetical wasip2 proposal would have to tackle type constraints for go:wasmimport (and go:wasmexport) as they relate to the Canonical ABI.

cherrymui commented 5 months ago

I'm okay with supporting passing structs by value if there is a widely used ABI that is not too complex (if it is as complex as the ELF C ABI on amd64, I'm not sure). If currently there is no widely agreed ABI for structs, we can wait. We can always add things later.

ydnar commented 5 months ago

Hi, original author of the relaxed type constraints proposed here.

The TinyGo PR where this originated depends on LLVM to flatten structs and arrays. This works in practice most of the time, except when it doesn't: namely the Component Model and WASI 0.2 extensively use tagged unions (variant types in WIT).

The code generator (wit-bindgen-go) implements the flattening rules as specified in the Canonical ABI, which then leans on LLVM to flatten the Go structs that represent variant types.

The CABI flattening rules are per-field, so if a variant has a case that includes a 64-bit wide field, then the flattened representation of the variant must use an i64 at that position.
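The per-position rule can be sketched as follows. This is based on my reading of the join step in the Canonical ABI's flattening of variants, so treat it as an approximation rather than a reference implementation. The function names and the use of strings for core value types are choices made for this illustration.

```go
package main

import "fmt"

// join merges two core value types that occupy the same position in
// different variant cases.
func join(a, b string) string {
	if a == b {
		return a
	}
	if (a == "i32" && b == "f32") || (a == "f32" && b == "i32") {
		return "i32"
	}
	return "i64"
}

// flattenVariant merges the already-flattened payloads of each case position
// by position, after an i32 discriminant.
func flattenVariant(cases [][]string) []string {
	flat := []string{"i32"} // discriminant
	for _, payload := range cases {
		for i, vt := range payload {
			pos := i + 1 // skip the discriminant
			if pos < len(flat) {
				flat[pos] = join(flat[pos], vt)
			} else {
				flat = append(flat, vt)
			}
		}
	}
	return flat
}

func main() {
	// One case carries an i32, another carries an i64 followed by an f32.
	fmt.Println(flattenVariant([][]string{{"i32"}, {"i64", "f32"}})) // [i32 i64 f32]
}
```

The first payload position ends up as i64 even though one case only needs an i32, which is the situation a compiler unaware of the other cases cannot reproduce.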

Given that the compiler is ignorant of the CABI layout, this strategy cannot correctly represent these variant types when passed by value.

@cherrymui: LLVM does correctly flatten structs and arrays consistent with the CABI spec (my sense is the former informed the latter). If we want to start with a more constrained set of types now and relax later, we can make that work.

aykevl commented 5 months ago

LLVM does correctly flatten structs and arrays consistent with the CABI spec (my sense is the former informed the latter).

Not exactly. If you pass an LLVM struct like {i32, i32} by value, LLVM will happily flatten the struct and pass it as values. But if you do that in C, Clang will pass the struct by reference, not by value: it will reserve some space on the stack and pass a pointer instead. See: https://godbolt.org/z/YjKj5o3c4

I believe this is why the Component Model lowers everything to bare i32/i64/f32/f64/pointer values in function signatures, which have no ambiguity in what ABI they should have on the WebAssembly level.

ydnar commented 4 months ago

With the exception of the check for structs.HostLayout, I’ve implemented the rules as defined in this proposal here in TinyGo: https://github.com/dgryski/tinygo/pull/16/commits/6921ce6520f8dc6247ccccfa9d094820df33174b

Initial (shallow) tests against WASI 0.2 APIs seem to work.

johanbrandhorst commented 4 months ago

I've updated the proposal to explicitly state that [...]T are allowed as struct fields when a pointer of that type is used as an input parameter.

ydnar commented 4 months ago

Given that unsafe.Pointer is allowed as a result type, and *T is allowed as a parameter type, I think it's reasonable to expect pointers to Go-managed memory to pass between guest and host, particularly with the addition of go:wasmexport.

Given this, I propose relaxing the result types to allow *T as well.

This would result in more type-safe interfaces. It would also be more or less symmetric with parameter types, with the exception that string would not be permitted as a return type, as it decomposes into a pair of uintptr.

johanbrandhorst commented 4 months ago

Given that unsafe.Pointer is allowed as a result type, and *T is allowed as a parameter type, I think it's reasonable to expect pointers to Go-managed memory to pass between guest and host, particularly with the addition of go:wasmexport.

Given this, I propose relaxing the result types to allow *T as well.

To paraphrase some discussions we've had on this, there are some outstanding questions before I think we can make this change, namely:

  1. Allowing *T result parameters for go:wasmimport functions would make it (probably?) impossible for the GC to track the underlying memory without making assumptions about how it was created. E.g. even if we could assume that any *T returned are allocated by a function exported by Go for the purposes of allocating memory within its memory space, how do we map that allocation (probably a []byte) to the *T, such that the GC can track the lifetime of the *T and free the memory as appropriate? We either end up freeing the []byte before we're done with the *T or end up tracking the []byte as in use forever while freeing the *T when it's done.
  2. Even allowing *T result parameters for go:wasmexport functions might not be safe, as the GC may assume that a piece of memory is no longer in use once we have returned from the function execution, though it probably shouldn't run until another export is called, so it may be safe until another export is called.

My primary motivation behind this change is to allow better ergonomics when using Wasm without compromising on safety. If the argument for allowing *T is that unsafe.Pointer is allowed, I think there is clearly a difference in user expectations when using *T and using unsafe.Pointer, and it's important that we only allow *T if we can do so safely. Unless we are confident we can allow using *T as a result parameter safely, including if we have to restrict it to be used with wasmimport or wasmexport only, I think we should wait to include it.

johanbrandhorst commented 3 months ago

Are there any outstanding concerns for this proposal?

rsc commented 2 months ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

cherrymui commented 2 months ago

I think the updated proposal is pretty reasonable.

Type passed to host | Type read from host

What exactly do they mean? One for wasmimport, one for wasmexport? One for parameter, one for result? It would be good to specify more clearly. I also don't understand why they are not the same. E.g. string is allowed as a "Type passed to host", but not a "Type read from host". Would it mean that the host can only read a string, but cannot construct a string?

For passing pointers between the Go Wasm module and host w.r.t the GC, it is similar to passing pointers in cgo. Cgo has pointer passing rules https://pkg.go.dev/cmd/cgo#hdr-Passing_pointers . Basically

Go code may pass a Go pointer to C provided the memory to which it points does not contain any Go pointers to memory that is unpinned. ...

C code may keep a copy of a Go pointer only as long as the memory it points to is pinned.

Maybe we want to have similar rules?

Also, if we allow unsafe.Pointer which is allowed to point to Go memory, it is not that different from allowing a *T. So maybe we want to have the same allowance (and restriction) for unsafe.Pointer?

GOARCH=wasm

The major difference between wasm and wasm32 is the size of int and pointers. It is probably reasonable to not allow types containing them. Smaller scalars (bool, uint8, etc.) are the same on both, so we probably could allow them on GOARCH=wasm (translate to Wasm i32 the same way).

ydnar commented 2 months ago

The major difference between wasm and wasm32 is the size of int and pointers. It is probably reasonable to not allow types containing them. Smaller scalars (bool, uint8, etc.) are the same on both, so we probably could allow them on GOARCH=wasm (translate to Wasm i32 the same way).

If one authors a wasm function with an int (or a better example uintptr), I expect they know what they're doing (e.g. somewhat similar argument for supporting *T).

Perhaps omit int/uint and permit uintptr?

johanbrandhorst commented 2 months ago

I think the updated proposal is pretty reasonable.

Type passed to host | Type read from host

What exactly do they mean? One for wasmimport, one for wasmexport? One for parameter, one for result? It would be good to specify more clearly. I also don't understand why they are not the same. E.g. string is allowed as a "Type passed to host", but not a "Type read from host". Would it mean that the host can only read a string, but cannot construct a string?

Great question. "Type passed to host" means passed from the Go memory space to the host memory space, and vice versa for "Type read from host". The asymmetry arises from the desire to have the Go GC know about and own all memory pointed to. Passing a *T into a wasmexport function, or returning a *T from a wasmimport function would mean the Go GC has a pointer value without knowing where the original memory was allocated. We did consider having some sort of automated mapping function for a potential exported alloc that the host could call to allocate memory, but it's not clear to me how we could map that to values returned from the host. I'm also not familiar with the exact rules for this in CGO, but when writing this proposal we assumed that uncontrolled memory would be undesirable. I'm happy to relax that constraint if we can make it work.

Strings are unsupported as wasmimport result parameters because we only allow a single result parameter, and strings are synthesized using a pointer and a len tuple. Conversely, they are disallowed as wasmexport input parameters because it would mean passing host allocated memory into Go.

I've renamed the table headings to "Export result/Import parameter" and "Export parameter/Import result". Does that make it clearer?

For passing pointers between the Go Wasm module and host w.r.t the GC, it is similar to passing pointers in cgo. Cgo has pointer passing rules https://pkg.go.dev/cmd/cgo#hdr-Passing_pointers . Basically

Go code may pass a Go pointer to C provided the memory to which it points does not contain any Go pointers to memory that is unpinned. ... C code may keep a copy of a Go pointer only as long as the memory it points to is pinned.

Maybe we want to have similar rules?

Yes, the rules regarding pinning look great. I read through that whole article and still couldn't quite make sense of what it means for C allocated pointers passed to Go functions (analogous to host allocated pointers passed to wasmexport functions) or C allocated pointers returned from C functions (analogous to host allocated pointers returned from wasmimport functions). If we can make it work for C we should be able to make it work the same for wasmimport and wasmexport though, I think.

Also, if we allow unsafe.Pointer which is allowed to point to Go memory, it is not that different from allowing a *T. So maybe we want to have the same allowance (and restriction) for unsafe.Pointer?

To me, there's a great difference between these two. One of them has the user deliberately opting in to manual memory management, while the other looks like real, safe Go code. If we can't provide the usual memory guarantees (and I don't see how we can do that with host allocated memory), we shouldn't allow *T, in my opinion.

GOARCH=wasm

The major difference between wasm and wasm32 is the size of int and pointers. It is probably reasonable to not allow types containing them. Smaller scalars (bool, uint8, etc.) are the same on both, so we probably could allow them on GOARCH=wasm (translate to Wasm i32 the same way).

Perhaps omit int/uint and permit uintptr?

This proposal is limited to the wasm32 architecture. There's no ambiguity in integer and pointer sizes, so we shouldn't need to disallow int or uint. uintptr is allowed both as input and result parameter.

We could expand this proposal to include a type relaxation for the wasm architecture, but I think that'd be better served in a separate proposal, and TBH I don't think we want to encourage people to use wasip1/wasm once wasip1/wasm32 is available, so I'm OK with the asymmetry in ergonomics.

cherrymui commented 2 months ago

Thanks for the reply!

I'm not really sure we need to distinguish "to host" and "from host". With both wasmimport and wasmexport, one can e.g. pass a Go pointer from an argument of wasmimport back to Go as an argument of wasmexport. I think that should be fine. Of course, some types are just impossible, like string as a result (for both wasmimport and wasmexport).

For the same reason, passing a *T as an argument to wasmexport seems okay if it is from an argument of wasmimport.

Of course, there is still an issue about memory safety -- what the host can do for Go managed memory. I think this is very similar to cgo, and we probably want to apply similar restrictions, e.g. the host can read Go memory, can write pointerless data (like the content of a byte buffer) but cannot write Go pointers to Go memory, and cannot hold on to Go pointers unless they are pinned. Even with the distinction between "to host" and "from host", I think we still need the restrictions, e.g. it is allowed to pass a Go pointer as an argument of wasmimport to the host, but the host should not hold on to it after the call has returned.

johanbrandhorst commented 2 months ago

That sounds good to me, the rules around CGO have worked fine for CGO so we can relax some static guarantees with the use of convention. I've updated the proposal to remove the distinction between passing to the host and reading from the host, and also added a section to the proposal about relying on the CGO conventions for safe use.

cherrymui commented 1 month ago

The updated proposal looks good. Thanks!

For consistency, it may still be reasonable to apply similar relaxations on GOARCH=wasm. wasip1/wasm is probably not going to go away very soon (especially given that wasm32 isn't available yet).

johanbrandhorst commented 1 month ago

Thanks for the thorough summary, I agree that this would clearly provide value to users. I've updated the proposal to detail the aim to support both wasm and wasm32. For now, we allow precise primitive types (excluding int and uint), and *struct not containing pointer or string fields. Just as on wasm32, struct types and any struct fields must recursively embed HostLayout.

rsc commented 1 month ago

Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group

The proposal is https://github.com/golang/go/issues/66984#issue-2257908454

aclements commented 1 month ago

I think this is looking really good. A few thoughts:

ydnar commented 1 month ago

I think this is looking really good. A few thoughts:

  • I'm slightly concerned about allowing int and uint. The argument is that on wasm32, they're 32 bits in Go, so can be safely passed as i32, but the size of int and uint aren't specified by the language (the spec only says that they can be 32 or 64 bits). Of course, the platform is what defines the size of these, so I could see an argument that wasm32 defines them to be 32 bits and thus interchangeable as an i32. But by the logic of that argument, wasm should define them to be 64 bits and thus interchangeable as an i64.

If a developer chooses to use int, uint or uintptr in a wasmimport or wasmexport signature, I think we can assume they know what they’re doing, and expect a 32-bit or 64-bit representation depending on GOARCH.

Edit: this proposal constrains int and uint to GOARCH=wasm32, so until there’s wasm64, I think this is moot?

  • What happens if the user passes a *struct containing a smaller integer type like uint8? If uint8 is passed as a parameter, the proposal defines that it gets expanded to an i32, but we can't change the size of the field in the struct. Is wasm prepared to receive a pointer to a struct containing a field smaller than 4 bytes?

The struct isn’t directly passed over the ABI, so it doesn’t matter. That’s where structs.HostLayout comes in. Some context: https://github.com/golang/go/issues/63131#issuecomment-1925945414

aclements commented 1 month ago

Edit: this proposal constrains int and uint to GOARCH=wasm32, so until there’s wasm64, I think this is moot?

Since I'm not deep in wasm, it took me a bit to understand your argument here. I think the key context is that we've defined GOARCH=wasm to be "32-bit wasm external interfaces, but it looks like a 64-bit CPU to Go programs" and int and uint are a place where these conflict, so we define that they cannot be passed on GOARCH=wasm. Is that right?
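To make the ambiguity concrete, the following sketch prints the sizes Go assigns to `int`, `uintptr` and the fixed-width integer types on whatever GOARCH it is built for. The helper name `intBytes` is made up for this example. The fixed-width types have the same size everywhere, while `int` and `uintptr` follow the platform, which is why they are the contested cases.

```go
package main

import (
	"fmt"
	"strconv"
	"unsafe"
)

// intBytes reports the size in bytes of Go's int on the current GOARCH.
func intBytes() int { return int(unsafe.Sizeof(int(0))) }

func main() {
	// Platform-dependent: follows the GOARCH the program is built for.
	fmt.Printf("int:     %d bytes (strconv.IntSize = %d bits)\n", intBytes(), strconv.IntSize)
	fmt.Printf("uintptr: %d bytes\n", unsafe.Sizeof(uintptr(0)))

	// Fixed by the language spec: always 4 and 8 bytes.
	fmt.Printf("int32:   %d bytes\n", unsafe.Sizeof(int32(0)))
	fmt.Printf("int64:   %d bytes\n", unsafe.Sizeof(int64(0)))
}
```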

The struct isn’t directly passed over the ABI, so it doesn’t matter. That’s where structs.HostLayout comes in. Some context: https://github.com/golang/go/issues/63131#issuecomment-1925945414

Thanks. The struct layout is still part of the ABI, but you're right that HostLayout will tell Go to use whatever the wasip2 ABI for struct layout is, including for these smaller types (which happens to align with Go's current layout rules except on 64-bit types).
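A small sketch of what this means for a field narrower than 4 bytes, assuming Go 1.23 or later for the `structs` package. The `Event` type and its fields are invented for illustration. The point is that a `uint8` field keeps its one-byte size inside the struct and the next field is padded to its own alignment, so nothing is widened to `i32` when only a pointer to the struct crosses the boundary.

```go
package main

import (
	"fmt"
	"structs"
	"unsafe"
)

// Event is a hypothetical struct that would be passed to the host by pointer.
// Embedding structs.HostLayout asks the compiler to use the host's layout rules.
type Event struct {
	_     structs.HostLayout
	Kind  uint8  // stays 1 byte wide inside the struct
	Flags uint32 // placed at the next 4-byte boundary
	Stamp uint64
}

// kindSize returns the in-struct size of the uint8 field.
func kindSize() uintptr {
	var e Event
	return unsafe.Sizeof(e.Kind)
}

// flagsOffset returns the byte offset of the uint32 field.
func flagsOffset() uintptr {
	var e Event
	return unsafe.Offsetof(e.Flags)
}

func main() {
	var e Event
	fmt.Println("size of Kind:   ", kindSize())
	fmt.Println("offset of Flags:", flagsOffset())
	fmt.Println("offset of Stamp:", unsafe.Offsetof(e.Stamp))
	fmt.Println("size of Event:  ", unsafe.Sizeof(e))
}
```

The 64-bit field is the case mentioned above where the host layout and Go's default layout can differ on a 32-bit target, so its offset is printed rather than assumed.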

johanbrandhorst commented 1 month ago

A bit late to the latest discussion, but I'm happy to restrict the use of int and uint on both GOARCH=wasm and GOARCH=wasm32 to avoid any ambiguity. We've already disallowed it on GOARCH=wasm. I've updated the proposal.

johanbrandhorst commented 1 month ago

I missed the question about [u]int8 and [u]int16. I think that's also fair (disallowing them to avoid confusion). We can come back to these if they prove to be useful to the community.

ydnar commented 1 month ago

Removing [u]int8 and [u]int16 may have had the unintended side-effect of disallowing a common use case: passing *byte to a wasmimport call.

The constraints for *T limit T to allowed types, or structs that contain only allowed types. By removing [u]int8 and [u]int16 from the allowed type list, *T where T is [u]int8 or [u]int16 is no longer allowed. Nor would structs containing these types. Is that our intention here?

Also, how should the proposal deal with string (a tuple of *uint8 and uintptr)?
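For readers less familiar with the "tuple" view of a string, here is a minimal sketch of splitting a string into a data pointer and a length and putting it back together, using `unsafe.StringData` and `unsafe.String` (both available since Go 1.20). The helper names `stringParts` and `joinParts` are made up for this example, and how the compiler would actually lower a string parameter is not something this sketch claims to show.

```go
package main

import (
	"fmt"
	"unsafe"
)

// stringParts splits s into the (pointer, length) pair that a string
// parameter conceptually consists of.
func stringParts(s string) (*byte, uintptr) {
	return unsafe.StringData(s), uintptr(len(s))
}

// joinParts rebuilds a string from a data pointer and a length.
// The bytes must not be modified while the resulting string is in use.
func joinParts(p *byte, n uintptr) string {
	return unsafe.String(p, int(n))
}

func main() {
	p, n := stringParts("hello")
	fmt.Println(n, joinParts(p, n))
}
```

Seen this way, a string parameter needs two values on the Wasm side, which is also why a string result runs into the single-result limitation discussed in the proposal.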

johanbrandhorst commented 1 month ago

Great points Randy, I've updated the proposal to allow [u]int[8|16] as pointer types and struct fields. That will avoid the implicit conversion necessary to support these types as concrete parameters while allowing their use for structs and pointers.

aclements commented 1 month ago

I think "*[...]T where T is an allowed type" should also allow [u]int{8,16}. It might be worth just giving a name to the set of types that are allowed via indirection.

johanbrandhorst commented 1 month ago

Thanks, I defined the small integer types group and made it allowed for array elements.

ydnar commented 1 month ago

Can we consider limiting the structs.HostLayout requirement to structs with > 1 field? This would allow:

johanbrandhorst commented 1 month ago

I don't see a problem with allowing *struct{} without embedding structs.HostLayout. Not sure about structs with a single field. I've updated the proposal to allow *struct{} without the embedding.

aclements commented 1 month ago

Thanks for the updates!

One nit, in "Strings are not allowed as result parameters as Wasm practically does not allow more than 1 result parameter." the link points to HEAD, which has drifted. It looks like you meant to link to https://go.googlesource.com/go/+/refs/tags/go1.23.0/src/cmd/internal/obj/wasm/wasmobj.go#219.

johanbrandhorst commented 1 month ago

Thanks for the thorough check, updated 😄

aclements commented 1 month ago

The proposal is https://github.com/golang/go/issues/66984#issue-2257908454. Have all remaining concerns about this proposal been addressed?

cherrymui commented 1 month ago

Thanks for the discussion and update, @johanbrandhorst, @ydnar, and @aclements!

bool is also somewhat similar to the "small integer types", in that it is i32 on the Wasm side (although converting a bool to a (u)int32 needs more code than a simple conversion expression). Should we treat bool as one of the "small integer types"? Or is it too inconvenient to disallow bool to be passed directly?
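To illustrate the "more code than a simple conversion expression" point: Go has no `uint32(b)` conversion for a bool, so user code would need helpers along these lines if bool were disallowed as a direct parameter. The names `boolToI32` and `i32ToBool` are made up, and treating any non-zero value as true is an assumption of this sketch rather than something the proposal defines.

```go
package main

import "fmt"

// boolToI32 maps a Go bool to the 0/1 value passed as an i32.
func boolToI32(b bool) uint32 {
	if b {
		return 1
	}
	return 0
}

// i32ToBool maps an i32 back to a bool.
// Assumption: any non-zero value counts as true.
func i32ToBool(v uint32) bool {
	return v != 0
}

func main() {
	fmt.Println(boolToI32(true), boolToI32(false))
	fmt.Println(i32ToBool(1), i32ToBool(0))
}
```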

ydnar commented 1 month ago

I can see reasonable arguments for and against allowing small integers and bool values to be included in the list of allowed param types.

The conversion between i32 and bool values is straightforward and can be documented, as can the expansion and truncation between small integers and i32.

LLVM handles this for TinyGo today, so we have a pattern to follow. We have working code that uses bool today (as uint8 either 0 or 1).

If simplifying this proposal is preferred, maybe removing bool from allowed param list is acceptable.
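For reference, the expansion and truncation mentioned above correspond to ordinary Go integer conversions. The sketch below uses made-up helper names and shows zero extension for unsigned values, sign extension for signed values, and truncation on the way back. It describes plain Go semantics, not what the compiler or LLVM would emit at the Wasm boundary.

```go
package main

import "fmt"

// widenU8 zero-extends an unsigned 8-bit value to 32 bits.
func widenU8(v uint8) uint32 { return uint32(v) }

// widenI8 sign-extends a signed 8-bit value to 32 bits.
func widenI8(v int8) int32 { return int32(v) }

// truncU8 keeps only the low 8 bits of a 32-bit value.
func truncU8(v uint32) uint8 { return uint8(v) }

func main() {
	fmt.Println(widenU8(0xFF)) // zero extension
	fmt.Println(widenI8(-1))   // sign extension
	fmt.Println(truncU8(0x1FF)) // high bits are dropped
}
```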