WebAssembly / component-model

Repository for design and specification of the Component Model
Other
946 stars 79 forks source link

Add optional fixed length to lists #304

Open lukewagner opened 7 months ago

lukewagner commented 7 months ago

This PR adds fixed-length lists as a "gated feature" (i.e., features spec'd and ready to implement, but not yet included in a Preview release), as proposed and discussed in #181.

badeend commented 7 months ago

I'm not too familiar with the ABI, but can a large fixed-size List blow the stack? If so, it might be an idea to fall back to heap allocation after a certain size.

lukewagner commented 7 months ago

The MAX_FLAT_PARAMS/RESULTS limits already in place end up protecting the number of "flattened" scalar params/results that go on the native stack (similar to how they limit the flattening of tuples), falling back to the heap, as you suggest.

ydnar commented 7 months ago

I’m glad to have fixed-width arrays added to WIT.

However from the perspective of an ABI implementer, I struggle with the representation of fixed-length arrays as a variation of list<T>, rather than as a new standalone type that shares some similarities to both tuple and list.

Concretely, I’d vote for a new type, let‘s call it array, that works exactly as described in this PR, but decouples its logic from list, leaving list unchanged from the existing spec. An array would unequivocally be a value type with non-optional type and len, leaving list as a type with a pointer to allocated storage.

The WIT syntax for array could be anything:

lukewagner commented 7 months ago

I think we shouldn't let the CABI representational concerns determine the WIT- and Component-Model-level names we choose; rather we should just focus on what's the least confusing things to folks reading/writing WIT.

Perhaps there is an argument that having both list<u8> and list<u8,4> share the same type-name "list" is confusing and so having different type names for the variable- and fixed-length cases is less confusing? I don't think so, but this is admittedly subjective. There is some precedent we can observe in Go and Rust, though: the syntax of the fixed-length thing ([T; N] in Rust, [N]T in Go) is a syntactic extension of the dynamic-length thing ([T] in Rust, []T in Go). Thus, there is evidence that having the fixed-length construct syntactically refine the variable-length construct is not confusing. But I haven't done an exhaustive survey of languages or anything.

Overall, while I'm not opposed to having array<T,N>, I think less names is better, as long as we're not introducing confusion. But I'd be interested to hear more thoughts from folks on this.

oovm commented 7 months ago

I don't think list contains any mutability constraints, neither capacity mutability nor internal element mutability.

Without any numbers, it means that this is a list with unknown capacity (unknown at compile time)

That is:

list<T>     = list<T, ?>
list<u8, 4> = tuple<u8, u8, u8, u8>

Mutability needs to be passed via abi options and verified

For example list<u8, 4> can be:

Even though length and size_of(u8) are known at compile time, they will still be passed in by host if required.


Therefore I suggest adding an abi option static-length, requiring the language to interpret it as an array, to be compatible with some languages ​​that do not have the concept of array(fixed length list).

Considering that array is a feature widely present in popular languages, it should require static by default, and then add the reverse dynamic-length

ydnar commented 7 months ago

Therefore I suggest adding an abi option static-length, requiring the language to interpret it as an array, to be compatible with some languages ​​that do not have the concept of array(fixed length list).

Considering that array is a feature widely present in popular languages, it should require static by default, and then add the reverse dynamic-length

How are tuple handled in those languages today?

oovm commented 7 months ago

@ydnar

For tuples, according to my classification, the current working mode is inline.

In other words, if list<u8, 4> is the syntactic sugar of tuple<u8, u8, u8, u8>, (i32, i32, i32, i32) will be passed in, instead of (ptr: i32).

Unless the input parameters are greater than 16 or the return value is greater than 1, static-length mode will be used. ​

package examples: tests;

world imports {
  import tuples;
}

interface tuples {
  tuple0: func(x: tuple<>);
  tuple1: func(x: tuple<u8>);
  tuple2: func(x: tuple<u8, u8>);
  tuple4: func(x: tuple<u8, u8, u8, u8>);
  tuple16: func(x: tuple<u8, u8, u8, u8, u8, u8, u8, u8, u8, u8, u8, u8, u8, u8, u8, u8>);
  tuple17: func(x: tuple<u8, u8, u8, u8, u8, u8, u8, u8, u8, u8, u8, u8, u8, u8, u8, u8, u8>);

  tuple-r0: func() -> tuple<>;
  tuple-r1: func() -> tuple<u8>;
  tuple-r2: func() -> tuple<u8, u8>;
}
(module
  (type (;0;) (func))
  (type (;1;) (func (param i32)))
  (type (;2;) (func (param i32 i32)))
  (type (;3;) (func (param i32 i32 i32 i32)))
  (type (;4;) (func (param i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32 i32)))
  (type (;5;) (func (result i32)))
  (type (;6;) (func (param i32 i32 i32 i32) (result i32)))
  (import "examples:tests/tuples" "tuple0" (func (;0;) (type 0)))
  (import "examples:tests/tuples" "tuple1" (func (;1;) (type 1)))
  (import "examples:tests/tuples" "tuple2" (func (;2;) (type 2)))
  (import "examples:tests/tuples" "tuple4" (func (;3;) (type 3)))
  (import "examples:tests/tuples" "tuple16" (func (;4;) (type 4)))
  (import "examples:tests/tuples" "tuple17" (func (;5;) (type 1)))
  (import "examples:tests/tuples" "tuple-r0" (func (;6;) (type 0)))
  (import "examples:tests/tuples" "tuple-r1" (func (;7;) (type 5)))
  (import "examples:tests/tuples" "tuple-r2" (func (;8;) (type 1)))
)
cpetig commented 2 months ago

I just created a new MR #384 with the same changes but all conflicts resolved