WebAssembly / gc

Branch of the spec repo scoped to discussion of GC integration in WebAssembly
https://webassembly.github.io/gc/
Other
1k stars 73 forks source link

Ability to construct const/non-mutable arrays of unknown size from mutable arrays #570

Open mkustermann opened 3 weeks ago

mkustermann commented 3 weeks ago

It seems WasmGC currently can only create WasmGC arrays via the following instructions:

If the array's field type is const then after creation the array can only be accessed in read-only mode (accessing length & elements).

It can be very beneficial for compilers & optimizers to know that an array is const, so two loads at same index will always yield the same value, etc. So we'd like to take advantage of that. If we know the length of the array it's very convenient to use array.new_fixed. Though if we don't know the length and have the array values in a mutable array (i.e. field type is var) it seems the only way to create a const array from that would be to load elements from the mutable array, store into element section and then use array.new_elem. This seems super cumbersome.

Is there any simple way to create a const array with contents from a var array where the length is unknown?

It would be nice to have an array.copy() (related https://github.com/WebAssembly/gc/issues/313) or array.new_copy() (related: https://github.com/WebAssembly/gc/issues/367) that atomically create a const array and initializes it from a source array (const or var) - or a sub-range of one.

/cc @jakobkummerow @tlively @rossberg

mkustermann commented 3 weeks ago

Another issue regarding const/non-mutable arrays is that array.new_fixed is limited to 10k arguments. So it's rather hard to create const/non-mutable arrays that are larger than that (for mutable arrays that's easy to do via allocation + following store instructions)

rossberg commented 3 weeks ago

You are correct that the only way to create a useful immutable array is with array.new_fixed at the moment. The array.copy instruction you mention has been added to the proposal, but since it mutates its target, it isn't usable for immutable array. Something like array.new_copy would be a useful Post-MVP addition, as mentioned in #367.

For many or most use cases, it would be more flexible and useful to introduce readonly arrays (and generally fields), as mentioned in the Post-MVP document. That would also provide a way to initialise cyclic data structures, for which creating individual immutable arrays is not sufficient.

There also is a proposal for freezing values, though that's a more ambitious addition.

mkustermann commented 3 weeks ago

Thanks for the pointers, @rossberg !

For many or most use cases, it would be more flexible and useful to introduce readonly arrays (and generally fields), as mentioned in the Post-MVP document. That would also provide a way to initialise cyclic data structures, for which creating individual immutable arrays is not sufficient.

This mentions:

In order to prevent mutation after the fact, a third kind of mutability can be added: readonly. Unlike const, a readonly field or element can still be mutated, but only through another alias where it has var type. At the same time, readonly is a supertype of var (and also of const).

Though this isn't giving us the same opportunity for optimization: As a readonly field can actually be mutated via an aliased of var type. That means an optimizer such as binaryen or V8 cannot arbitrarily move loads from a readonly around or CSE them (e.g. hoisting it out of a loop may result in a different value being loaded as loop may modify it somewhere in callee that's invoked).

A common use case if a compiler emits immutable data structures (imagine deeply immutable/constant maps, sets, lists, ...). We want optimizers to be able to remove lookups at compile-time if the index is known at compile-time. Since such immutable data structures could easily exceed 10k limit, we cannot use array.new_fixed.

There also is a proposal for freezing values, though that's a more ambitious addition.

In here it says

... dynamic checks on freezable types accesses ...

This would incur a cost on write access to fields (a bit test from object header & trap if set I guess). So it incurs a performance penalty when creating those objects.

I think the need for deeply immutable non-DAG data structures is somewhat of a narrow use case compared to deeply immutable/constant DAG data structures.

A more ambitious thing would be to introduce a concept of ownership in the type system, then one could

This would be a zero overhead way to create array & initialize with full capability of existing instructions and afterwards get the benefits from const optimizations.

If that isn't feasible, we'd probably prefer to rely on a array.new_copy (https://github.com/WebAssembly/gc/issues/367). I'll leave also a comment on that issue (which is more concerned around fusing array creation & copy into it into one rather than about constness)