i31 as a ref type dimension like null

titzer commented 4 years ago

In the course of discussion https://github.com/WebAssembly/gc/issues/100, @gasche proposed making i31ref not a type, but a dimension in the ref type constructor. I'd like to pull this idea out for a dedicated discussion because it is not entirely clear to me if we can add this later or must add this up front. It also seems like it can make a difference in code generation, so we should probably think it through.

Concretely, instead of (or perhaps, in addition to) a single i31ref type which is in the middle of the hierarchy between anyref and (ref $T), we have:

For each type $T, then: (ref $T) which is a subtype of (ref i31 $T)

It is thus another independent dimension much the same way as null is a dimension for this type constructor.

(ref i31 $T) subtype of (ref null i31 $T)

When thinking about the code that would be generated by an engine for various reference operations, some engines may (already) need to generate these null checks:

null checks for struct.get, struct.set and corresponding array operations
null checks for casts between struct and array types

Of course, most engines will hopefully be able to do these implicit null checks via virtual memory tricks, but not all can.

We've been discussing the additional checks that would be needed for i31 tag checks in the context of downcasts from i31ref to a more specific reference type. In particular, they would be implicitly needed because i31ref is proposed as a subtype of anyref above all declared heap types (AFAICT).

I'd like to point out that tag checks can be avoided if we think of i31 values as just a big group of nulls: engines can elide the checks based on static information, which is exactly the same information provided in the null ref type dimension.

It also gives more expressive power to programs, because it is a very limited union type. In particular, using (ref i31 $T), a module could perform a (struct.get $T i) that has an implicit tag check that would trap on an i31 value in much the same way as it would trap on null. Without this, a program would have to use i31ref types everywhere and downcast everywhere explicitly; a potentially more expensive check than just a tag check.

sabine commented 4 years ago

I see that WASM already has 64-bit integers, which suggests, that, eventually, WASM might be usable as a 64-bit compilation target. We are very interested in a spec that allows 63-bit unboxed scalars in addition to 31-bit unboxed scalars.

With ref i63 $T not being a subtype of anyref, it does not matter which pointer size the WASM engine uses internally.

Do you think it will be possible to import/export ref i31 $T explicitly from a WASM module (i.e. not as a private type or anything, just a plain import from one module and export from another)?

jakobkummerow commented 4 years ago

I'm generally supportive of this suggestion. In particular because it allows modules that don't need/use i31ref to avoid some i31 checks (they may be cheap, but not doing them at all is even cheaper).

I'm not sure how many checks would be avoidable in practice. I assume that the type lattice will have to be:

eqref == (ref null i31 eq) <: anyref == (ref null i31 any)
(ref null i31 $T) <: eqref  (for $T being a struct or array type)
(ref $T <: (ref null $T) <: (ref null i31 $T)
(ref $T <: (ref i31 $T) <: (ref null i31 $T)

So in particular, downcasts from anyref/eqref to (ref $T)/(ref null $T) will still have to include i31 checks; only downcasts from (ref any) or (ref null any) or (ref eq) or (ref null eq) can skip the i31 checks.

With ref i63 $T not being a subtype of anyref

@sabine : I would have assumed that all (ref ...) types will continue to be subtypes of anyref, in which case (ref i63 ...) would force all pointers to be 64 bit wide (and would rule out NaN boxing as an implementation strategy). At any rate, I think i63 is a separate discussion, and fairly orthogonal to this issue.

sabine commented 4 years ago

Ah, sorry about derailing. If all (ref ...) types must still be subtypes of anyref, then (ref i63 ...) is obviously not a reasonable thing to have.

jakobkummerow commented 4 years ago

I just realized that #76 creates an interesting wrinkle here. If we have to disallow (ref i31 extern), then I think we have to disallow (ref i31 any) too, but if anyref means (non-i31!) (ref null any), then all the (ref i31 $T) can't be subtypes of it, and the eqref shorthand can't have the i31 bit either. Modules could still define (ref i31 eq), but the type would be incompatible with anyref or eqref.

So on the one hand, that would imply that (contrary to my previous post) downcasts from anyref would not have to include i31 checks; but on the other hand, it would also mean that there's no way to define a type for a global/local/parameter/field that can hold a struct/array ref, or an i31, or a funcref (which would have been nice in order to close a functionality gap compared with the "two-types" proposal).

One solution would be to introduce more general union types.

One solution would be to introduce another built-in type that sits between anyref and its non-externref subtypes, i.e. would be the new supertype of funcref and eqref.

Am I overlooking anything?

RossTate commented 4 years ago

Nice catch, @jakobkummerow! To rephrase, no GC proposal with i31ref can have a common supertype between i31ref and externref, which in particular means no top type. Currently the designs of the Type Imports and GC proposals (and, to a lesser extent, Typed Function References) are premised around having a top type. So either we redesign around not having a top type and instead letting each related family of reference types specify whether it needs boxed scalars (and possibly what kind of boxed scalars), or it seems we need to remove i31ref (likely forever).

tlively commented 4 years ago

It seems like this simplest and least intrusive change would be to split externref off of the type hierarchy. Applications that need to use subtyping and casting could just use anyref in its place. But just to check my understanding, #76 means that there would be some conversion overhead under this solution because JS numbers as anyrefs would need to be boxed to be differentiated from i31refs?

RossTate commented 4 years ago

You would also have to either disallow imported types from representing externref or also split imported types off of the type hierarchy.

jakobkummerow commented 4 years ago

76 means that there would be some conversion overhead under this solution because JS numbers as anyrefs would need to be boxed to be differentiated from i31refs?

That boxing would have to happen only if externref and i31ref overlap. It's a specific instance of a more general problem though: an externref pointer, being host-controlled, can have any arbitrary bit pattern. If we specify i31ref in Wasm with the expectation that it is encoded by giving one or more bits in a pointer particular meaning, then that means that in general, externref pointers could accidentally "look like an i31", which might lead to weird/unspecified behavior. Making sure that there is no overlap between externref and i31ref (or i31 type dimensions) means that externref pointers (including JS numbers) never have to go out of their way to accommodate the existence of i31 in Wasm.

titzer commented 4 years ago

@rossberg any thoughts?

If externref is a completely foreign reference type and split off the hierarchy, then an easy way to incorporate this into the binary format would be instead of the type construcotrs (ref null T) and (ref T) being 0x6C and 0x6B respectively (if encoded as a single byte), we could do:

(ref T) : 0x68 (0b01101000) (ref null T) : 0x69 (0b01101001) (ref i31 T) : 0x6A (0b01101010) (ref i31 null T) : 0x6B (0b01101011)

Basically, the low order two bits indicate the two orthogonal dimensions of the type: whether null or i31 values are allowed.

Also, currently anyref has its own type constructor, but we could just as easily specify that heap type index -1 (i.e. invalid) means any, so anyref would be encoded by one of the four above type constructors, followed by a -1 for the heap type index.

dcodeIO commented 4 years ago

What would be the non-canonical type of i31ref? (ref i31 i31)?

        any               i31
   ┌─────┼────┬─────┐
extern  func  eq   exn
         │    │
   signature* │
              │
          ┌───┴────┐
       struct*   array*

Dimensions: null? i31?

Given the i31 and extern overlap, I guess what's proposed above is

          any        i31     extern
    ┌──────┼─────┐
  func     eq   exn
    │      │
signature* │
           │
       ┌───┴────┐
    struct*   array*

Dimensions: null? (i31 | extern)?

disallowing mixing (ref i31 extern T)?

Makes me wonder, how does (ref null extern) work if the underlying extern pointer can have any bit pattern, as mentioned above, and is the exact pattern indicating null? Edit: Thinking further, that's probably covered by the type system, and while a host can theoretically pass in the bit pattern for null into an (ref extern), that's not a problem in practice?

titzer commented 4 years ago

I'm not sure what is meant by the "non-canonical type" of i31ref. Under this idea, i31 is not a type, in the same way that null is not a type.

I am also skeptical about the claim that externref values can have any bit pattern. If that were the case, then it could not have a common supertype with any other wasm value, heap or not, regardless of i31, since it would then be impossible to distinguish it (e.g. for a downcast) based on its bits alone. That's part of what that other issue was getting at: externrefs would have to be tagged by either being boxed or the top type being a fat pointer. The mapping of JavaScript numbers onto Wasm numbers is a semantic concern, and we're talking about representations here.

I think it's closer to the truth that a Wasm engine generally knows something about the encoding of externrefs of its host environment (e.g. they must be valid pointers into a common address space) and perhaps can work around those encodings if necessary. They generally aren't arbitrary bits.

dcodeIO commented 4 years ago

I'm not sure what is meant by the "non-canonical type" of i31ref. Under this idea, i31 is not a type, in the same way that null is not a type.

Hmm, interesting, so the closest type is (ref i31 eq) if non-nullable respectively (ref null i31 eq) if nullable, with initializers (i31.new (i31.const 0)) or (ref.null eq).

Makes me wonder about solving the i31 and extern overlap using a dimension then, since that'd imply removing externref as well which is already in phase 4 as part of reference types.

I am also skeptical about the claim that externref values can have any bit pattern. If that were the case, then it could not have a common supertype with any other wasm value

I am skeptical about externref in general. Once there's anyref, there isn't really much use for it anymore, and the last minute change of removing anyref and introducing externref made to reference types was already quite surprising to me. If there is indeed an overlap introduced here, which I'm not sure about anymore given your comment, then I think re-evaluating whether we really need the distinction might make sense. For instance, a clean approach to solve the overlap might be to rename externref to anyref in reference types (text format only change), and then build GC on top of that anyref, as initially planned. Or what am I missing?

rossberg commented 4 years ago

This is not a bad idea. It is one step further in the direction of a union type -- really, (ref null i31 $t) is (ref (union null i31 $t)). That raises the question, though, why stop there? Why shouldn't we also enable (ref (union null i31 $t $u $v)), which allows even more refined tracking of possible sets of references, and likewise allows eliding more checks.

In principle, any such refinement is useful. However, union types have rather problematic properties in terms of type-checking complexity and worst cases, which can explode quickly, in particular when also combined with intersection type, forms of which have also been suggested. And it's always possible to fall back to anyref as the largest union type, for the price of losing some precision. So it's a slippery slope, and I might be a bit hesitant to go there too quickly.

OTOH, maybe i31 is enough of a common case to justify tracking it specially (like null).

jakobkummerow commented 4 years ago

In the meantime, I've heard specific feedback that making i31 a ref type dimension would be more efficient/useful than the current proposal for surface languages that have primitive integers: they would typically represent such source-level integers as i31|$BoxedInt (for some custom $BoxedInt type). In the current proposal, this conceptual union type cannot be expressed more specifically than eqref, which has the downside that code processing such values must perform two checks: an i31 check, and then a checked cast to $BoxedInt. If there were a way to express this union type more accurately (either by making i31 a dimensional attribute (ref i31 $BoxedInt), or by introducing more general union types), then a single i31 check would be enough; at least if we type br_on_i31 properly.

rossberg commented 4 years ago

Yes, that's one obvious use case. Although you can easily find similar use cases for other kinds of small unions, so that alone isn't necessarily enough to justify special casing i31.

One problem I realised with making i31 a dimension unlike other reftypes is that that creates a problem for type imports/exports (and possibly generics): naturally, you want to be able to instantiate a type import with i31, but that wouldn't be possible if it was segregated into a separate dimension.

Proper union types would provide a coherent solution, but suffer from complexity (in particular, quadratic subtyping). Maybe we need to look for some middle ground.

taralx commented 4 years ago

Is it still quadratic if we don't support width subtyping of unions? The use cases I've gathered don't need it.

rossberg commented 4 years ago

@taralx, do you mean that A or A | B would not be subtypes of A | B | C? Wouldn't that defeat the purpose?

taralx commented 4 years ago

A|B wouldn't be a subtype of A|B|C, but A and B would be subtypes of both.

rossberg commented 4 years ago

That wouldn't compose, though. Imagine you have A|T, where T is an import, and that is latter instantiated with B|C. You'd need to either disallow unions over imports or disallow instantiating imports with unions, both of which seem like rather severe restrictions.

taralx commented 4 years ago

Let me think a bit more about it, but that might be ok. It probably requires coercive subtyping, though, which we've been avoiding so far.

RossTate commented 4 years ago

Subtyping for arbitrary union types is quadratic without recursive types. Subtyping for equi-recursive types is quadratic without arbitrary union types. The combination of these features is not quadratic and in fact may not even be polynomial. At least for the subtyping rules that would enable the standard automata-based subtyping algorithms, subtyping is PSPACE-complete. And that's without taking into account the contravariance that would be introduced by function types.

taralx commented 4 years ago

I'm left feeling like the correct thing is just to bound the "size" of types to make the problem tractable rather than restricting functionality to linear or near-linear complexity.

RossTate commented 4 years ago

Unfortunately low-level structural heap types tend to be quite large. If you think of a language like Java, a class's structural type includes not only all of its fields but also all of its overridable methods (as they are part of the v-table) as well as its meta-data (e.g. the data-structures used for interface-method dispatch, casting, and reflection) and then recursively the same components of any classes referenced by those fields or methods.

askeksa-google commented 2 years ago

I would like to resurface this issue. As @jakobkummerow mentions in https://github.com/WebAssembly/gc/issues/130#issuecomment-725415964, having i31 as a type dimension will save a cast during integer unboxing when integers use a hybrid i31-or-heap-box representation, which is the typical use of i31.

It goes further than this, though. When the source-language integer type is a subtype of other types (as is the case in Dart, where int has 3 supertypes: num, Object and Comparable), all of those supertypes can also potentially be i31, which means that every use of them requires an explicit cast on top of the i31 check. With i31 as a type dimension, that extra cast can be avoided.

There has been some discussion here about how to express in the type that a value is definitely an i31. My suggestion would be to not have such possibility, just like we currently don't have any possibility to express that a value is definitely null, even though that situation arises frequently. If you know that something is an integer, use an integer type for it.

With that in mind, it makes sense for br_on_i31 and br_on_non_i31 to convert the value to an i32 on the i31 branch. This is basically the only reasonable thing you can do with an i31 value, and with the typical representation of i31, such a combined check and conversion can be done very efficiently (just shift one to the right and branch on the carry flag).

Concretely, I propose to change the `i31`-related instructions as follows:	Instruction	Change
`i31.new`	Now takes a type immediate, similarly to `ref.null`.
`i31.get_s`, `i31.get_u`	Take any reference. Trap if the input is not an `i31`.
`ref.is_i31`	As before.
`ref.as_i31`	Replaced by `ref.as_non_i31`.
`br_on_i31`	Split into two: `br_on_i31_s` and `br_on_i31_u`. The `i31` branch delivers an `i32`.
`br_on_non_i31`	Split into two: `br_on_non_i31_s` and `br_on_non_i31_u`. Fall-through delivers an `i32`.

Conceptually, it would make sense for many of these instructions to require the input to be i31able, but due to implicit upcasting, such a requirement would be inconsequential.

askeksa-google commented 2 years ago

Another consideration is how other instructions will handle i31able types and i31 values. For maximum consistency with the current treatment of null, this would look something like:

Instructions that currently perform an implicit null check for nullable inputs will also perform an implicit i31 check for i31able inputs.

The type sharpening instructions ref.as_func, ref.as_data, ref.as_array, ref.as_non_null and ref.cast will preserve the i31ness of the input type and will let i31 values through at runtime. This means that a cast all the way from anything to a specific type could involve up to four steps: ref.as_non_i31 ref.as_non_null ref.as_data ref.cast While the only requirement on the ordering of these is that ref.as_data must come before ref.cast, there could be a "recommended" ordering that is like to produce the best code in (baseline) compilers by avoiding redundant checks.

br_on_ instructions will reject i31 values.

We could also decide to diverge from null here, or to reconsider how null is treated.

rossberg commented 2 years ago

Not having a non-nullable i31ref would be bad. Then we would perhaps save some casts here, but for the price of requiring more checks there. FWIW, I use the positive knowledge that something is a proper i31 in various places in my compilers, avoiding unnecessary checks.

The fact that we cannot express a nullref type is not a good reason – quite the opposite, in fact, since both @titzer and I have already run into that limitation, which causes significant pain. So we already know it's bad design, and should rather fix that for null rather than proliferating it.

(Fortunately, that seems easy enough: we could treat null and i31 as two separate, compositional dimensions, as sketched earlier.)

However, a more fundamental problem is what I mentioned above: with type im/exports, we need a way to define a type export to be i31. Otherwise we lack essential expressiveness. That implies that we cannot treat i31 as an orthogonal dimension like null (where this does not matter), because any type import could be i31.

Bottom-line: if we want to separate i31, then we need union types in a more principled manner. Before, we thought that's not viable, because were concerned with the quadratic complexity of subtyping that unions implies. But perhaps the explicit subtype declarations introduced with the iso-recursive types proposal eliminates that problem?

titzer commented 2 years ago

@rossberg IIUC, you are referring to a scenario where a module B has a declared type import and module A would like to export a type (whatever we decide to call that type) that is only i31 values, and instantiate B with that type? And therefore A has to name some type that is only i31 values?

@askeksa-google I think it makes sense that i31.get_? would require the i31 dimension. If the input type doesn't have that dimension, they would statically be known to always trap, so it'd be degenerate. My design inclination would be to reject degenerate cases in validation--we do that for impossible casts right now, IIRC.

askeksa-google commented 2 years ago

(Fortunately, that seems easy enough: we could treat null and i31 as two separate, compositional dimensions, as sketched earlier.)

Yes, that is certainly the model I have been assuming all along.

All of the issues you mention would seem to be solved by introducing an empty (aka never aka bottom) heap type. Then a definite null is ref null empty, a definite i31 is ref i31 empty, and a nullable i31 is ref null i31 empty. I have run into the null limitation myself and would be in favor of introducing such a type. Even a plain ref empty can be useful sometimes.

Whether the br_on_ instructions will convert the i31 value to i32 or not is then a somewhat separable issue. Both would work in the empty-type scenario, since this retains the possibility of sharpening to specifically i31.

@askeksa-google I think it makes sense that i31.get_? would require the i31 dimension. If the input type doesn't have that dimension, they would statically be known to always trap, so it'd be degenerate. My design inclination would be to reject degenerate cases in validation--we do that for impossible casts right now, IIRC.

It certainly makes sense, but we can't express such a requirement without enacting an exception to implicit upcasting. This would be confusing and could potentially break instruction sequence composability. Though with empty, we could specify their input type to be ref null i31 empty (essentially reverting to their current requirement).

titzer commented 2 years ago

I was also thinking through the implications of an explicit empty heaptype and I think it would also work to sneak in the "universal" null by way of ref.null empty.

I don't understand the issue with implicit upcasting (by this I assume you mean subsumption). Can you give a more concrete example?

askeksa-google commented 2 years ago

I was also thinking through the implications of an explicit empty heaptype and I think it would also work to sneak in the "universal" null by way of ref.null empty.

We could drop the type immediate on ref.null and always type it as ref null empty, since that is assignable to any nullable type.

I don't understand the issue with implicit upcasting (by this I assume you mean subsumption). Can you give a more concrete example?

Say we specify that an instruction requires its input to be ref i31 T for some T. Since ref T is a subtype of ref i31 T, it would be valid to give the instruction something which has static type ref T, essentially making the i31 requirement ineffective. We can only allow an input to be i31, not require it.

This is no longer the case with empty, since we can then specify the input type as ref i31 empty, meaning "i31 or nothing", thus practically requiring it to be i31.

With empty, the instruction changes listed above are no longer needed. We could keep the existing i31 instructions as they are (replacing ref null? i31 by ref null? i31 empty) and just add ref.as_non_i31.

rossberg commented 2 years ago

@titzer:

@rossberg IIUC, you are referring to a scenario where a module B has a declared type import and module A would like to export a type (whatever we decide to call that type) that is only i31 values, and instantiate B with that type? And therefore A has to name some type that is only i31 values?

Yes, exactly. From the perspective of implementing a module interface that provides an abstract type, choosing i31 as its representation is no different from choosing e.g. a concrete struct. Consequently, they must be handled more or less symmetrically. (To that end, there already is TODO in MVP.md about enabling i31 as a type definition.)

More generally, of course, you'd want to be able to pick a union of multiple such possibilities. We do not currently have a specific way to represent that, other than falling back to anyref. I31 is not particularly special in that regard.

@askeksa-google:

All of the issues you mention would seem to be solved by introducing an empty (aka never aka bottom) heap type.

We could drop the type immediate on ref.null and always type it as ref null empty, since that is assignable to any nullable type.

FWIW, that was the original design of reference types, but there was pressure to remove the nullref type in fear of bottom types. Though the argument for doing so was never really substantiated, and it was detrimental to the overall design, as we have already been seeing. I'd be happy to fix that, regardless of i31.

Yet, this would not solve the larger problem of providing a coherent story for type im/exports of i31 representations. That is quite fundamental.

I'll think some more about generalising this proposal to "pre-declared union types" to address that.

askeksa-google commented 2 years ago

Yet, this would not solve the larger problem of providing a coherent story for type im/exports of i31 representations. That is quite fundamental.

Then I'm not sure what you mean here. Do you mean that it should be possible to instantiate a type import with i31ref? That would still by possible with i31 as a type dimension and empty. ref null? i31 empty will mean exactly the same as ref null? i31 meant before.

rossberg commented 2 years ago

@askeksa-google, a type export applies to a type definition from the type section. That contains func, struct, or array definitions – and the aforementioned TODO would add i31 as a definition. This is orthogonal to dimensions in a reftype. So turning i31 into a dimension would move it to the wrong category, making it incompatible with exports.

Minimal toy example of a client module:

(module $A
  (type $t (import "B" "t"))
  (func $make (import "B" "make") (param i32) (result (ref $t)))
  (func $get (import "B" "get") (param (ref $t)) (result i32))

  (func (export "run") (param $x i32) (result i32)
    (call $get (call $make (local.get $x)))
  )
)

Now, by basic principles of abstraction, we need both the following implementations for B to be possible (among others):

(module $B_boxed
  (type $t (export "t") (struct (field $n i32)))
  (func (export "make") (param $x i32) (result (ref $t))
    (struct.new (local.get $x) (rtt.canon $t))
  )
  (func (export "get") (param $r (ref $t)) (result i32)
    (struct.get $n (local.get $r))
  )
)

or

(module $B_unboxed
  (type $t (export "t") i31)  ;; note TODO in MVP.md
  (func (export "make") (param $x i32) (result (ref $t))
    (i31.new (local.get $x))
  )
  (func (export "get") (param $r (ref $t)) (result i32)
    (i31.get_s (local.get $r))
  )
)

(For the sake of simple example, I'm assuming that the module's contract implies that the parameter has only, say, 16 significant bits.)

titzer commented 2 years ago

@rossberg The above reason is why I have always thought that type imports and exports should be types, and not heap type definitions. Then the constraint on a type import is just a subtype bound, which can specify any type, including our simple union type mechanism.

tlively commented 2 years ago

It would be nice if we could put off figuring out how to make this work with type imports and exports. Without considering that proposal, it seems that we would all agree that adding a bottom type and making i31 a type dimension would be a good solution for the GC proposal.

rossberg commented 2 years ago

@titzer, I know you do :), but that is a way more complex and much less obvious feature, and one that requires fundamental changes to Wasm's compilation model, as you know (also, to its type indexing model). To be honest, I still would not know how to design all that concretely.

It also is no substitute for exporting heap types. When a module exports some data representation, then clients may need to be able to, for example, form both nullable and non-nullable references to it (e.g., to define locals/globals/tables). Hence, requiring that choice to be made on the exporting side typically is the wrong factorisation (even if there may be use cases for that a well). Also note that null is very different from i31 in that regard.

@tlively, we can't put it off, since the suggestion would paint us into the wrong corner. What the above shows is that making i31 a property of the pointer instead of a representation type is a category mistake, even though the unboxing optimisation for i31 refs can easily mislead us into thinking otherwise. Or to put it differently: there is a fine line between designing semantics that enables certain optimisations and making optimisations the semantics itself. :)

The motivating argument of wanting to express a certain kind of type union to save some checks applies equally to other type unions. That suggests that we should approach this from a principled angle.

rossberg commented 2 years ago

While thinking more about how to pull off union types, I reread the OP of this issue and realised that some of the assumptions are actually off. Sorry that I had not noticed this earlier!

Concretely, instead of (or perhaps, in addition to) a single i31ref type which is in the middle of the hierarchy between anyref and (ref $T)

Ah, the type i31ref is not a supertype of any (ref $t)! In fact it is currently disjoint to all such types. That makes a significant difference regarding your conclusions further on. Specifically:

We've been discussing the additional checks [similar to null checks?] that would be needed for i31 tag checks in the context of downcasts from i31ref to a more specific reference type. In particular, they would be implicitly needed because i31ref is proposed as a subtype of anyref above all declared heap types (AFAICT).

I don't think there are nearly as many redundant checks as you seem to assume.

Let's walk through a few representative examples.

Example A: a representation that is the union of multiple struct types – for simplicity's sake, let's assume just two:
```
type $i32  =  struct i32
type $str  =  array i8
```
In the MVP, a value with such a representation can by typed as dataref. That rules out i31 altogether, i.e., there is no need to ever check for it, neither implicitly nor explicitly. Code processing a value $x with that representation will typically consist of the following sequence of type casts:
```
(local.get $x)   ;; stack has (ref data)
(br_on_cast $on_i32 (type $i32))
(br_on_cast $on_str (type $str))
(unreachable)
```
Nothing would change here with the addition of an i31 dimension. You'd use type (ref data) as before, and the generated code stays exactly the same.
Example B: same as A, but the representation also includes an i31 case. To allow that, the type of $x has to be generalised to eqref, yielding the following instruction sequence for processing it in the MVP:
```
(local.get $x)   ;; stack has (ref eq)
(br_on_i31 $on_i31)
(ref.as_data)    ;; now stack has (ref data)
(br_on_cast $on_i32 (type $i32))
(br_on_cast $on_str (type $str))
(unreachable)
```
The i31 dimension would allow us to use the slightly more precise type (ref i31 data) instead of eqref, which makes the downcast with ref.as_data unnecessary. Nothing else changes, once we know it's pure data, the i31 case does not matter anymore in the MVP either.

Furthermore, in the MVP we actually have eq = i31 xor data! That means that an engine can already deduce that the reference is (ref data) after the test for i31, and a simple peep hole optimiser can eliminate the ref.as_data as well.

Example C: same as B, but with only one struct case, say $i32. Again, the MVP would have to use eqref:

(local.get $x)   ;; stack has (ref eq)
(br_on_i31 $on_i31)
(ref.as_data)    ;; now stack has (ref data)
(br_on_cast $on_i32 (type $i32))
(unreachable)

This time, the i31 would help, because we could use type (ref i31 $i32) for $x. That simplifies the above to:

(local.get $x)   ;; stack has (ref i31 $i32)
(br_on_i31 $on_i31)  ;; after this, we have (ref $i32)
(br $on_i32)  ;; unconditional branch

So, the only case where an i31 dimension has a real benefit is a binary union like Example C, where there is only a single other case besides the i31. For all other examples, the MVP type hierarchy can already express essentially the same information. (Moreover, it would be straightforward to extend the hierarchy such that this remains so even if we extended eq in the future, i.e., by adding an abstract type for i31-or-data.) So overall, I think the win of an i31 dimension would be relatively small.

The biggest cost in the examples is actually the unnecessary br_on_str cast in the end, plus the seemingly redundant unreachable instruction. To eliminate it in more than just the special case of Example C, proper union types would be required. Then the sequence for Example B could be:

   (local.get $x)   ;; stack has (ref (i31 or $i32 or $str))
   (br_on_i32 $on_i31)  ;; after this, stack has (ref ($i32 or $str))
   (br_on_cast $on_i32 (type $i32))  ;; after this, stack has (ref $str)
   (br $on_str)  ;; this branch is unconditional

Obviously, this is not essential for the MVP. But if we wanted to optimise union representations better eventually, then that's the more useful route. And that requires no change to the MVP.

Does that make sense?

titzer commented 2 years ago

Ah, the type i31ref is not a supertype of any (ref $t)! In fact it is currently disjoint to all such types.

Ok, I did misunderstand this. But this means that if i31ref is not a supertype of (ref $T), then an explicit operation is required to turn a (ref $T) into an i31ref. This would just be subsumption otherwise. At the representation level, this is a nop.

If i31ref is indeed a completely disjoint (basically singleton) type, then AFAICT it already represents the union of dataref and i31 (but not below it, to the side). That has limited utility except for the very specific use case of implementing parametric types by erasure. It's a far more dynamically typed mechanism that what I expected we were designing.

As for the examples above, I think example C will be far more common than examples A or B, and I do think it's an important win. Otherwise, we are introducing a requirement that engines represent i31 efficiently (by not boxing and being able to unify them with references) but intentionally leaving out program's ability to specify more precise types, forcing them to do more case analyses (via casts) and redundant operations that would normally just be subsumption. I would take this even further than case C and say that accesses of struct fields of that have i31 in the reference type dimension should be allowed, and it's the engine's responsibility to dynamically test (and trap) on i31, like it does with null. We'll get the best code that way (in particular, a smart engine can unmap the low 2-8GiB of the engine's address space and all i31 values will cause a hardware trap, thus making the i31 check completely free, like checking for null).

As a path forward, we could design the general mechanism of the i31 dimension but only allow specifying i31 with (ref data). That's effectively equivalent (except it allows subsumption of a struct reference to it), would result in exactly the same casts as above, and would be forwards-compatible with generalization.

rossberg commented 2 years ago

Ah, the type i31ref is not a supertype of any (ref $t)! In fact it is currently disjoint to all such types.

Ok, I did misunderstand this. But this means that if i31ref is not a supertype of (ref $T), then an explicit operation is required to turn a (ref $T) into an i31ref. This would just be subsumption otherwise. At the representation level, this is a nop.

I'm not sure I follow. Currently, there is no type with i31 representation that inhabits any (ref $t). Once I add the aforementioned todo to be able to define $t = i31, then type (ref $t) would in fact be equivalent to i31ref, so subtyping isn't even needed.

If i31ref is indeed a completely disjoint (basically singleton) type, then AFAICT it already represents the union of dataref and i31 (but not below it, to the side).

Hm, I'm still lost. The union of data and i31 is eq in the MVP. And when lifted to value types as references, eqref is the union of i31ref and dataref (modulo cleaning up nullability via #266). Nothing much else is needed for the time being, AFAICS, it all composes as one would expect.dataref and i31 are not in the same category (one is a value type, the other a heap type), so they are just incomparable.

That has limited utility except for the very specific use case of implementing parametric types by erasure. It's a far more dynamically typed mechanism that what I expected we were designing.

As for the examples above, I think example C will be far more common than examples A or B, and I do think it's an important win.

I have to respectfully disagree. AFAIAA, example C has come up in none of the compilers that have been prototyped so far. They all needed either A or B. I would expect B to be the most common case by far, basically every managed language will use it in some form or the other. I can imagine very few scenarios where C would come up and not be a subset of some larger B. Do you have a specific scenario in mind?

titzer commented 2 years ago

I can imagine very few scenarios where C would come up and not be a subset of some larger B. Do you have a specific scenario in mind?

Yes, it comes up when, e.g. packing algebraic data types. For example, if some cases can be packed into 31 bits and others need to be boxed. I will use this in my Virgil compiler, first for native targets (including Wasm sans GC), and then when I retool it to target Wasm iso-recursive types. It also comes up when, ironically enough, implementing dynamic languages like JavaScript. There, you might want i31 (aka SMI) to be unioned with (ref $js_object). Distinguishing between SMIs and JavaScript objects (as opposed to other structs in the language runtime) is extremely frequent.

I know we need to be focused on use cases not unnecessarily drag the discussion towards generalities, but the compilers prototyped so far have taken only one approach to implementing polymorphism, which is erasure and dynamic typing. The inclusion of i31 at all seems to have only been motivated by that one use case. i31 is a considerably more general mechanism at the engine level and affords many opportunities like the ones I just described. It's an implementation-level union and IMHO is analagous to the null value.

Forgive my slight misunderstanding above. I did actually implement all of this in Wizard, but that was a year ago.

In the last part I wrote:

As a path forward, we could design the general mechanism of the i31 dimension but only allow specifying i31 with (ref data). That's effectively equivalent (except it allows subsumption of a struct reference to it), would result in exactly the same casts as above, and would be forwards-compatible with generalization.

AFAICT that still holds.

rossberg commented 2 years ago

Yes, it comes up when, e.g. packing algebraic data types. For example, if some cases can be packed into 31 bits and others need to be boxed.

Yes, that's the one example I could think of. But it only applies to the edge case where you have (a) exactly one non-nullary constructor, and (b) more than one nullary constructor (otherwise you could presumably use null). Not exactly the most common case. Why is optimising that case more pressing than any other?

It also comes up when, ironically enough, implementing dynamic languages like JavaScript. There, you might want i31 (aka SMI) to be unioned with (ref $js_object). Distinguishing between SMIs and JavaScript objects (as opposed to other structs in the language runtime) is extremely frequent.

Hm, any JS implementation I have seen requires more than two cases – there are boxed numbers, strings, null, etc., none of which are JS objects. So JS rather is an example of a language that would not benefit.

I know we need to be focused on use cases not unnecessarily drag the discussion towards generalities, but the compilers prototyped so far have taken only one approach to implementing polymorphism, which is erasure and dynamic typing.

Well, as I said, the B case comes up in practically all managed languages, and is far more frequent than the C case. C hardly ever comes up without B occurring as well. Even your ADT example is a case in point. I remain unconvinced that special semantics is justified for optimising such edge cases.

If we did special-case that, we'd essentially be designing an inherent performance cliff into the type system. That seems undesirable.

In the last part I wrote:

As a path forward, we could design the general mechanism of the i31 dimension but only allow specifying i31 with (ref data). That's effectively equivalent (except it allows subsumption of a struct reference to it), would result in exactly the same casts as above, and would be forwards-compatible with generalization.

AFAICT that still holds.

I didn't respond to that earlier, because I don't understand what that would achieve. As mentioned above, the type (ref i31 data) is effectively indistinguishable from (ref eq) at the moment, so rather redundant. Example C would require (ref i31 $t) specifically. So what would you gain under this restriction?

Also, any path forward should be generalisable to proper unions somehow. It's not clear to me how this would. AFAICS, it would be a design dead end. And let's not forget, it likewise remains completely unclear how to reconcile it with type exports.

askeksa-google commented 2 years ago

I know we need to be focused on use cases not unnecessarily drag the discussion towards generalities, but the compilers prototyped so far have taken only one approach to implementing polymorphism, which is erasure and dynamic typing. The inclusion of i31 at all seems to have only been motivated by that one use case. i31 is a considerably more general mechanism at the engine level and affords many opportunities like the ones I just described. It's an implementation-level union and IMHO is analagous to the null value.

I think this is very well put. A union between i31 and a struct type is the canonical use case for i31. It's just one use case, true, but it can be rather prolific in languages where integers are considered objects.

For instance, in Dart, the redundant cast (that would be avoided by the union) occurs in mainly the following situations:

Calling a method on any supertype of int, i.e. Object, num or Comparable.
Using a nullable int? that was promoted by a null check.

Since an i31 check is extremely cheap (typically one or two instructions, including the branch), avoiding this cast could be quite significant.

rossberg commented 2 years ago

@askeksa-google:

I think this is very well put. A union between i31 and a struct type is the canonical use case for i31.

The canonical use case is the union between i31 and a set of structs. The single struct case is no more special than the 2 structs case is. ;)

Does Dart not use a larger union? Of course, there will be places where you have narrowed it down to a binary union, but arent't there equally places where you have narrowed it down to a ternary union? Or a binary union not involving i31?

Since an i31 check is extremely cheap (typically one or two instructions, including the branch), avoiding this cast could be quite significant.

Now I'm confused. Wouldn't it be more significant if it was not cheap?

askeksa-google commented 2 years ago

Does Dart not use a larger union? Of course, there will be places where you have narrowed it down to a binary union, but arent't there equally places where you have narrowed it down to a ternary union? Or a binary union not involving i31?

I suppose num could be represented as i31 | _BoxedInt | _BoxedDouble (or _BoxedInt | _BoxedDouble if not using i31), which could save a cast in a few places, mostly when doing math directly on num values.

Since an i31 check is extremely cheap (typically one or two instructions, including the branch), avoiding this cast could be quite significant.

Now I'm confused. Wouldn't it be more significant if it was not cheap?

What I mean is that having an i31 check instead of a cast (which the union type enables, for instance in the int supertype method call case) is a significant saving because an i31 check is cheaper than a cast.

To be clear: having i31 as a type dimension or being able to express it using more general union types is all the same for my use cases. I just find i31 of dubious utility without any of the two (at least it becomes more of an unclear trade-off).

rossberg commented 2 years ago

@askeksa-google, the primary motivation for i31 is saving space, allocations, and garbage, which is quite significant. Cheaper checks are more like a nice by-product. Example C is the only one for which an i31 dimension would allow a cheaper check.

askeksa-google commented 2 years ago

Yes, that's one side of the trade-off. The other side is that using i31 for integers comes with additional overhead on integer boxing and unboxing (independently of this issue), plus, in the absence of i31/struct unions, additional overhead on accessing objects that might be integers.

rossberg commented 2 years ago

The other side is that using i31 for integers comes with additional overhead on integer boxing and unboxing (independently of this issue), plus, in the absence of i31/struct unions, additional overhead on accessing objects that might be integers.

Not sure I understand what you are referring to. Compared to what? If we did not have i31, then these cases would be even more expensive, right?

askeksa-google commented 2 years ago

I'll illustrate with some concrete examples of what various situations would look like in dart2wasm.

Boxing an `int`

Without i31:

(i32.const intClassId)
(local.get value)
(struct.new $BoxedInt)

With i31:

(local.get value)
(i64.const 33)
(i64.shl)
(i64.const 33)
(i64.shr_s)
(local.get value)
(i64.eq)
(if
  (local.get value)
  (i32.wrap_i64)
  (i31.new)
else
  (i32.const intClassId)
  (local.get value)
  (struct.new $BoxedInt)
end)

(There are several ways to perform the range check. They all seem to be similarly involved.)

Unboxing an `int`

Without i31, from an Object:

(ref.cast $BoxedInt)
(struct.get $BoxedInt 1)

Without i31, from an int?:

(struct.get $BoxedInt 1)

With i31, without union type:

(block $done
  (block $is_i31
    (br_on_i31 $is_i31)
    (ref.as_data)
    (ref.cast $BoxedInt)
    (struct.get $BoxedInt 1)
    (br $done)
  end)
  (i31.get_s)
  (i64.extend_i32_s)
end)

With i31, with union type, from an Object:

(block $done
  (block $is_i31
    (br_on_i31 $is_i31)
    ;; ref.as_data saved here
    (ref.cast $BoxedInt)
    (struct.get $BoxedInt 1)
    (br $done)
  end)
  (i31.get_s)
  (i64.extend_i32_s)
end)

With i31, with union type, from an int?:

(block $done
  (block $is_i31
    (br_on_i31 $is_i31)
    ;; ref.as_data and ref.cast saved here
    (struct.get $BoxedInt 1)
    (br $done)
  end)
  (i31.get_s)
  (i64.extend_i32_s)
end)

Method call on `int` supertype:

Without i31:

(struct.get $Top 0)
(i32.const selectorOffset)
(i32.add)
(call_indirect)

With i31, without union type:

(block $done
  (block $is_i31
    (br_on_i31 $is_i31)
    (ref.as_data)
    (ref.cast $Top)
    (struct.get $Top 0)
    (i32.const selectorOffset)
    (i32.add)
    (call_indirect)
    (br $done)
  end)
  ;; Perform operation on i31
end)

With i31, with union type:

(block $done
  (block $is_i31
    (br_on_i31 $is_i31)
    ;; ref.as_data and ref.cast saved here
    (struct.get $Top 0)
    (i32.const selectorOffset)
    (i32.add)
    (call_indirect)
    (br $done)
  end)
  ;; Perform operation on i31
end)

With i31, without union type, known non-i31 receiver:

(ref.as_data)
(ref.cast $Top)
(struct.get $Top 0)
(i32.const selectorOffset)
(i32.add)
(call_indirect)

With i31, with union type, known non-i31 receiver:

(ref.as_non_i31)
(struct.get $Top 0)
(i32.const selectorOffset)
(i32.add)
(call_indirect)

Conclusion

With or without an i31/struct union type, using i31 for integers comes with a definite code size overhead. Whether using i31 gives better performance depends on how the saved allocations stacks up against the overhead from the extra instructions executed. Without the union type, the odds are stacked more heavily towards the overhead outweighing the savings from the allocations.

titzer commented 2 years ago

@rossberg The term "edge case" is inherently subjective so we might get stuck in another holding pattern debating it, and that gets long with technical details, so I'll have to be brief.

It also comes up when, ironically enough, implementing dynamic languages like JavaScript. There, you might want i31 (aka SMI) to be unioned with (ref $js_object). Distinguishing between SMIs and JavaScript objects (as opposed to other structs in the language runtime) is extremely frequent.

Hm, any JS implementation I have seen requires more than two cases – there are boxed numbers, strings, null, etc., none of which are JS objects. So JS rather is an example of a language that would not benefit.

Of course I mean implementation-level object; boxed numbers, strings, null, undefined, which are all on-heap constructs in V8, and all on-heap constructs have a supertype which has a map. There's a complex hierarchy underneath that supertype of all heap objects, as we've both slogged through over the years. But checking the map word is extraordinarily common.

But it only applies to the edge case where you have (a) exactly one non-nullary constructor, and (b) more than one nullary constructor (otherwise you could presumably use null). Not exactly the most common case. Why is optimising that case more pressing than any other?

Just because a constructor isn't nullary doesn't mean its arguments can't be packed into 31 bits, e.g. if they are small primitives. A good example is the ADT I use to represent operands in the backend of the Virgil compiler. I specifically chose a limit of, e.g. 1 million vregs so that the VREG number, plus a tag, plus a hint, fits into a small number of bits, so it can be packed. But other operand cases might have a reference to something, so they are big and boxed.

ADTs are the example of the source-level construct that benefits most from having i31 as a representation choice for the source compiler to pack them. Since they are available to user programs then I don't think it's warranted to think of any user program's choice as an edge case.

tlively commented 2 years ago

Thanks for the concrete examples, @askeksa-google! What would it take to do the benchmarking to figure out the relative costs you mention in your conclusion?

WebAssembly / gc