WebAssembly / gc

Branch of the spec repo scoped to discussion of GC integration in WebAssembly
https://webassembly.github.io/gc/
Other
982 stars 70 forks source link

Type matching with recursive types #502

Closed q82419 closed 8 months ago

q82419 commented 8 months ago

Hi, there's an assumption with needing your help.

Assume a module like this:

(module
  (type $st (struct))
  (rec
    (type $t1 (func (param i64 (ref $t2) structref)))
    (type $t2 (func (param i32 (ref $t1))))
  )
  (rec
    (type $u1 (func (param i64 (ref $u2) anyref)))
    (type $u2 (func (param i32 (ref $u1))))
  )
  (table funcref (elem $f1))
  (func $f1 (type $t1))
  (func (export "run")
    (call_indirect (type $u1) (i64.const 100) (struct.new_default $st) (i32.const 0))
  )
)

When invoking the "run" function, the call_indirect instruction will be executed. Then according to the description, the type $t1 (which is the function $f1's type) should match the type $$u1. As the matching of defined types, the type equivalence will be checked.

After rolling the types, it will be:

tie($t1) = (rec (func (param i64 (ref rec.1) structref)) (func (param i32 (ref rec.0)))).0
tie($u1) = (rec (func (param i64 (ref rec.1) anyref)) (func (param i32 (ref rec.0)))).0

As the matching rule, we can know the structref matches anyref. So the question is:

  1. When checking the equivalence of 2 defined types, does the $t1 equal to $u1?
  2. Does $t1 match $u1?

Thanks!

titzer commented 8 months ago

Matching of recursive types doesn't need to happen at runtime for call_indirect. Rather, call_indirect only needs to check the function's type according to the declared subtyping relation.

kripken commented 8 months ago
  1. That wat does not validate because (struct.new_default $st) is provided for a param of type (ref $u2) (and there is no subtyping between them; there can't be, as one is a struct and the other is a func).
  2. The structref != anyref difference between $t1 and $u1 is enough to keep the rec groups separate, so $t1 and $u1 do not match.

A simple way to play around with that is like this:

(module
  (rec
    (type $t1 (func (param i64 (ref $t2) structref)))
    (type $t2 (func (param i32 (ref $t1))))
  )
  (rec
    (type $u1 (func (param i64 (ref $u2) anyref)))
    (type $u2 (func (param i32 (ref $u1))))
  )

  (func (export "run")
    (local $x (ref $t1))
    (local $y (ref $u1))
  )
)

wasm-opt a.wat -all --print will print out two rec groups in that case, because they are distinct. But if you change the structref to anyref they will be merged and just one will be printed.

rossberg commented 8 months ago

What @titzer said: you need to declare $t1 to be a subtype of $u1 for it to be taken into account (unless they are actually equal, which they aren't here).

However, the attempt to declare $t1 to be a subtype of $u1 will be rejected as well, because structref <: anyref, but these types appear in contra-variant parameter position, so it would only work the other way round. But moreover, the recursive reference also occurs contra-variantly, so the subtyping for $t2/$u2 would have to go the opposite direction from $t1/$u1, which is only possible if you combine them all into one recursion. This would work:

  (rec
    (type $u1 (sub (func (param i64 (ref $u2) structref))))
    (type $t2 (sub (func (param i32 (ref $t1)))))
    (type $t1 (sub $u1 (func (param i64 (ref $t2) anyref))))
    (type $u2 (sub $t2 (func (param i32 (ref $u1)))))
  )
q82419 commented 8 months ago
  1. That wat does not validate because (struct.new_default $st) is provided for a param of type (ref $u2) (and there is no subtyping between them; there can't be, as one is a struct and the other is a func).
  2. The structref != anyref difference between $t1 and $u1 is enough to keep the rec groups separate, so $t1 and $u1 do not match.

A simple way to play around with that is like this:

(module
  (rec
    (type $t1 (func (param i64 (ref $t2) structref)))
    (type $t2 (func (param i32 (ref $t1))))
  )
  (rec
    (type $u1 (func (param i64 (ref $u2) anyref)))
    (type $u2 (func (param i32 (ref $u1))))
  )

  (func (export "run")
    (local $x (ref $t1))
    (local $y (ref $u1))
  )
)

wasm-opt a.wat -all --print will print out two rec groups in that case, because they are distinct. But if you change the structref to anyref they will be merged and just one will be printed.

Thanks for the information. Sorry that I made a mistake with ignoring the second ref $t2 argument. But your 2. answered my questions. That is, for closing the defined types of recursive types, the 2 recursive types will be equivalent iff the elements actually the same (but structref differ from anyref). Does it correct?

kripken commented 8 months ago

Comments might have raced here, but I believe @rossberg 's comment answers your last question @q82419 (i.e. they do not need to be exactly the same, if there is appropriate subtyping, which was not present in the original testcase).

q82419 commented 8 months ago

Another case is this module.

(module
  (rec
    (type $a1 (func (param (ref $a2))))
    (type $a2 (func (param (ref $a1))))
  )
  (rec
    (type $b1 (func (param (ref $b2))))
    (type $b2 (func (param (ref $b1))))
  )
  (rec
    (type $t1 (func (param i64 (ref $t2) (ref $a2))))
    (type $t2 (func (param i32 (ref $t1) (ref $a1))))
  )
  (rec
    (type $u1 (func (param i64 (ref $u2) (ref $b2))))
    (type $u2 (func (param i32 (ref $u1) (ref $b1))))
  )
  (func $f1 (type $t1)
    ;;  Please ignore the function body and assume that this is valid.
  )
  (global (ref $u1) (ref.func $f1))
)

When in validation, the result type $t1 should match the $u1.

tie($t1) = (rec (func (param i64 (ref rec.1) (ref $a2))) (func (param i32 (ref rec.0)))).0
tie($u1) = (rec (func (param i64 (ref rec.1) (ref $b2))) (func (param i32 (ref rec.0)))).0

Obviously the $a2 and $b2 will be equivalent. So $t1 will match the $u1 and this module will pass the validation? Thanks!

kripken commented 8 months ago

@titzer and @rossberg can explain the reasons much better than me, but as I mentioned before it is simple to get a quick answer to yes/no question like that. Here is your module with a small function that uses both types:

(module
  (rec
    (type $a1 (func (param (ref $a2))))
    (type $a2 (func (param (ref $a1))))
  )
  (rec
    (type $b1 (func (param (ref $b2))))
    (type $b2 (func (param (ref $b1))))
  )
  (rec
    (type $t1 (func (param i64 (ref $t2) (ref $a2))))
    (type $t2 (func (param i32 (ref $t1) (ref $a1))))
  )
  (rec
    (type $u1 (func (param i64 (ref $u2) (ref $b2))))
    (type $u2 (func (param i32 (ref $u1) (ref $b1))))
  )

  (func $test
    ;; Two locals to keep the types alive.
    (local $x (ref $t1))
    (local $y (ref $u1))
  )
)

Here is the output of running it through wasm-opt:

$ bin/wasm-opt a.wat -all --print
(module
 (rec
  (type $a1 (func (param (ref $a2))))
  (type $a2 (func (param (ref $a1))))
 )
 (rec
  (type $t1 (func (param i64 (ref $t2) (ref $a2))))
  (type $t2 (func (param i32 (ref $t1) (ref $a1))))
 )
 (type $4 (func))
 (func $test (type $4)
  (local $x (ref $t1))
  (local $y (ref $t1))
  (nop)
 )
)

As the output shows the engine merged $t1 and $u1 together ($u1 no longer appears in the output, and the local using that type now uses $t1). So the answer is yes, they match.

This is fairly complicated stuff, so using tools can be very useful. I use them all the time when I forget how something works.

q82419 commented 8 months ago

Thanks a lot for all information in this issue. I think these examples and the examples in the test suite would help me to understand the spec in detail, especially about the type matching. Besides the modified spec, I can find some examples in MVP doc and the wast file in test.

Hope the above information can help the others who have the similar questions as me. Thanks!