WebAssembly / reference-types

Proposal for adding basic reference types (anyref)
https://webassembly.github.io/reference-types/

Make funcref not a subtype of anyref #69

Closed RossTate closed 4 years ago

RossTate commented 4 years ago

(This idea came up after yesterday's discussion about the GC extension. I have tried to describe it here in a self-contained manner, but let me know if there are any terms I forgot to define or motivations I forgot to provide.)

Having funcref be a subtype of anyref forces the two to have the same register-level representation. Yet there are good reasons why an engine might want to represent a function reference differently than an arbitrary reference. For example, function references might always be an assembly-code pointer paired with a module-instance pointer, effectively representing the assembly code compiled from a wasm module closed over the global state of the specific instance the function reference was created from. If so, it might make sense for an engine to use a fat pointer for a function reference. But if funcref is a subtype of anyref, and if it overall makes sense for arbitrary references to be implemented with normal-width pointers, then that forces function references to be implemented with normal-width pointers as well, causing an otherwise-avoidable additional memory-indirection in every indirect function call.

Regardless of the reason, by making funcref not a subtype of anyref, we give engines the flexibility to represent these two types differently (including the option to represent them the same). Instead of subtyping, we could have a convert instruction that could take a function reference and convert it into an anyref representation, or more generally could convert between "convertible" types. The only substantial benefit of subtyping over conversion in a low-level type system is its behavior with respect to variance, such as co/contravariance of function types, but I see no such application for funcref and anyref. And in the worst case, we could always make funcref a subtype of anyref later if such a compelling need arises.
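For illustration, here is roughly what the conversion approach could look like in the text format. The instruction name any.convert_func below is purely hypothetical, chosen just for this sketch and not part of this or any other proposal:

(module
  (table $externals 1 anyref)
  (elem declare func $callback)
  (func $callback (param i32) (result i32)
    (local.get 0))
  (func $publish
    ;; without funcref <: anyref, storing a function reference into a table of
    ;; anyref requires an explicit conversion rather than an implicit upcast
    (table.set $externals
      (i32.const 0)
      (any.convert_func (ref.func $callback))))
)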

lukewagner commented 4 years ago

To add to this, one change in my understanding of the constraint space is that, in earlier design thinking, there was an expectation that anyref would serve as the universal representation of reference types in the presence of source language polymorphism that got erased when compiling down to wasm, and this necessarily required anyref to be the (non-coercive) supertype of all reference types. This design included the ability to downcast from anyref directly to any other type (given the appropriate type tag; all of this in the context of the GC proposal). However, more recent thinking wrt how best to support dynamic downcast suggests that anyref probably isn't the right universal representation type and that, instead, the universal representation type for a particular language/toolchain should be some ABI-defined wasm type that explicitly declared more details of what was downcastable and how. There are multiple ideas in flight here, which I expect will be hashed out over the coming months, but the common theme seems to be moving away from anyref as the universal representation.

Thus, in the short term, I think we should avoid unnecessarily committing to funcref <: anyref in the reference-types proposal, which is nearing stage 4 and shipping. Implementation-wise, this should be a fairly simple change to make, I believe.

The win from not unnecessarily requiring funcref <: anyref is that, as Ross explained, it buys us more implementation freedom. I actually think an AOT compilation model of type imports will require all references to be pointer-width (otherwise a VM would have to wait until instantiation time to know the number of bytes to shuffle around for a (ref $TypeImport)). However, what I think dropping funcref <: anyref would buy us is the ability to have different word-sized representations of funcref vs. anyref that didn't require them to share a single boxing format. (This assumes that the GC can either use arena or instance data to trace (ref $TypeImport) words in the generically-compiled module code.) In particular, some recent (and I think essential for a high-performance function-reference implementation) ABI changes we're planning for SM would be simpler and more efficient if funcref could have a pointer-sized, but completely different representation from anyref.

rossberg commented 4 years ago

@lukewagner, can you explain under which scenario you would be able to allow different tagging strategies for function references but not different sizes?

Because I see only two possibilities: either the universes of funcrefs and other refs are always statically distinguishable, in which case they can also have different sizes. Or there are cases where you cannot tell them apart statically, in which case you'll also need compatible tagging schemes.

So, AFAICS, to allow different representations you have to create two properly disjoint type hierarchies. That is certainly possible, but is a less trivial change than just removing the funcref <: anyref rule. You would need to make sure that there are no common super types, but also no common subtypes and values. One immediate implication, for example, is that you'd need to distinguish different null values and types, and thus modify or complement the ref.null instruction. That in turn would not just affect the reftype proposal but also the bulk memory proposal. There may be other such implications, we'd have to think it through carefully.

Another question is how this interacts with the host. You can pass Wasm functions to JS and JS functions to Wasm. Both would necessarily become a coercive operation. That works in the first-order case, but does not compose. E.g. once you want to allow Typed Values over functions in JS you'd have to distinguish e.g. arrays of JS functions from arrays of Wasm functions. That is, you would need to introduce two disjoint universes of function types for Typed Values. (Maybe you're already assuming this?)

The last time separating funcrefs from anyref was discussed (I think at the Lyon meeting) the consensus was that the added language complexity was not worth it for the time being, and that we can always add a second form of funcref type later if there is evidence that it is sufficiently useful.

Btw, regarding uniform representations, I think we have thrown up some vague ideas for possible alternatives, but AFAICT nobody has worked them out and they'll likely be beyond MVP level. It seems premature to assume that we won't need anyref. If we divorced funcrefs then additional boxing may become necessary in client code (which may not be a problem, however).

RossTate commented 4 years ago

Yeah, we discussed the issue with null too, concretely discussing how a universal nullref type would block engines out of using fat pointers where appropriate. I also mentioned here that we were finding, contrary to our original expectations, that nullref should be parameterized.

rossberg commented 4 years ago

I guess it boils down to the question: do we consider "fat function refs" to be an MVP feature, or can we add them later as a separate type? If the former, I agree that we should decide quickly, so that we can move forward at the meeting.

RossTate commented 4 years ago

Well, there's another question: what are the pressing use cases of having funcref be a subtype of anyref in the MVP? (Use cases that aren't addressed by having funcref be convertible to anyref.)

rossberg commented 4 years ago

@RossTate, I don't know about pressing, but I can think of a number of softer reasons: it makes for a somewhat simpler language, gives more seamless host interop (at least on the web and in the C API), avoids redundant annotations in element segments (for which we may want to come up with a solution otherwise), and of course, avoids late phase churn for a number of implementations.

RossTate commented 4 years ago

I disagree on the simpler language argument. People argued for F-bounded polymorphism on the same basis that having everything be done through subtyping would be simpler. Instead they got undecidability and unnecessary complexity that crippled tooling.

As for late-phase churn, this should only require a change in the type-checker, and I imagine removing a subtyping rule is fairly straightforward.

Can you go into more depth on one of the other reasons?

rossberg commented 4 years ago

@RossTate, it requires changes/extensions to the instruction set, see ref.null, and the type language, see nullref. It also requires observable changes to current APIs, where some of these are reflected. For the C API, for example, the changes would go deeper, because that currently models a simple uniform ref space, and externval's are just refs. Removing that possibility (as opposed to just complementing it) requires a redesign of that part, replacing it with something more complicated.

Can all be done, no question. But clearly not free or cheap. I'd be interested in hearing from @lukewagner how important he considers having this for the first version, and whether it should fully replace or merely amend the current design.

(I don't follow which similarity to F-bounded quantification you see.)

RossTate commented 4 years ago

F-bounded polymorphism is essentially a way of encoding a limited form of type classes into subtyping. In particular, X <: Comparable<X> ensures via subtyping that values of X have a compareTo(X) method. This was considered "simple" because it repurposed the existing subtyping feature. However, because subtyping is an extremely critical feature that interacts heavily with other features of the language, this "simplicity" made much of type-checking, even subtyping itself, undecidable, and broke much of the tooling for languages that added this feature.

Of course, the actual use case this was meant to address, i.e. the binary-method problem, could have just as easily been addressed by introducing a new kind of type constraint. X satisfies Comparable<X> similarly ensures that X has all the methods of Comparable<X> but without making X a subtype of Comparable<X>. Although this design adds a new kind of constraint, it also obviates the need for recursive subtyping and recursive class/interface hierarchies, keeping the language as a whole much simpler.

This satisfies design is theoretically less expressive than the F-bounded design. That is because subtyping interacts with things like variance of functions and data structures. But it turns out there are not actually real use cases for that difference in functionality. So in practice, the two designs are equivalently expressive, with one being much simpler to implement even if its grammar is slightly more complex. The satisfies design also turns out to be more flexible in what kinds of implementations and extensions it permits.

So the analogy is that the subtyping funcref <: anyref seems to have been added because it was the way to make funcref convertible to anyref requiring the fewest changes to the grammar. But you have yet to give me a use case where the additional expressiveness this provides over just a conversion instruction actually enables more useful WebAssembly programs. So I worry that you are permanently saddling the type system with additional complexity and limiting flexibility in the implementation strategies without any real benefit (besides the fact that it will take a relatively small amount of effort to correct the design) due to a faulty sense of simplicity, much like what happened with F-bounded polymorphism.

rossberg commented 4 years ago

I'm aware of the risks of F-bounded quantification if taken too far, but come on, there is no similarity with the issue at hand. Its impact on the type system and its meta theory is negligible either way. If anything, establishing disjoint type hierarchies is a marginal complication to the meta theory, since you'd be under obligation to prove a little lemma that verifies that property.

I have given you reasons above, like fewer and simpler instructions, simpler and more uniform host APIs, etc. Nothing earth-shattering, but arguably an appropriate default for an MVP, unless there are crucial problems that the MVP has to address or it precludes later addition. Evidence for the former would be it being a roadblock for engines.

lukewagner commented 4 years ago

@rossberg (sorry for the slow reply) If a type is imported with no bounds:

(module
  (import "file" "handle" (type $Handle))
  (func $f (param (ref $Handle)) ...)
)

then $Handle can be either anyref or funcref and so if we want to compile $f AOT (and never recompile), we need all refs to have the same byte width because we don't yet know whether we have an anyref subtype or a funcref subtype. However, because there is no bound, there are no explicit operations in $f that need to dynamically distinguish an anyref from a funcref, so they don't need to share a boxing/tagging format. (There is still, however, the implicit operation of marking a (ref $Handle) value found on the stack during GC, but that can be a somewhat slower operation (which, e.g., determines $Handle using the frame's containing instance's import table, which will take 10s of cycles) compared to the core wasm GC instructions where every cycle counts.)

On a side note: if we wanted to allow different byte widths without recompilation, we could perhaps always require type imports to have a bound, and so you'd always know if you had an anyref subtype or a funcref subtype, and then they could have distinct byte-widths. I don't have a good sense of the pros/cons of this option yet, though.
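For illustration, a bounded type import might look something like the following; the (sub anyref) spelling of the bound is only a guess for the sake of the sketch, not syntax from the type-imports proposal:

(module
  ;; with an explicit bound, a compiler knows before instantiation which
  ;; representation family (and byte width) (ref $Handle) belongs to
  (import "file" "handle" (type $Handle (sub anyref)))
  (func $g (param (ref $Handle)))
)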

rossberg commented 4 years ago

@lukewagner, not sure that could work in general. Depending on how much code generation in a given engine has to know about GC, it probably will need to know what kind of tagging e.g. a local of that type uses, e.g., to generate suitable stack frame descriptors, or inlined bits of GC code, etc. It's safer to assume that a compiler always needs to know the representational interpretation of any value it has to handle.

But in any case, the type import proposal already defines what you suggest: all imports have a bound -- given subtyping, there's no advantage in having an unbounded form.

lukewagner commented 4 years ago

@rossberg Ah, in that case, then I agree that, if funcref is not a subtype of anyref, it would allow funcrefs to have a completely different representation (including byte width) than anyrefs.

binji commented 4 years ago

We discussed this in the Feb 2020 CG meeting. We didn't come to any conclusion, but there was some agreement that breaking the subtype relationship between anyref and funcref would be more conservative.

tlively commented 4 years ago

Given that multiple ecosystems (Emscripten, Rust, WASI) eagerly want anyref support but don't particularly need funcref right now, I would like to either resolve the subtype question quickly or revisit @fitzgen's idea of splitting the proposal so that anyref can move forward alone without being held back by this issue. I understand that the semantics would be ugly in the interim, but it would be a shame to hold up these ecosystems for a temporary cosmetic issue.

Of course, there would be much less overhead if we could agree to a solution to the subtype question soon and not have to split the proposal. I lean heavily toward not having a subtype relationship between funcref and anyref right now since it is both the simpler conservative option and because I do not know of any immediate usecase for that subtype relationship. Having type-parameterized nullrefs sounds perfectly acceptable to me, since it will make the tooling simpler anyway. I am also not too worried about the size impact for table initialization since a sequence of identical parameterized nullrefs sounds like it will compress easily.
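For instance, a table initialized entirely with parameterized nulls is just the same item expression repeated, which should encode and compress compactly; the (ref.null func) spelling below is only one possible way such a null might be written:

(table 4 funcref)
(elem (i32.const 0) funcref
  (ref.null func) (ref.null func) (ref.null func) (ref.null func))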

RossTate commented 4 years ago

I believe I've now realized another problem with the "any" notion of anyref (as opposed to crossref or extref or what not). This last meeting was super helpful in catching me up on WebAssembly more broadly, and in particular @aheejin's great talk was a great introduction to the plan for exception handling. But I noticed something: exnref is heap-allocated memory that has to be collected. Now, due to the nature of exceptions, exnref cannot form cyclic data structures (except through higher already-cyclic constructs as noted in WebAssembly/exception-handling#85). As such, you can use reference counting to track exnref. Or at least you would like to...

Unfortunately, there seems to be a problem when exnref is a subtype of anyref. I'm guessing y'all aren't planning on reference-counting all anyrefs, given that they're expected to be able to reference components in cyclic data structures. So, if exnref is a subtype of anyref, that means you're not going to be counting all exnrefs. Even weakening exnref to be just convertible to anyref is problematic for similar reasons. And, yes, this is addressable by heavy-handed machinery (using a bit flag in the pointer to distinguish cyclic and reference-counted anyrefs, and/or using an ∞ count when an exnref is explicitly converted to an anyref), but given the lack of real use cases for exnref <: anyref all that machinery (and overhead?) seems not worth it. In general, it seems like reference types that could otherwise be guaranteed to be reference-counted should not be convertible to anyref.

tlively commented 4 years ago

So, if exnref is a subtype of anyref, that means you're not going to be counting all exnrefs

Can you dig into that a bit and explain why exnref being a subtype of anyref means that some exnref values can't be reference counted?

RossTate commented 4 years ago

Because then my code can hand off an exnref to some other code that takes an anyref, and that other code will have no idea that it needs to reference-count that anyref. So either exnref is collected through some means besides reference counting, or all code reference-counts all anyrefs (or at least always goes through the hassle of branching on a bit flag indicating whether the anyref is reference counted).

rossberg commented 4 years ago

@RossTate, the possibility of reference counting is a fallback for engines that do not have GC, e.g., ones that do not run within JS and neither intend to implement the GC proposal. An engine that already incorporates GC refs wouldn't be using reference counting for exnrefs.

RossTate commented 4 years ago

We could debate that detailed point, but I think the higher-level point stands regardless: there is utility in enabling engines/programs to keep internal and external references separate.

rossberg commented 4 years ago

I'm not sure what you have in mind. Exnrefs can be passed to the host like any other reference. In a JS environment there is not much choice but to manage all of them with proper GC.

tlively commented 4 years ago

I could imagine a non-JS engine that would want to allow users to disable its GC support (maybe to reduce the code size footprint and memory usage in an embedded context). Such an engine would want to keep as much functionality enabled without GC as possible, so it might very well choose to use reference counting for exnrefs, and for simplicity it might want to use reference counting for exnrefs even when GC is enabled. I don't think this is an outlandish scenario, and I have yet to see a use case for the subtype relationship between exnref and anyref, so I support decoupling these as well.

rossberg commented 4 years ago

@tlively, exnrefs can point to GC refs and vice versa. Once you have GC refs, exnrefs can indeed participate in cycles, unlike without. So no, a heterogeneous implementation would be very outlandish.

lukewagner commented 4 years ago

After some follow-up discussion with Lars and others, I'd like to drop my earlier objection and say that I'm fine with the proposal as-is.

In my first comment, I observed that, if anyref ends up not supporting direct downcast to all subtypes, anyref wouldn't be the "universal representation" for polymorphic languages and this drops the constraint of having funcref be a subtype of anyref. However, type imports also want a single supertype (so a module can just import any T <: any, without excluding or requiring boxing of function references). Thus, even if we made funcref not be a subtype of anyref in this proposal, I think we'd add the subtyping relationship soon after and the churn created by modifying this proposal (messing with ref.null etc) at this late stage, when we're otherwise ready to ship, would have been unnecessary.

(I do think it would be valuable to have some value type that can simply hold an unboxed raw code pointer (without any fancy thunky optimizations), but, independent of whether funcref is a subtype of anyref, funcref is not a great candidate to be this raw-code-pointer type due to it already being, in the MVP, a closure over an instance and doubly so with func.bind.)

binji commented 4 years ago

OK, let's try to move this forward. It sounds like at least for Luke, Lars and others there's less incentive to change the proposal. Let's discuss this at the next CG meeting.

RossTate commented 4 years ago

In preparation for the meeting, here is a summary of my arguments for removing subtyping:

  1. After 3 months, we still have no compelling or thorough example of how this subtyping would be useful.
  2. According to https://github.com/WebAssembly/gc/issues/85#issuecomment-604451780, the primary purpose of anyref is to avoid making assumptions about what a specific reference type is in situations where you can't or don't need to know. Having funcref be a subtype of anyref bakes in an assumption about that reference type, contradicting anyref's primary purpose of representing external references. This means that WASI cannot assume all anyrefs are capabilities, and browsers cannot assume all anyrefs are JavaScript/DOM values, even when those are the only external reference values, despite those being the primary use cases for anyref.
  3. Subtyping is notorious for significantly and unpredictably complicating every aspect of language design and implementation. The proposal has made a bunch of permanent design choices with regards to subtyping in general, but without any actual programs using subtyping, we have no way of knowing if these were the right or wrong design choices. In fact, having myself actually worked out in-depth how various languages might take advantage of subtyping in WebAssembly, I found that the wrong design choice was made in critical places.
  4. The only known low-level language technology that is able to eliminate all superfluous run-time casts and bounds checks is unfortunately also known to be incompatible with the specific choices of subtyping rules proposed here. So this subtyping might forever preclude WebAssembly from doing something it actually has a compelling motivation for. (For example, the only known practical tech for supporting the more efficient compilation in https://github.com/WebAssembly/multi-value/issues/42#issue-584523216 is incompatible with the proposed subtyping.) For a sense of scale, we found that superfluous casts due to dynamic dispatch alone (so not counting casts due to weak support for arrays) would add over 100 million casts in each execution of the DaCapo benchmark suite, averaging 4 casts every microsecond (assuming casts had no overhead). At this point it is hard to anticipate what the performance overhead of these added casts would be.
  5. The proposed subtyping needlessly restricts the implementation of WebAssembly constructs in ways that we have yet to measure the impact of.
  6. No comparable technology has anything comparable to the proposed subtyping rules, including even technologies that had the same "unifying" goal the proposed subtyping rules are meant to help achieve.

Lastly, regarding pragmatics, this change is easy to implement: comment out the implementation of is_subtype and replace it with a call to is_equal. Also, we can always add subtyping later if we ever discover it to be useful (in which case you can simply uncomment your code if you've already implemented this feature). There will be work for the spec, but that is something I can take on rather than burden y'all with.

RossTate commented 4 years ago

I think, fundamentally, the argument for the proposed subtyping is based on a vision for anyref that has been neither fleshed out nor evaluated to a sufficient extent to warrant committing WebAssembly to it, and that vision is not relevant to the reason people want this proposal, which is just external references.

binji commented 4 years ago

Lastly, regarding pragmatics, this change is easy to implement: comment out the implementation of is_subtype and replace it with a call to is_equal.

I think a lot of us are stuck on this point currently. In the meeting today @rossberg mentioned how much work this would require to the spec, and mentioned that the work required for downstream proposals is unclear. I think it would be valuable for us to try our best to evaluate what kind of changes would be required for proposals that depend on this change: at least exception-handling, bulk-memory, function-references, type-imports, interface-types, wasm-c-api, and WASI AFAIK.

dtig commented 4 years ago

Adding a note here for an explicit notification, following up from the meeting today as well as https://github.com/WebAssembly/meetings/issues/529, please fill out this form to discuss this in a meeting.

RossTate commented 4 years ago

Sounds like a good plan. I'm happy to help with working out the down-the-line impact removing the proposed subtyping would have on proposals. It might be worth having leaders of potentially-affected proposals initiate an issue in their respective repositories to discuss the topic, starting with a summary of the major concerns.

tlively commented 4 years ago

A small question about details: If we have type-parameterized nulls, is there any reason to have separate type-parameterized null types or can null types be removed completely?

The discussion in this thread assumes that there would be type-parameterized nullref types, but after discussion with @aheejin I don't see why they would be necessary.

aheejin commented 4 years ago

Oh no what I meant was we need type-parameterized null values (+ along with instructions to make them), not type-parameterized null types, if we remove subtypes.

tlively commented 4 years ago

Right, I agree that we only need the values. But so far this discussion seems to have assumed we would have both the values and the types, so I am wondering if we are missing something that makes the types necessary as well.

"you'd need to distinguish different null values and types"

"I also mentioned here that we were finding, contrary to our original expectations, that nullref should be parameterized"

RossTate commented 4 years ago

Good question(s). For this proposal, there is no need for a nullref type. funcref.null could simply be a funcref. If you do subtypes right, we can always refine the type of funcref.null to be a nullref type later.

Arguably, the null value instructions aren't necessary either. I imagine the null run-time value(s) (for funcref and anyref) are really just necessary because the design of wasm assumes everything has a default value. Given that upcoming proposals will be violating that assumption, even with importable reference types (which anyref-as-external-references essentially is), at some point y'all will need to address that.

But putting that thought aside, since there's no way to convert funcrefs to anyrefs, there's no need for the null run-time value(s) to be tagged with their respective type. That is, even if you have separate instructions funcref.null and anyref.null, these could both just return the 0 address. Or they could return different values. No well-typed program would be able to tell the difference. (The JS API, on the other hand, might specify that the values of these instructions must both convert to JS's null upon crossing the JS-wasm boundary, but that's an external convention rather than a wasm specification and still doesn't require them to be represented the same inside wasm.)
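For concreteness, here is a sketch of that, using the funcref.null / anyref.null spellings from this thread rather than any finalized syntax:

(func $init
  (local $f funcref)
  (local $a anyref)
  ;; each instruction produces the null of its own type; an engine is free to
  ;; represent both as the 0 address, but no well-typed program can observe
  ;; whether the two representations coincide
  (local.set $f (funcref.null))
  (local.set $a (anyref.null)))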

(The long answer is to lay out the possibilities for y'all without pushing for a specific one.)

aheejin commented 4 years ago

@RossTate

But putting that thought aside, since there's no way to convert funcrefs to anyrefs, there's no need for the null run-time value(s) to be tagged with their respective type. That is, even if you have separate instructions funcref.null and anyref.null, these could both just return the 0 address. Or they could return different values. No well-typed program would be able to tell the difference.

I think we should have parameterized nulls, because we can't allow this in validation:

(func $foo (local $a exnref)
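  ;; must be rejected by validation: funcref.null produces a funcref, not an exnref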
  (local.set $a (funcref.null))
)

tlively commented 4 years ago

@aheejin that type checking would imply that all the nullrefs inhabit both exnref and funcref, which would get us back to having a nullref subtype with a uniform representation. That's explicitly what we aren't requiring. @RossTate is saying that as an implementation detail engines could choose to represent both kinds of nulls as the 0 address, but that wouldn't make them interchangeable in the type system.

tlively commented 4 years ago

@RossTate It makes sense that we could make the nulls separate types in the future if necessary, but are you aware of any future features that would make it necessary?

RossTate commented 4 years ago

I'm not sure there will ever be a need for a nullref type corresponding to specifically exnref or funcref since it is unlikely those will ever be used to represent surface-level types. I was going to say that we will need it to represent Scala's near-bottom type Null, but I just realized there's a better way to do that. (Need to update our own GC proposal!) So right now I don't know of a future feature that would make it necessary.

lars-t-hansen commented 4 years ago

Lastly, regarding pragmatics, this change is easy to implement: comment out the implementation of is_subtype and replace it with a call to is_equal.

I think a lot of us are stuck on this point currently. In the meeting today @rossberg mentioned how much work this would require to the spec, and mentioned that the work required for downstream proposals is unclear. I think it would be valuable for us to try our best to evaluate what kind of changes would be required for proposals that depend on this change: at least exception-handling, bulk-memory, function-references, type-imports, interface-types, wasm-c-api, and WASI AFAIK.

Since this came up on the call: this is a somewhat substantial change to implementations too, both code and test cases. Without a spec to work from we have to guess precisely what the changes would be, but something like this (spidermonkey is our engine and wat is the text->binary translator):

  • remove "nullref" as a type -- spidermonkey, wat, some test cases
  • make ref.null take a type and generate a value of that type -- spidermonkey, wat, lots of test cases
  • remove the subtyping relationship between funcref and anyref -- spidermonkey, lots of test cases.
  • remove the subtyping relationship between ref and anyref / create a new base reftype for our GC prototype. This has unclear complexity because we want to keep subtyping for our GC prototype and because there may be work involved at the JS/wasm boundary. It's possible the new base type is just (ref (struct)) so it may not be terrible, but there will be churn here both in code and test cases.

I don't want this to influence the decision per se (it's just engineering) but I do want to shoot down the argument that this is a one-line change in production engines.

aheejin commented 4 years ago

Since this came up on the call: this is a somewhat substantial change to implementations too, both code and test cases. Without a spec to work from we have to guess precisely what the changes would be, but something like this (spidermonkey is our engine and wat is the text->binary translator):

  • remove "nullref" as a type -- spidermonkey, wat, some test cases
  • make ref.null take a type and generate a value of that type -- spidermonkey, wat, lots of test cases
  • remove the subtyping relationship between funcref and anyref -- spidermonkey, lots of test cases.
  • remove the subtyping relationship between ref and anyref / create a new base reftype for our GC prototype. This has unclear complexity because we want to keep subtyping for our GC prototype and because there may be work involved at the JS/wasm boundary. It's possible the new base type is just (ref (struct)) so it may not be terrible, but there will be churn here both in code and test cases.

I don't want this to influence the decision per se (it's just engineering) but I do want to shoot down the argument that this is a one-line change in production engines.

This all applies to the toolchain side as well, including Binaryen and Wabt. I also don't want this to influence the decision per se, but they are not one-liners.

RossTate commented 4 years ago

Your points are heard, and I really appreciate y'all taking the time the last two days to crunch and gauge how much of an impact the change would have on y'all's codebases!

tlively commented 4 years ago

@RossTate your presentation today reminded me of our discussion on the train back from the face-to-face meeting earlier this year, but I have forgotten the details of the call_indirect issue you were talking about then. Was there a fundamental problem with proper variant subtyping of function references that makes it incompatible with funcref <: anyref subtyping? More generally, are there any known problems that funcref <: anyref will cause or are we only concerned with unknown unknowns? You've mentioned you did work on removing runtime casts that was incompatible with subtyping. Was that incompatible with all subtyping or just funcref <: anyref subtyping? Can you share a link to that paper?

RossTate commented 4 years ago

On the train we talked about how the current plan (of sorts?) for call_indirect can be used by a malicious module to convert an abstract imported type to its concrete type and vice versa, making it possible to, say, bypass restrictions on WASI capabilities and to forge WASI capabilities. Today I talked about essentially the same problem but applied to subtyping. Fundamentally the problem is that a caller and callee can in general have different perceptions of the types of the values involved in a function call (and of the function itself). So far y'all have avoided this problem by having a very limited type system (at least at the register/stack level), ensuring that values have a unique type and consequently avoiding differences in perception between the caller and the callee. But adding subtyping (or type im/exports) changes that picture.

Amongst experts in subtyping, it's generally understood that, for usability, analysis, and optimization purposes, you want to design your language so that differences in perception do not affect the behavior of the program. For example, suppose a program type-checks with both call_indirect $t1 and call_indirect $t2, and suppose both $t1 and $t2 are subtypes of $tf, the type of the function looked up. This means that everything is sound in both cases; however, the two variants of the program behave differently in the current semantics.
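To make the scenario concrete, here is a hedged sketch. It assumes funcref <: anyref (as proposed here) plus hypothetical contravariant subtyping between function types, which no current proposal actually includes:

(module
  (type $tf (func (param funcref)))  ;; the declared type of the table entry
  (type $t1 (func (param funcref)))  ;; structurally identical to $tf
  (type $t2 (func (param anyref)))   ;; $t2 <: $tf only under the hypothetical function subtyping
  (table 1 funcref)
  (elem (i32.const 0) $callee)
  (func $callee (param funcref))
  (func $caller (param $f funcref)
    ;; both annotations type-check, since a funcref argument satisfies both parameter types
    (call_indirect (type $t1) (local.get $f) (i32.const 0))   ;; succeeds: $t1 matches the entry's type exactly
    (call_indirect (type $t2) (local.get $f) (i32.const 0)))  ;; traps: $t2 is not identical to $tf
)

Both calls are sound, yet under the exact-match semantics of call_indirect only the first succeeds, which is the difference in behavior described above.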

Looking down the road at typed function references, a consequence of this is that, if you give me a ref $t for some function type $t, and I upcast it to funcref and then (conceptually) pass that to call_indirect $t, the indirect call can trap even though the direct call would work. In fact, it will trap if someone actually used subtyping between function types to get you that ref $t. Given that the upcast from ref $t to funcref is the plan for how to share functions between code compiled from safe languages and code compiled from unsafe languages, this mismatch will almost certainly cause many bugs and confusion and headaches, and it's not even clear there will be a good way to fix these bugs.

On the flip side, fixing the semantics of call_indirect to address the above issue will result in horrible performance because it has to do a run-time comparison of a list of structural types. So the unfortunate fact is that call_indirect was not designed to be efficiently compatible with subtyping. As I mentioned on the train, I realized there's an existing technique in other systems that fixes this problem (and the related problem with type im/exports), but that fix is not compatible with the proposed subtyping semantics and I don't want to hold up external references even further. So that's why my recommendation is just to get rid of subtyping altogether for now, so that y'all can have external references very soon, and then we'll develop a cohesive strategy for subtyping (including call_indirect) afterwards.

As for the paper, it can be found here. It talks about how we retrofitted a preexisting 200,000 line optimizing C# compiler to produce typed x86, with changes to only 500 lines of code, a number we later reduced to just 100. The previous endeavor on the same compiler required changing 19,000 lines of code. (There has been no other successful attempt to retrofit a preexisting compiler to produce typed assembly. A high-level lesson here is that it's actually harder to make compilers produce typed low-level code than it is to type-check typed low-level code, and y'all are hoping to have a lot of compilers produce (typed) WebAssembly.) The crux of the contribution is that we figured out how to redesign the type system for x86 to be inferable from just metainformation (e.g. memory layouts, function signatures, inheritance hierarchy), so that we could just add a "fill in missing types" pass after the compiler spit out its assembly and meta-information. In other words, we were able to automatically figure out that real assembly code for C#, a rather advanced memory-safe language, was completely memory safe without having to change the compiler to conform to our own conventions (though we did turn off 3 out of 49 optimizations). Unfortunately, we also talk about (in Section 3.8) how we were surprised to discover that null being a subtype of any nullable reference type is fundamentally incompatible with type inference. I'm happy to explain more, since it's not exactly a light paper :)

RossTate commented 4 years ago

(Apologies for the repeated comments. GitHub was down and told me I needed to retry posting. Apparently that was a lie.)

On a separate note, I mentioned in today's meeting that the languages that make heavy use of subtyping rely on nominal subtyping to be able to efficiently type-check large programs (which is something y'all need WebAssembly to be able to do). To give a sense of the scale of difference this makes, here is a portion of the structural type for java.util.ArrayList, a heavily used class:

$t1 = ref (func (param $t32) (result i32))
$t2 = ref (func (param $t6 i32) (result))
$t3 = ref (struct opt$t3 $t41 funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref $t10 $t11 $t35 $t8 $t1 $t35 $t35 $t39 $t35 $t20 $t13)
$t4 = ref (func (param $t12 $t28) (result i32))
$t5 = ref (func (param $t12) (result))
$t6 = ref (struct i32 $t23 (mut i32) (mut $t24) (mut i32))
$t7 = optref (struct i32 $t30 (mut i32))
$t8 = ref (func (param $t32) (result $t42))
$t9 = ref (func (param $t12 $t28) (result $t28))
$t10 = ref (func (param $t32) (result $t28))
$t11 = ref (func (param $t32 $t28) (result i32))
$t12 = ref (struct i32 $t14)
$t13 = ref (func (param $t32 i64 i32) (result))
$t14 = ref (struct $t3 $t41 funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref $t10 $t11 $t35 $t8 $t1 $t35 $t35 $t39 $t35 $t20 $t13 $t4 $t4 $t5 $t4 $t4 $t29 $t16 $t4 $t4 $t4 $t31 $t17 $t9 $t16 $t4 $t16 $t16)
$t15 = ref (func (param $t7 $t28) (result i32))
$t16 = ref (func (param $t12) (result $t28))
$t17 = ref (func (param $t12 $t24) (result $t24))
$t18 = ref (func (param $t6) (result))
$t19 = ref (func (param $t7 i32 i32) (result $t28))
$t20 = ref (func (param $t32 i64) (result))
$t21 = ref (func (param $t7 i32 $t28) (result $t28))
$t22 = ref (func (param $t7) (result $t28))
$t23 = ref (struct $t30 $t41 funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref $t10 $t11 $t35 $t8 $t1 $t35 $t35 $t39 $t35 $t20 $t13 $t4 $t4 $t5 $t4 $t4 $t29 $t16 $t4 $t4 $t4 $t31 $t17 $t9 $t16 $t4 $t16 $t16 $t41 $t37 $t15 $t15 $t22 $t37 $t37 $t26 $t21 $t19 $t15 $t38 $t25 $t38 $t36 $t2 $t18)
$t24 = optref (struct i32 $t3 $t41 $t27)
$t25 = ref (func (param $t7) (result i32))
$t26 = ref (func (param $t7 i32 i32) (result))
$t27 = array (mut $t28)
$t28 = optref (struct i32 $t3)
$t29 = ref (func (param $t12) (result i32))
$t30 = ref (struct $t30 $t41 funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref funcref $t10 $t11 $t35 $t8 $t1 $t35 $t35 $t39 $t35 $t20 $t13 $t4 $t4 $t5 $t4 $t4 $t29 $t16 $t4 $t4 $t4 $t31 $t17 $t9 $t16 $t4 $t16 $t16 $t41 $t37 $t15 $t15 $t22 $t37 $t37 $t26 $t21 $t19 $t15 $t38 $t25 $t38 $t36)
$t31 = ref (func (param $t12) (result $t24))
$t32 = ref (struct i32 $t3)
$t33 = optref (struct opt$t3 $t41 $t10 $t11 $t35 $t8 $t1 $t35 $t35 $t39 $t35 $t20 $t13)
$t34 = array (packed i16 i16)
$t35 = ref (func (param $t32) (result))
$t36 = ref (func (param $t7) (result $t28))
$t37 = ref (func (param $t7 i32) (result $t28))
$t38 = ref (func (param $t7 $t28) (result))
$t39 = ref (func (param $t32) (result $t40))
$t40 = optref (struct i32 $t3 i32 i32 $t34)
$t41 = ref (func (param $t7 i32 $t28) (result))

The structural type of ArrayList is $t6. Every entry in the above type table is reachable from $t6. Multiple of these types are supertypes of $t6, and that fact will be used every time someone invokes a method on a variable of type ArrayList in the source code. I've already done you the favor of minimizing the type, and for your sanity I did not include the definition of the java.lang.Class, which is part of the structural type of ArrayList because getClass returns a java.lang.Class, because the v-table stores a reference to a java.lang.Class, and because each array stores a reference to the java.lang.Class of its element type. We are also fortunate that Java interface types lower down to the same structural type as java.lang.Object, making this type much smaller than it could have been. But how many classes are there in the Java SDK alone?

So consider how many subtyping checks are made to type-check a Java (pre-generics) program, and how many more you will have to do because you also have to do a compile-time check every time someone passes the "this" pointer to a function pointer from the v-table. Then consider how many of these checks are done much more quickly in Java (say, because java.lang.Object is the top type) than you could do them here (because java.lang.Object does not translate to anyref), and how long it already takes to type-check a Java program even with those specialized subtyping optimizations that you cannot employ. There is no comparable existing system we can look at to evaluate whether or not this subtyping strategy has a reasonable chance of scaling, and there are no large programs that have been compiled to use this strategy that could be used to measure its overhead. If it turns out to be unrealistic, and we throw out this subtyping strategy, what do we do with the subtyping features that were added solely to support the roadmap we just threw out?

So far the discussion has been predicated on the assumption that this roadmap will work out, but that's a big assumption to make given the lack of evaluation or useful comparison points, and if it turns out to be false then not only do we have a ton of churn to do but we also have a bunch of useless backwards-compatibility constraints to deal with.

tlively commented 4 years ago

Thanks, there's a lot to digest and think about there. I just want to make sure we're on the same page with the examples from the first comment:

For example, suppose a program type-checks with both call_indirect $t1 and call_indirect $t2, and suppose both $t1 and $t2 are subtypes of $tf, the type of the function looked up.

In the current semantics and the semantics currently proposed in the typed function references proposal, function types and function reference types have no subtype relationship with one another, so in your example $t1 = $t2 = $tf and everything works out fine. So I don't think it's possible that "the two variants of the program behave differently in the current semantics."

Similarly:

if you give me a ref $t for some function type $t, and I upcast it to funcref and then (conceptually) pass that to call_indirect $t, the indirect call can trap even though the direct call would work. In fact, it will trap if someone actually used subtyping between function types to get you that ref $t.

Again, no current proposal allows subtyping between function types, so the underlying type of the function here must be exactly $t and there is no problem.

I believe what @rossberg was saying in the meeting this morning was that if (when) we extend the typing rules to allow subtype relationships between function types, we would also relax the call_indirect rules to allow calls to succeed when the dynamic type of the called function is a subtype of the type annotation of the call_indirect. I believe this is what you want to make both of your scenarios work as desired.

Another way to look at this is that call_indirect is already proposed to accept subtypes of the specified function type, as you would expect, but since there are no subtype relationships between function types, the specified and dynamic types must trivially be identical.

RossTate commented 4 years ago

so in your example $t1 = $t2 = $tf and everything works out fine.

Suppose $t1 = [$ti1*] -> [$to*] and $t2 = [$ti2*] -> [$to*] and that each $ti1 is a subtype of the corresponding $ti2. Then due to the combination of the "Empty Instruction Sequence" and "Non-empty Instruction Sequence" rules, if call_indirect $t1 type-checks at some place in the program then call_indirect $t2 also type-checks at that same place. So in my example these types are not necessarily equal.

in the typed function references proposal, function types and function reference types have no subtype relationship with one another

It's a little hard to say anything definite about the typed function references proposal given that the provided type grammar cannot express the types used in the examples or even in basic rules like ref.func; I believe the constype grammar is supposed to let func have an optional (param t*) (result t*) argument. I can't say how subtyping is meant to be extended to the true grammar, but if it is the case that function reference types have no variance with respect to subtyping, then it is unlikely that they can express many languages with subtyping. Both the JVM and .NET permit variance between methods, and many JVM/.NET languages rely on this, so it's unclear that they would be expressible in WebAssembly without that feature. (Fluent APIs, for example, often use return-type narrowing so that method chains have more precise return types when called on more precise object types.)

if (when) we extend the typing rules to allow subtype relationships between function types, we would also relax the call_indirect rules to allow calls to succeed when the dynamic type of the called function is a subtype of the type annotation of the call_indirect. I believe this is what you want to make both of your scenarios work as desired.

That is what I would want if everyone's implementation of call_indirect is reliant upon caller-callee types lining up exactly and cons-hashed so that a simple pointer-equality check is enough to verify the check at run time. From what I understand, that was already the justification for the current semantics.

Another way to look at this is that call_indirect is already proposed to accept subtypes of the specified function type, as you would expect, but since there are no subtype relationships between function types, the specified and dynamic types must trivially be identical.

The only good reason to add subtyping to a low-level language is variance. Everything besides variance can be addressed by a coerce $t instruction. And if you do that, committing to coerce instead of subtyping, you can get rid of type annotations on nearly every instruction (including blocks and loops) and actually be able to type-check wasm programs faster.

So this proposal adds enough subtyping to introduce the problems of subtyping, doesn't add enough subtyping to reap the real benefits of subtyping, and punts all the hard (unsolvable?) problems down the road—despite having had 3 years to figure them out—to the point that it's very hard to evaluate that roadmap because no one really knows what it is.

tlively commented 4 years ago

I see, your examples make sense to me now. Having started reading the iTalX paper, I also see that this is a problem if you are trying to do the deferred type inference approach used in that project. I will have to think more about how we could fit deferred type inference into the WebAssembly LLVM backend and how much functionality it could buy us.

It's a little hard to say anything definite about the typed function references proposal given that the provided type grammar cannot express the types used in the examples or in even basic rules like func.ref; I believe the constype grammar is supposed to let func have an optional (param t*) (result t*) argument.

FWIW, the <typeidx> variant of constype can refer to function signatures as they exist today in the type section, so that's the mechanism for specifying the params and results.

rossberg commented 4 years ago

Too much back and forth in this thread to answer everything individually, but I'll try to clarify a few high-level points (most of which is just reiterating what I said in the meeting yesterday, also see slides for some additional details).

The addition of subtyping and anyref is part of a larger design that we ended up splitting into several proposals to be able to ship incrementally, like function references and type imports. Basic subtyping ended up in this proposal mostly for technical reasons (cleanest way for slicing the semantics into increments), so it's admittedly a bit less obvious why we have it. But downstream use cases like type imports or typing of reference equality, as currently proposed, fundamentally rely on both.

As others have pointed out as well, the change proposed here is not small, because it has a number of design implications on this proposal, as well as various others.

Across all affected implementations, tools, and downstream proposals, I'd estimate it is in the order of several man weeks to man months of work. That is a high cost for delaying a decision that we will have to resolve in a couple of months anyway.

And we would be introducing some permanent design warts on the way. For example, the need for type-indexed null values (which are awkward from a semantic perspective, e.g., because (null T) and (null U) are seemingly different values, but if T <: U they must be treated as if they were the same).

For something large and in practical use, there is no alternative to an incremental design approach. So far, Wasm has been successful with that. That means that, even when trying to stay conservative as much as possible, we have to commit to certain design choices at some point. Of course, that always involves some amount of uncertainty and the risk of introducing design mistakes at some step. We had a few ones in the past, but so far have been able to work around them successfully.

It's clear that we'll need subtyping. If not now, then 1-2 months down the road, with function references (which are about ready to be advanced to stage 2 or 3). It seems very unlikely that we are dropping subtyping altogether, given that no concrete alternative has been proposed for all its uses, and there is plenty of pressure to move forward.

The concrete subtyping rules in this proposal are very standard and an absolute bare minimum; they are even orthogonal to structural vs nominal. I haven't seen any convincing evidence that they impose a concrete risk, given the explicit nature of typing in Wasm. (In particular, any form of inference, if desired, will be confined to a tooling stage. That means that it is operating on a different language anyway, with a different type system. It can just as well choose to restrict it or add other annotations in any way it sees fit.)

It's been a basic assumption from the start that heap references support a unified representation in a Wasm engine. That's a very reasonable assumption, too, given that all existing Wasm engines as well as the vast majority of other GC runtimes employ it -- off-hand, I wouldn't know any prominent counter example.

Such an assumption simplifies various design aspects as well as interfacing with an engine (e.g., through the C API). It is essential if you want to support compilation against imported reference types without making arbitrary design decisions about which reference types are allowed.

This does not preclude the later addition of specialised flat/raw/unboxed reference types! But even in their presence, you'd still want the uniform variant. Hence, for the MVP, starting with just that is the obvious choice.

As one instance of staying conservative, we do not extend subtyping to functions yet. That leaves headroom for different choices downstream. If we find that we can't efficiently implement dynamic type checks modulo subtyping (which is not unlikely) then we impose suitable static restrictions.

One known approach would be to distinguish between subtypable and exact function types (an approach that @RossTate is aware of, since it's used in his own paper). Either way, we obviously need to ensure that the semantics is coherent, i.e., that any subtyping rules are the same at validation, link and run time.

tlively commented 4 years ago

Thanks for laying that all out so clearly, @rossberg!

Basic subtyping ended up in this proposal mostly for technical reasons... But downstream use cases like type imports or typing of reference equality, as currently proposed, fundamentally rely on both.

As far as I can see, type imports only depends on funcref <: anyref insofar as it is desirable to be able to have effectively unbounded imports. I can see that this would be elegant, but it's not clear to me that it is useful. I don't want to litigate this trade off here, but please let me know if I am missing some subtlety that makes funcref <: anyref more important. Also, can you point me to discussion or specification of the proposed reference equality? I haven't seen anything on that.

It's been a basic assumption from the start that heap references support a unified representation in a Wasm engine. That's a very reasonable assumption, too, given that all existing Wasm engines as well as the vast majority of other GC runtimes employ it.

The argument about existing Wasm engines biases heavily toward web engines, which might make different choices about their representations because they already support JavaScript objects. I could easily see a non-web engine wanting to support anyref and funcref but not wanting to support the full GC proposal. If that's a situation we want to encourage, we may not want to limit ourselves to precedents in GC runtimes.

This does not preclude the later addition of specialised flat/raw/unboxed reference types!

True, but if they turn out to be sufficient for real use cases, it would be much better to have only the unboxed reference types than have to duplicate everything to have it both ways. And if real languages need both after all, it would be better to start with the one that is more immediately useful. We don't know of any immediate use cases for the subtyping relationship, so unless that changes we should start with the unboxed references with no subtyping relation.

But even in their presence, you'd still want the uniform variant!

I can see why for spec elegance, but can you point to a real language that uses this for which an explicit boxing operation would be much more difficult to use?

rossberg commented 4 years ago

@tlively:

Basic subtyping ended up in this proposal mostly for technical reasons... But downstream use cases like type imports or typing of reference equality, as currently proposed, fundamentally rely on both.

As far as I can see, type imports only depends on funcref <: anyref insofar as it is desirable to be able to have effectively unbounded imports. I can see that this would be elegant, but it's not clear to me that it is useful. I don't want to litigate this trade off here, but please let me know if I am missing some subtlety that makes funcref <: anyref more important.

The reason is avoiding bias. If we can help it, don't bake in arbitrary decisions about which types are reasonable as imports and which ones aren't. At the meeting, we already got into the discussion about exnref, and I expect the same to occur for others.

Ultimately deciding for each type separately would just be random design choices akin to premature optimisation.

FWIW, I can imagine use cases for abstracting function types, such as capabilities (we actually use functions as capabilities at Dfinity). Sure, you can always work around that with additional wrapping/converting, but that's an extra cost.

Also, can you point me to discussion or specification of the proposed reference equality? I haven't seen anything on that.

It was in this proposal originally. After we voted to defer it, it was moved to the Future Extensions section of the overview.

Currently a bit orphaned. Not sure whether we want to make it its own proposal eventually or incorporate it in one of the others. But clearly, support for reference equality will be needed at some point.

It's been a basic assumption from the start that heap references support a unified representation in a Wasm engine. That's a very reasonable assumption, too, given that all existing Wasm engines as well as the vast majority of other GC runtimes employ it.

The argument about existing Wasm engines biases heavily toward web engines, which might make different choices about their representations because they already support JavaScript objects. I could easily see a non-web engine wanting to support anyref and funcref but not wanting to support the full GC proposal. If that's a situation we want to encourage, we may not want to limit ourselves to precedents in GC runtimes.

Fair point, but even without full GC, engines implementing the full Wasm API will usually need some form of reference counting scheme for function references, as for others.

This does not preclude the later addition of specialised flat/raw/unboxed reference types!

True, but if they turn out to be sufficient for real use cases, it would be much better to have only the unboxed reference types than have to duplicate everything to have it both ways. And if real languages need both after all, it would be better to start with the one that is more immediately useful.

They both are equally useful. But the boxed one is somewhat more flexible. The only advantage of the unboxed one is a hypothetical optimisation in engines for which there isn't any precedent and which the current generation of implementations is very unlikely to actually implement. For an MVP, it makes sense to start with the simpler design, and avoid the warts, open questions, and other costs we have been discussing.

We don't know of any immediate use cases for the subtyping relationship, so unless that changes we should start with the unboxed references with no subtyping relation.

But even in their presence, you'd still want the uniform variant!

I can see why for spec elegance, but can you point to a real language that uses this for which an explicit boxing operation would be much more difficult to use?

An explicit boxing operation would have to be assumed to involve an allocation. So a compiler would want to avoid repeating it on the same object. Consequently, for any object that it needs in boxed form in some contexts, the compiler would want to keep it around in boxed form. But if there is no type that represents a boxed function specifically (as opposed to just an anyref), then it would in turn need a downcast check every time it wants to access it.

IOW, without a type representing the boxed form, your only choice when using that is between allocation overhead (on every injection) vs casting overhead (on every projection).
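A minimal sketch of those two costs, again with hypothetical conversion instructions (any.convert_func for the injection, func.convert_any for the checked projection), neither of which exists in this proposal:

(func $inject (param $f funcref) (result anyref)
  ;; injection: the explicit boxing step, which a compiler has to assume may allocate
  (any.convert_func (local.get $f)))

(func $project (param $a anyref) (result funcref)
  ;; projection: without a type recording that $a holds a boxed function,
  ;; every access needs a checked downcast that can fail at run time
  (func.convert_any (local.get $a)))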