WebAssembly / gc

Branch of the spec repo scoped to discussion of GC integration in WebAssembly
https://webassembly.github.io/gc/

Two-Stage Plan for GC #72

Closed RossTate closed 2 years ago

RossTate commented 5 years ago

The GC extension has the tension that a good design will take a long time to develop and a quick design will impose backwards-compatibility constraints we will regret. Thanks to some productive prompting by Adam Klein, I had an idea for how to resolve this tension: explicitly design the MVP so that it can be easily compiled to the long-term solution, and explicitly design the MVP with the expectation that it will be phased out (using this compiler) once the long-term solution is available. Here's a rough sequence of events to illustrate what I mean:

  1. Design of Stage 1 GC extension
  2. Release of Stage 1 with notice that it will become deprecated
  3. Design of Stage 2 GC extension
  4. Implementation of Stage-1-to-Stage-2 compiler
  5. Release of Stage 2 with notice that Stage 1 is now deprecated but with hosts temporarily supporting Stage 1 by using the Stage-1-to-Stage-2 compiler
  6. Update of gc-wasm modules to Stage 2, either by directly compiling source code to Stage 2 (since this will enable far better performance and interop), or by simply using the Stage-1-to-Stage-2 compiler
  7. Remove support of Stage 1 from hosts

This will let us make a quick and dirty MVP without worrying about our decisions having unintended long-term consequences. In fact, I'd say what's already in the current GC proposal has the right ideas but is actually too complicated given this plan - I suspect we'd be able to strip out a bunch of features to get just the essentials for supporting typed GC languages (untyped GC languages will need lower-level features for good performance that the long-term proposal will provide). So I think this strategy would get us an MVP sooner and enable the long-term design to be better.

Questions? Clarifications? Thoughts? (I'm intentionally leaving conjectures about the specifics of the Stage 1 or 2 designs out of the discussion in order to focus on the high-level strategy for now - we can dive into the implications of this strategy on those designs later into the conversation when the appropriate time comes.)

rossberg commented 5 years ago

Hm, that already is our strategy of record. There is MVP.md, which is the attempt to compile exactly such a minimal feature set. It may be more complicated than you'd like (and honestly, than I'd like), but it is not clear what can be taken out without making it effectively useless. In fact, there is a lot of pressure to put in even more stuff.

> I suspect we'd be able to strip out a bunch of features to get just the essentials for supporting typed GC languages

I am not aware of any feature included in MVP.md that is not gonna be needed even for strongly typed languages. Can you give some examples? For example, I would love to take out casts and thus the whole business of runtime types, but no relevant compiler will get far without such an escape hatch. (FWIW, the actual problem I see with the current RTT design is that it is too monolithic and high-level for Wasm and too biased towards a certain brand of languages. I think I know how a more low-level mechanism could replace it, but not without introducing existential types.)

RossTate commented 5 years ago

> Hm, that already is our strategy of record.

Uhh, where is it stated that you plan on the MVP being no longer supported when v2 comes out? To be clear, I am explicitly stating that v1 should not be a sublanguage of v2. As I said, let's focus on the high-level strategy before going into details of design.

rossberg commented 5 years ago

Oh, okay, I missed that aspect of your suggestion.

The simple answer to that is the fundamental constraint of all Web evolution: Don't Break The Web. That is, you cannot ever remove features from the Web, because that breaks existing Web pages that may not be maintained anymore. So while we could go the route you suggest for experimental features (behind a flag), browsers would never want to officially ship stage 1. But then, what's the use?

RossTate commented 5 years ago

I'm not proposing a plan that will break the web. The final step can be ignored if it happens to be the case that a bunch of websites refuse to update to stage 2 even though they've been put on the slow path by sticking to stage 1. What's really important here is that the wasm engines can stop supporting stage 1.

rossberg commented 5 years ago

> What's really important here is that the wasm engines can stop supporting stage 1.

But that's the point: this is a plan for breaking the web. Asking websites to be updated is not a workable path on the Web. You absolutely must assume that there are web pages that won't be updated.

RossTate commented 5 years ago

wasm engine != browser. To support old websites, the browser feeds stage 1 gc-wasm to the stage-1-to-stage-2 compiler, and then the result of that goes to the engine. This can likely be streamlined so that the stage-1-to-stage-2 compiler can execute while the code is being downloaded and so on.
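A minimal sketch of that pipeline, under loud assumptions: `Module`, `Engine`, `translate_stage1_to_stage2`, and `browser_load` are hypothetical names invented for illustration, the `s1.`/`s2.` instruction prefixes are placeholders, and no real engine or browser exposes anything like this. It only shows the shape of the idea: the engine accepts Stage 2 alone, and the browser translates Stage 1 chunk by chunk before handing it over.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class Module:
    """Hypothetical stand-in for a gc-wasm module."""
    stage: int              # 1 or 2; an assumed version tag, not a real wasm field
    body: Tuple[str, ...]   # placeholder "instructions"


def translate_stage1_to_stage2(chunks: Iterable[str]) -> Iterator[str]:
    """Hypothetical translator: rewrites each chunk as it arrives,
    so translation could overlap with the download."""
    for chunk in chunks:
        yield chunk.replace("s1.", "s2.")


class Engine:
    """Stand-in engine that only understands Stage 2."""

    def instantiate(self, module: Module) -> str:
        if module.stage != 2:
            raise ValueError("engine only supports Stage 2")
        return f"running {len(module.body)} instructions"


def browser_load(module: Module, engine: Engine) -> str:
    """The browser, not the engine, carries the Stage 1 compatibility burden."""
    if module.stage == 1:
        module = Module(2, tuple(translate_stage1_to_stage2(module.body)))
    return engine.instantiate(module)
```

Under these assumptions, `browser_load(Module(1, ("s1.new", "s1.get")), Engine())` succeeds, while passing the same Stage 1 module straight to `Engine().instantiate` raises `ValueError`.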

rossberg commented 5 years ago

That's not a helpful distinction to make in practice; the support has to be maintained somewhere. Moreover, browsers and their engines typically can't afford anything that regresses their performance, be it for load times, execution, or otherwise. There is a lot of ugly game theory at play.

fgmccabe commented 5 years ago

I am personally not in favor of this approach. In fact, I believe that the GC proposal represents a kind of minimum of what is needed, even for untyped languages. Francis

lukewagner commented 5 years ago

Definitely agreed that we should not plan to ship any feature with any expectation of being able to unship it later.

Horcrux7 commented 5 years ago

I like the idea of a minimal GC step. For example, only array objects in a first phase.

But the reality is that the developers probably have no time for that. They are busy with other proposals.

RossTate commented 5 years ago

> Moreover, browsers and their engines typically can't afford anything that regresses their performance, be it for load times, execution, or otherwise. There is a lot of ugly game theory at play.

Is competition so bad that browsers are heavily competing to be the fastest on websites that are so unpopular that they aren't actively maintained enough to have been updated in over a year? (Not to mention that these websites must have been recently well maintained enough to have updated to gc wasm in the first place.)

@lukewagner, I removed the last step (even though I suspect it will be viable). What are your thoughts on the rest of the strategy?

lukewagner commented 5 years ago

Well, the strategy listed still seems to be about adding features to a stage 1 that become deprecated, which seems like an unfortunate thing to do on purpose. It's hard to consider the strategy in the abstract, though.

RossTate commented 5 years ago

Thanks everyone for the feedback both in this discussion and offline! Here is my (much less radical) revised strategy based on that feedback.

Without baking in language primitives like the JVM and CLR did (which I am not at all proposing), there is no way the MVP will have excellent performance for just about any (gc) programming language. But it doesn't need to have excellent performance to be an improvement for the web and for WebAssembly, given the horrible performance of compiling to JavaScript that has only been acceptable due to the huge efforts behind the JavaScript engines. The MVP need only have reasonable performance to give languages good access to the web and to enable reasonable interop with JavaScript and the DOM. This is not to say that reasonable performance is all the GC proposal should strive for, just that for now we should focus on just what is necessary for reasonable performance. At the same time, in our effort to develop the MVP quickly, we need to be mindful that our short-term design should not obstruct a long-term design that can grant the excellent performance everyone is hoping for from WebAssembly. Therefore, I propose that a design criterion for the MVP be minimal imposition of backwards-compatibility constraints, especially at run time. For example, if there are two solutions for a use case, one of which is more efficient but forever imposes a backwards-compatibility constraint on the run time, and the other of which is a little less efficient but much more likely to be forwards-compatible, we should go with the forwards-compatible design. The slight efficiency issue is a short-term problem that will be resolved before too long, whereas the run-time backwards-compatibility imposition will never be resolved.

How does that very different design strategy sound?

Horcrux7 commented 5 years ago

Can you give a sample for it? I am unclear what a subset has to do with efficiency.

> to enable reasonable interop with JavaScript and the DOM.

This sounds to me like the major problem of the current concept. The objects from JavaScript and Wasm are not compatible/interoperable.

lukewagner commented 5 years ago

IIUC, that's the existing (and, in general, wasm) plan, so.... +1?

RossTate commented 5 years ago

> This sounds to me like the major problem of the current concept. The objects from JavaScript and Wasm are not compatible/interoperable.

There are varying degrees of interop. One that would be useful for a number of applications is for wasm programs to be able to directly reference JavaScript/DOM objects and vice versa, while also letting the browser's garbage collector detect when these references can be cleaned up (even in the presence of cycles that are easily formed due to the DOM). A greater degree of interop would permit wasm programs to dereference JavaScript/DOM objects, but there are a number of barriers to enabling that (which I believe is the incompatibility you mentioned). So the MVP should address the former but not the latter (and it's not clear that the latter will ever be addressable).

> Can you give a sample for it? I am unclear what a subset has to do with efficiency.

Sorry, can you clarify your request? A sample for what? I never used the word subset, so I'm not sure what specifically you're referring to.

> IIUC, that's the existing (and, in general, wasm) plan

That's perfectly possible. In previous collaborations I've found it useful to have discussions like these to not only get everyone on the same page about design criteria, but even get everyone on the same page as to how to make design decisions that necessarily trade off desirable qualities. That is, it's easy to get everyone to agree on what they want, but it's harder to get people to agree on what should be done when desired goals come into conflict with each other. So I'm using the following example tradeoff as one way to check my understanding, based on this conversation, as to how the group would like conflicts between performance and forwards-compatibility to be resolved:

> For example, if there are two solutions for a use case, one of which is more efficient but forever imposes a backwards-compatibility constraint on the run time, and the other of which is a little less efficient but much more likely to be forwards-compatible, we should go with the forwards-compatible design.

With that clarification of intent, still +1 from you?

trusktr commented 4 years ago

Just curious, what would you say is the approximate timeline for WASM GC to land in a browser?

Horcrux7 commented 4 years ago

I think a minimum of 2 years. The first step will be reference types in the browser. I hope this will be available in the next year in the browser without any switches.

tlively commented 2 years ago

The proposal now has a productive implementer feedback loop and is progressing according to the standard proposal process.