WebAssembly / gc

Branch of the spec repo scoped to discussion of GC integration in WebAssembly
https://webassembly.github.io/gc/
Other
997 stars 72 forks source link

Add `(sub open ...)` and make it mandatory #423

Open tlively opened 1 year ago

tlively commented 1 year ago

Instead of using final? in the syntax, use ext ::= open | final. Implements the design discussed in #413. Update the interpreter, the abstract syntax, the binary and text formats, and MVP.md.

tlively commented 1 year ago

Current dependencies on/for this PR:

This comment was auto-generated by Graphite.

tlively commented 1 year ago

A somewhat radical late-stage proposal, but my experience with this PR tells me that it would be feasible: what if we changed "final" to "closed", so we had (sub open ...) and (sub closed ...). Alternatively, "final" and "nonfinal" or "final" and "extensible" would work. It's just slightly annoying to use "open" without also using its obvious antonym.

rossberg commented 1 year ago

I agree that the asymmetry is displeasing.

Open and closed have are rather overloaded terms wrt to types. In particular, a closed type is normally understood as one that doesn't have any free names/variables in it, and indeed, the spec already uses it that way.

Final on the other hand is the established terminology for this feature, and way clearer. Unfortunately it has no good antonym, and nonfinal seems pretty clumsy as a keyword.

Would folks find it less confusing if there wasn't an extra attribute that can be omitted separately at all, but we just have two alternative keywords for subtyping, say sub and sub_final? And omitting that defaults to the latter?

bvisness commented 1 year ago

I think that would more or less put us back where we started. I'm fine with the terminology in this PR even though open and final are not perfect opposites. I would be ok with final and nonfinal too - the verbosity doesn't bother me at this point.

bvisness commented 1 year ago

Further bikeshed options: ext and noext, sub and nosub (lol)

tlively commented 1 year ago

My favorite alternative pair so far is "final" and "nonfinal", but I'd also be fine going with "final" and "open" as in this PR.

rossberg commented 1 year ago

Well, nosub would be misleading, because it generally is a subtype, it just is a final one. Thus my suggestion to call it sub_final (because finalsub puts the primary thing last and hence feels odd to me). But honestly, the more I think about all the alternatives, the more I think we should just leave it as is...

rossberg commented 1 year ago

Yeah, nonfinal also seems odd, since it makes it sound as if the special case (final) was the normal case.

tlively commented 1 year ago

Yeah, nonfinal also seems odd, since it makes it sound as if the special case (final) was the normal case.

I think this is a feature, not a bug. Since the MVP type declarations expand to "final," I think it is helpful as a mental model to think of it as the normal case.

rossberg commented 1 year ago

It is the normal case for types that do not have (non-trivial) supertypes (thus the shorthand), but it is not the normal case for types that actually need to write out sub. This discrepancy probably is the root of the discomfort that some have. But I've come to think that, unfortunately, any attempt to square that circle is not gonna be better overall.

eqrion commented 1 year ago

I agree that the grammatical asymmetry of "final | open" is a little unfortunate, but it seems to be the least objectionable to folks here. The exact wording to use isn't as important to me as avoiding the surprising inversion when you add a '(sub)' around a type, and I like that this PR addresses this.

So I would vote that we just go with the grammar in this PR, and go with something that's an improvement over the status quo.

rossberg commented 1 year ago

@eqrion, I guess I'm still a bit confused about what exactly the surprise is β€” when you add a sub, you can subtype it further. Why is that not what you'd expect by default?

bvisness commented 1 year ago

To readers of the text format, I think the word sub simply seems to indicate a list of super types. I read it like "subtype of foo". It doesn't appear to say anything about extensibility. So, once you know sub exists, it's natural to think that (type (struct (field i32))) is an abbreviation for (type (sub (struct (field i32)))). But of course this is not actually true, because the former type is final while the latter type is not.

Also, the fact that (type blah) is final today implies to unfamiliar readers that types are final "by default". Arguably this is literally true, since most types will probably continue to be encoded with the shorter binary encoding and therefore be final. It's therefore surprising that a sub that says nothing about extensibility disagrees with that "default".

At the end of the day, an empty sub looks like it should be unnecessary, but it is not, and that's surprising to me. I think this PR clears it up nicely.

titzer commented 1 year ago

I agree with @rossberg that we should avoid the use of open and closed, because those would become confusing in the presence of type imports (the type variables Andreas alludes to).

eqrion commented 1 year ago

I agree with @rossberg that we should avoid the use of open and closed, because those would become confusing in the presence of type imports (the type variables Andreas alludes to).

Ah, I missed that this argument would apply to open too, not just closed.

Thinking about the text format here though, the open or closed keywords would be attributes on (sub) not (type), which I think would make it clear that this is modifying the subtyping rules of the type, not saying anything about type variables. In a future with type variables, we could add open or closed keywords on another part of the AST easily, and I don't think it would be confused.

rossberg commented 1 year ago

@eqrion, to clarify, "open" and "closed" types are meta-level notions, not features that would arise as keywords. The spec defines – as is common practice and basic standard terminology – a "closed type" to be one in which all type indices have been substituted by their definition, and that applies to all sorts of syntactic classes of types. "Open" is less commonly used in that context (e.g., the spec does not), so would be okayish, but it would still give rise to the notion of "closed open sub types", which could be confusing. I could probably live with that, but avoiding it would be preferable, especially when the motivation for it is somewhat borderline.

@bvisness, I could see that point if final was meaningful without sub, but it is really just an attribute on it. I suspect that requiring an extra keyword for all normal uses of sub will cause way more accumulated surprise (and annoyance) to more users on the grand scale of things, especially since there are very few use cases for ever caring (in Wasm) about finality for a type that falls outside the shorthand. But fine by me.

A keyword for "non-final" should ideally be (1) short, (2) unsurprising, (3) not conflicting with other notions. Personally, I still think that the "empty keyword" is the best candidate, especially after I made various searches for antonyms of "final" or "complete" and synonyms of "open" or "extensible" β€” nothing compelling comes up. Choices like ext or open are long distance second places.

eqrion commented 1 year ago

@bvisness, I could see that point if final was meaningful without sub, but it is really just an attribute on it.

I think if you come at it from first knowing the abstract syntax tree the spec uses, it makes sense. But having explained it to multiple new-users within SpiderMonkey, it has always caused confusion along the lines that @bvisness has said.

I suspect that requiring an extra keyword for all normal uses of sub will cause way more accumulated surprise (and annoyance) to more users on the grand scale of things, especially since there are very few use cases for ever caring (in Wasm) about finality for a type that falls outside the shorthand. But fine by me.

I think there's a valid argument against this for verbosity reasons. But again from my actual experience teaching others, being explicit about this is less surprising. It's difficult to compare tradeoffs in 'surprising' with 'verbose', but I'd personally prefer 'verbose'.

A keyword for "non-final" should ideally be (1) short, (2) unsurprising, (3) not conflicting with other notions. Personally, I still think that the "empty keyword" is the best candidate, especially after I made various searches for antonyms of "final" or "complete" and synonyms of "open" or "extensible" β€” nothing compelling comes up. Choices like ext or open are long distance second places.

If the keyword open must be avoided, I'd be fine with final|ext|extensible or final|nonfinal. GC type sections are already super verbose and difficult to read, adding ext or even extensible will not really make this significantly worse in real world code or samples.

tlively commented 1 year ago

Yes, "nonfinal" is clumsy as a keyword, but it is also completely obvious that it is the counterpart to "final" and it is not overloaded with any other meaning. @eqrion's experience that that status quo is confusing to newcomers aligns with my experience as well. Could we all live with "final" and "nonfinal"?

titzer commented 1 year ago

I generally don't have too strong a preference for considerations in the text format, but I do find that when the text format differs from internal engine representations, it's confusing juggling two concepts.

So I would prefer that the text format use a convention that is likely to be reflected in engines/implementations/the spec, and AFAICT "final" and "nonfinal" fit that criteria well. Those have the added advantage of closely mirroring final classes in, e.g. Java, which is a concept that many people already know.

rossberg commented 1 year ago

@titzer, I believe we have already converged on keeping final. The open question is about the inverse keyword, and whether we require one at all. For more context, we do not have nonnull or nonmut. Not having to spell out nonfinal for basically every subtype declaration would mirror the likes of Java even more closely.

@all, considering null and mut made me notice that we should maintain overall consistency for flag-like syntax as well. So here is a suggestion for a compromise that achieves that: we introduce explicit nonfinal, nonnull, and nonmut keywords, but in all cases it can be omitted as a shorthand, defaulting to the non-version.

I acknowledge that this isn’t exactly what some folks asked for, but so is the nature of a compromise under conflicting goals. You can be explicit, e.g., ignore the shorthand when explaining the concepts, but we also keep the syntax more consistent at large, and don't bother experienced users with nuisance. WDYT?

eqrion commented 1 year ago

Talked about this internally a bit and this is how we're feeling right now:

We don't think this addresses the confusion we've seen around the default flag for an empty 'sub' being different from when you omit 'sub'. And that's really the main motivation for a change here.

But we also don't think this is worth the continued energy discussing this, and it would at least be useful to have a 'non-final' (would it be possible to hyphenate it?) keyword we could use when teaching new users about this.

We don't have any opinion on the other inverted keywords and would be fine if they are added or left out.