WebAssembly / component-model

Repository for design and specification of the Component Model

Make function types more general and symmetric #356

Open rossberg opened 1 month ago

rossberg commented 1 month ago

Currently, the grammar for function types is as follows:

functype      ::= (func <paramlist> <resultlist>)
paramlist     ::= (param "<label>" <valtype>)*
resultlist    ::= (result "<label>" <valtype>)* | (result <valtype>)

All parameters and results must be named, except a singleton result.

There is broad variety across programming languages in whether and how they allow/require/distinguish named vs unnamed parameters, as well as named vs unnamed results, at a function's definition site, use site, or both. The current design is rather specific in that regard and somewhat biased. From purely an interface perspective, there is no reason to treat parameters differently from results, or to allow omitting names in some cases but not others. And at least for some languages, it would be useful to make explicit which components have "proper" names and which ones haven't, since that enables more idiomatic bindings.

See here for more discussion.

There are several degrees to which the grammar could be generalised:

paramlist     ::= (param "<label>" <valtype>)* | (param <valtype>)
resultlist    ::= (result "<label>" <valtype>)* | (result <valtype>)

or

paramlist     ::= (param "<label>" <valtype>)* | (param <valtype>)*
resultlist    ::= (result "<label>" <valtype>)* | (result <valtype>)*

or

paramlist     ::= (param "<label>"? <valtype>)*
resultlist    ::= (result "<label>"? <valtype>)*
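To make the difference between the three options concrete, here is a rough Python sketch (not part of the proposal) that checks which option would accept a given list of labels. The helper name `accepted_by` and the "option 1/2/3" keys are made up for illustration, and `None` stands for an omitted label.

```python
from typing import Optional, Sequence

# One entry per param (or result); None means the label was omitted.
Labels = Sequence[Optional[str]]


def all_named(labels: Labels) -> bool:
    return all(label is not None for label in labels)


def all_unnamed(labels: Labels) -> bool:
    return all(label is None for label in labels)


def accepted_by(labels: Labels) -> dict[str, bool]:
    """Report which of the three generalised grammars accepts this list."""
    singleton_unnamed = len(labels) == 1 and labels[0] is None
    return {
        # (param "<label>" <valtype>)* | (param <valtype>)
        "option 1": all_named(labels) or singleton_unnamed,
        # (param "<label>" <valtype>)* | (param <valtype>)*
        "option 2": all_named(labels) or all_unnamed(labels),
        # (param "<label>"? <valtype>)*
        "option 3": True,
    }


print(accepted_by([None, None]))  # only options 2 and 3 accept this
print(accepted_by(["a", None]))   # only option 3 accepts a mix
```

The only list shapes that separate the options are "several unnamed" (rejected by the first) and "mixed named and unnamed" (accepted only by the third).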

I'd suggest one of the latter two, which make names uniformly optional, and then specify a canonical scheme for synthesised names in contexts/bind-gens that need them, for example, "_1", "_2", etc. based on position. This would apply symmetrically to parameters and results.
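A minimal sketch of what such a canonical naming scheme could look like, in Python. The function name `synthesise_names` is hypothetical, and two details are assumptions rather than anything decided here: positions are 1-based and count every entry (named or not), and explicit labels are assumed never to start with an underscore, so synthesised names cannot collide with them.

```python
from typing import List, Optional, Sequence


def synthesise_names(labels: Sequence[Optional[str]]) -> List[str]:
    """Keep explicit labels; name each unlabelled entry after its 1-based position."""
    return [
        label if label is not None else f"_{position}"
        for position, label in enumerate(labels, start=1)
    ]


print(synthesise_names([None, None]))       # ['_1', '_2']
print(synthesise_names(["request", None]))  # ['request', '_2']
```

The same function would be applied to a parameter list and a result list alike, which is the symmetry being asked for.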

lukewagner commented 1 month ago

Personally, I like the symmetry in the abstract, and I can think of a few times where, when writing a WIT interface, I feel like I'm being forced to add a parameter name that adds no value (e.g., handle: func(request: request) -> result<response, error-code>). I do worry, though, that the additional degree of freedom invites more subjective stylistic variation (some folks are going to want to give everything a parameter name, others are going to want to leave them off by default). E.g., WASI would need to establish a style guideline on this with the criteria for named-vs-unnamed.

That being said, looking at all the WIT interfaces I'm seeing being written in practice, I basically never see use of non-empty (result "label" <valtype>)+. I expect most folks don't even know it's possible and default to defining a record that is returned, leading to the stylistic question of "when should you return a record vs. use multi-named-return?". This has made me wonder whether we should actually lean into the asymmetry harder and deprecate multi-return, so that resultlist ::= ϵ | (result <valtype>).

I can see arguments for both cases; it makes me think that perhaps the current state of the proposal is stuck in a sort of uncanny valley between fully embracing symmetry and fully embracing asymmetry, and so we should shift to one side or the other. But which one I'm not sure. I'd be interested to hear more thoughts on this!

oovm commented 1 month ago

What confuses me is whether the return value with label is a tuple class or an anonymous class.

If it is a tuple class, then you can actually express the real tuple<f32, f32>.

If it is an anonymous class, are (a: u32, b: f32) and (b: f32, a: u32) equivalent?

lukewagner commented 1 month ago

As it is currently, since the string names of params and results are part of the function type and thus part of function-type-equality, the latter two are distinct (when used as a function's results), and so you'd think of them more like a record type.
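As a rough illustration of that point, here is a toy Python model (not the actual spec machinery) in which labels are part of a function type's structure, so equality is sensitive to both names and order. `FuncType` and `Field` are made-up names, and valtypes are written as plain strings for brevity.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# (label, valtype); the valtype is just a string in this toy model.
Field = Tuple[Optional[str], str]


@dataclass(frozen=True)
class FuncType:
    params: Tuple[Field, ...]
    results: Tuple[Field, ...]


ab = FuncType(params=(), results=(("a", "u32"), ("b", "f32")))
ba = FuncType(params=(), results=(("b", "f32"), ("a", "u32")))
ab_again = FuncType(params=(), results=(("a", "u32"), ("b", "f32")))

print(ab == ba)        # False: same labelled fields, different order
print(ab == ab_again)  # True: labels, types, and order all match
```

In this model, swapping the two labelled results yields a different function type, which matches the "more like a record" reading above.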