dart-lang / language

Design of the Dart language

Make specification of desugaring well-defined. #227

Open eernstg opened 5 years ago

eernstg commented 5 years ago

The language specification currently uses the phrase 'is equivalent to' in a number of cases involving two syntactic forms. For instance:

An \ON{}-\CATCH{} clause of the form 
\code{\ON{} $T$ \CATCH{} ($p_1$) $s$}
is equivalent to an \ON{}-\CATCH{} clause
\code{\ON{} $T$ \CATCH{} ($p_1$, $p_2$) $s$}
where $p_2$ is a fresh identifier.
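
As an illustration of the quoted rule, here is a minimal Dart sketch. The function names `onePar` and `twoPar` and the identifier `freshStackTrace` are made up for this example; `freshStackTrace` stands in for the fresh identifier $p_2$, which the body never references, so both clauses should behave the same.

```dart
// One-parameter form, as a programmer would write it.
String onePar() {
  try {
    throw const FormatException('bad input');
  } on FormatException catch (e) {
    return 'caught: ${e.message}';
  }
}

// Two-parameter form that the clause above is treated as.
// `freshStackTrace` plays the role of the fresh identifier p2;
// nothing in the body refers to it.
String twoPar() {
  try {
    throw const FormatException('bad input');
  } on FormatException catch (e, freshStackTrace) {
    return 'caught: ${e.message}';
  }
}

void main() {
  // Both forms are expected to yield the same string.
  print(onePar() == twoPar());
}
```

The relationship is directional: the one-parameter clause is explained in terms of the two-parameter clause, not the other way around.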

The intention is that the language can be slightly simplified by a process which is often designated as "desugaring", by specifying that certain constructs are just syntactic abbreviations of some other constructs (where the unfolding is allowed to go beyond syntax in the sense that static semantic information like types can be used to determine whether and how to unfold). Having said this, we have then specified the sugared form fully, as long as we can find the specification of the desugared form in other parts of the language specification.

The inherent symmetry of the phrase 'is equivalent to' is not appropriate for this purpose, because it blurs the roles. It leaves questions unanswered, for instance: "Which form is sugared and which one is desugared?", "Is this a desugaring relationship at all?", and "Do we enforce that this relationship has no cycles?".

This issue is concerned with a modification of the language specification to use 'is treated as' rather than 'is equivalent to', and at the same time ensure that we have the correct layering (such that the desugared language is well-defined, e.g., by not having any cycles).

lrhn commented 5 years ago

We should still strive to avoid "desugaring" which introduces significantly new expressions or syntactic forms, or which nests expressions in new ways. That is, we should avoid adding anything that is not effectively ignored. The fresh variable binding here is fine, because we know there are no references to it anywhere.

Examples where introducing new syntax or context has failed us in the past:

 e1 ?? e2   ⟼   ((v) => v == null ? e2 : v)(e1)

This was a valid rewrite until we introduced async code, then it became invalid if e2 contained an await.
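
A small Dart sketch of why the closure-based rewrite breaks down. All names here (`viaOperator`, `viaClosure`, `fallback`, `asyncViaOperator`) are hypothetical and exist only for this illustration; the second operand is passed as a function so that both forms evaluate it lazily.

```dart
// Direct use of the operator.
int viaOperator(int? e1, int Function() e2) => e1 ?? e2();

// Closure-based rewrite: bind e1 to v once, then test v.
int viaClosure(int? e1, int Function() e2) =>
    ((int? v) => v == null ? e2() : v)(e1);

// Hypothetical asynchronous fallback value.
Future<int> fallback() async => 42;

// With the operator, the second operand may contain an await,
// because it sits directly in the enclosing async function body.
Future<int> asyncViaOperator(int? e1) async => e1 ?? await fallback();

// The rewritten form cannot host the same operand: the await would
// end up inside a non-async closure body, so this does not compile.
//
// Future<int> asyncViaClosure(int? e1) async =>
//     ((int? v) => v == null ? await fallback() : v)(e1);

Future<void> main() async {
  // In synchronous code the two forms agree.
  print(viaOperator(null, () => 7) == viaClosure(null, () => 7));
  print(viaOperator(1, () => 7) == viaClosure(1, () => 7));
  // Only the operator form works once an await is involved.
  print(await asyncViaOperator(null));
}
```

The rewrite moves `e2` into a new function body, and that change of context is what makes the `await` invalid.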

 x++  ⟼   x = x + 1

This used to be correct, but if read verbatim, it would change meaning when we introduced integer numerals as double literals, if the argument type of x.+ is double. The code x++ is not intended to be equivalent to x = x + 1.0, but by doing a syntactic rewrite, we open the rewritten, newly introduced, code up to further source interpretation.
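
To make the concern concrete, here is a sketch with a made-up class `Length` whose `+` operator takes a `double`. It only shows the explicitly written form `x = x + 1`, where the numeral `1` sits in a `double` context and is read as a double literal. Whether `x++` should get the same treatment is exactly the question raised above, so the sketch deliberately does not use it.

```dart
// Hypothetical class for illustration: `+` accepts a double argument.
class Length {
  final double value;
  const Length(this.value);
  Length operator +(double delta) => Length(value + delta);
}

void main() {
  var x = const Length(2); // numeral 2 in a double context
  // Written out by hand, the numeral 1 is in a double context too,
  // so this source text means x = x + 1.0.
  x = x + 1;
  print(x.value == 3.0);
}
```

A purely textual rewrite of `x++` into `x = x + 1` would hand the introduced `1` to the same context-dependent interpretation, even though the programmer never wrote that numeral.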

So, we should make sure to restrain our syntactic sugar to situations where such problems are unrealistic, or specify it in a way that precludes further changes due to, e.g., inference.

eernstg commented 5 years ago

@lrhn wrote:

We should still strive to avoid "desugaring" which introduces significantly new expressions or syntactic forms

Agreed! We'd get far too many surprising effects if we are not careful. But we do have a number of simple cases (for example, consider C?.foo(42) where C denotes a class; we specify that to be treated as C.foo(42), and I think that's a meaningful way to say "we just don't care about a ? on the type literal in a static method invocation").