melt-umn / silver

An attribute grammar-based programming language for composable language extensions
http://melt.cs.umn.edu/silver/
GNU Lesser General Public License v3.0
58 stars 7 forks source link

Simplifying Silver's possibly-decorated type inference semantics #751

Open krame505 opened 1 year ago

krame505 commented 1 year ago

The current state of affairs is that whether a reference to a decorable child or local[^local] gets a decorated type is inferred through a "fun trick" in the type system. This can be confusing[^decinf], and we also make it far too easy IMO to un-decorating a tree that has already been decorated. Since we want to make only decorating something in one place the new normal, one should need to be explicit by writing new when they actually do wish to copy a tree.

[^local]: By local I really mean local or production attribute in this writeup. [^decinf]: Citation: @remexre complaining about "decoration inference", which is really just part of the type system and not its own thing, but I digress.

Background

As a quick refresher, a type is decorable when it is:

The "base type" of a decorable type is the nt in Decorated! nt with i for a unique reference type, or else just the nonterminal/type constant.

The way it currently works is that a use of a child or local declared with a reference type gets a possibly-decorated type (ntOrDecType is the name of the production, but it doesn't have syntax or appear in error messages.) A ntOrDecType for base type nt is permitted to unify with nt, Decorated nt with i or Decorated! nt with i, and after unification the type gets replaced with what it was specialized to - the ntOrDecType constructor has an extra hidden type variable to track this. ntOrDecTypes that are still unspecialized after unification default to Decorated nt with {}.

This does pose a problem when we write e.g. x == y where x and y are both possibly-decorated nts, where there is an instance for Eq nt. Since eq :: (Boolean ::= a a) is polymorphic, we never specialize x and y to be undecorated, and get the default Decorated nt with {}, which doesn't have an Eq instance. As a gross workaround for this, I made function application check if the applied function type has any type constraints that are only satisfiable if specialized to an undecorated type, and if so does the specialization.

We also have special semantics for "lexical local variables", which are anything bound in a let/pattern match. Anything bound with a decorated type is permitted to implicitly undecorate, meaning these also get an ntOrDecType type. We do want to disallow casting between Decorated types with different sets of inherited attributes, thus the ntOrDecType constructor also has an InhSet type member to control what the reference set should effectively be when unifying with a Decorated/Decorated! type.

This state of affairs almost always does the right thing, but can lead to some truly perplexing behavior for someone who doesn't understand what is going on behind the scenes. Having separate top-level types for Decorated and Decorated! is also suboptimal, e.g. we need both new :: (a ::= Decorated a with i) and newUnique :: (a ::= Decorated! a with i).

Proposal

First, Decorated and Decorated! are combined into a single top-level type constructor Decorated :: (Uniqueness -> InhSet -> * -> *. Uniqueness is determined by a new kind Uniqueness, which has types ! for unique and ~[^syntax] for not unique. The current syntax can stay as syntactic sugar, i.e. Decorated a with i forwards to Decorated<~ i a> and Decorated! a with i forwards to Decorated<~ i a>.

[^syntax]: I don't love that syntax, I just needed something to use for now. The first idea I had was * but that could cause confusion with kind *. I'm open to suggestions.

Now for any use of a child/local x bound with a decorable base type nt, we infer the type Decorated<u i nt>. This can unify with a unique reference type (as when writing @x), unify with a non-unique reference type, or remain unspecialized (as in the case of an attribute access on x.) Any unspecialized Uniqueness type var would default to ~, and any unspecialized InhSet type var would still default to {}, for the purposes of the the flow/uniqueness analyses.

To get a fresh undecorated version of a child/local, e.g. in a functor attribute equation or strategy attribute rewrite rule, one now would need to write new(x). In unique contexts such as forward and local equations one would typically write @x.

Pattern variables would also be always decorated by default, so you would also sometimes need to write new when pattern matching. Similarly, let-bindings would lose their auto-undecoration behavior and just always give the bound type - this was always a bit weird and mostly an artifact of how pattern matching is implemented.

Discussion

One downside of this is that strategy attribute rewrite rules get a bit more verbose, as one frequently wishes to copy a subtree. That's probably fine, the user should be aware that they are potentially causing a tree to be re-decorated anyway.

Getting decorated types for "data" nonterminals by default is quite unfortunate, e.g. if a production/function has a Maybe<a> child/parameter. The fix for this is probably introducing some sort of non-decorable variant of nonterminal as @remexre has suggested (#556). This wouldn't permit inherited attributes, but probably could still have synthesized attributes with empty flow types. As an optimization the synthesized attributes could be cached on the undecorated term. This would also mean things like the Env nonterminal in Silver/ableC could be used as just Env and not Decorated Env.

Note that deriving equality by propagating compareTo and isEqual would no longer work for data nonterminals, but this was already broken for "record nonterminals" using annotations. We might consider adding a more classic deriving extension for Eq and Ord that just does pattern matching and does compare annotations.

Sometimes we write helper functions for constructing trees, where we don't really want to decorate the parameters in the first place. Needing to undecorate them is already problematic in case the operands contain unique references. The solution to this is probably to actually finish #454, adding concise function syntax which doesn't decorate the parameters.

Overall this would be a pretty significant breaking change for Silver, but would reap significant benefits for future maintainers/users. I am in favor of starting on this as soon as decoration site flow projections (and potentially also forward production attributes) are complete, which would be a good point for another Silver release. It would also be good to finish moving ableC and extensions to use origin tracking before merging another major breaking change.

remexre commented 1 year ago

Initial thoughts:

Decorated :: (Uniqueness -> InhSet -> * -> *

:+1:

Pattern variables would also be always decorated by default, so you would also sometimes need to write new when pattern matching. [...] Getting decorated types for "data" nonterminals by default is quite unfortunate, e.g. if a production/function has a Maybe child/parameter.

Maybe a new kind of pattern that news the value it matched out would make sense, e.g.

case foo of
| Just ~x -> x
| Nothing -> 0
end

Using ~ and ! for this would be fortunate in the sense of mirroring Uniquenesses, unfortunate in that Haskell already has meanings for those in patterns... (introducing extra laziness and extra strictness into a case, respectively).

We might consider adding a more classic deriving extension for Eq and Ord that just does pattern matching and does compare annotations.

Firmly in the "future planning" category, but if we do, it'd be nice to leave the syntax open to specifying a "deriving strategy". Haskell's ended up having a few of these, and in the Rust world, user libraries can provide their own as derive macros.

At least in the Rust world, this has reduced the need for an equivalent to GHC.Generics quite a lot, and probably fits with our extensible-languages goals better.

krame505 commented 11 months ago

Well, there are a bunch of syntactic ambiguities (and potential confusion) if we have both Decorated by itself and the syntactic sugar Decorated a with i as type expressions. I'm thinking the standalone type constructor syntax should be a different keyword like just Dec - such that Decorated a with i would forward to Dec<~ i a>. One would not typically write Dec directly, outside of some utilities that are polymorphic in uniqueness, and maybe some special type class instances.

krame505 commented 11 months ago

Current status: I've made the changes to implement the new semantics, which was relatively straightforward. I'm currently in the middle of refactoring Silver to be able to build itself again.

krame505 commented 11 months ago

A couple notes from doing the refactoring in Silver:

krame505 commented 2 months ago

The above is still pretty much the plan, except that unique references are being removed entirely, and the syntax for the Decorated type is not changing.

krame505 commented 1 month ago

As was discussed offline about a month ago, removing ntOrDecType still leaves the issue of resolving unspecialized InhSet type variables. One option is to introduce another "magic" type that forwards to the empty set of inherited attributes if left unspecialized, but this would require the same grossness as ntOrDecType that we are getting rid of. For now the best solution I can think of is to add a second type checking pass (upSubst2/downSubst2) that just specializes any remaining fresh InhSet type variables to the empty set.