melt-umn / silver

An attribute grammar-based programming language for composable language extensions
http://melt.cs.umn.edu/silver/
GNU Lesser General Public License v3.0
58 stars 7 forks source link

Dispatch productions and tree sharing #812

Closed krame505 closed 3 months ago

krame505 commented 10 months ago

I am considering a pretty major change to tree sharing, related to dispatching productions.

Background

One might currently define an overloaded operator as

production negOp
top::Expr ::= e::Expr
{
  e.env = top.env;
  forwards to e.typerep.negProd(e);
}
production negInt
top::Expr ::= e::Decorated! Expr with {env}
{
  undecorates to negOp(new(e));
  top.errors = e.errors;
  top.type = e.type;
  ...
}
production negBool
top::Expr ::= e::Decorated! Expr with {env}
{
  undecorates to negOp(new(e));
  top.errors = e.errors;
  top.type = e.type;
  ...
}

synthesized attribute negProd::(Expr ::= Decorated! Expr with {env});
nonterminal Type with negProd;
production intType
top::Type ::=
{
  top.negProd = negInt;
}
production boolType
top::Type ::=
{
  top.negProd = negBool;
}

This makes use of tree sharing through unique references, such that the operand e gets decorated with env prior to forwarding, and any other inherited attribute not needed to determine the forward tree are supplied after forwarding.

Problem

The problem with this approach is evident when an extension wants to introduce an analysis involving a new inherited attribute on Expr, and wants to define the inherited attribute on the forwarding production:

production implies
top::Expr ::= a::Expr  b::Expr
{
  top.inline = implies(a.inline, b.inline);
  forwards to orOp(negOp(@a), @b);
}

inherited attribute subs::[(String, Expr)] occurs on Expr;
functor attribute inline occurs on Expr;
propagate subs, inline on Expr;

aspect production negOp
top::Expr ::= e::Expr
{
  e.subs = top.subs;
}
...

Here implies decorates a and b through forwarding with tree sharing, and the extension writer wants the inline attribute to preserve implications. More realistically, implies and inline could be defined in different extensions, one of which is built with knowledge of the other.

To access inline on a, a must first be decorated with subs. This requires negOp to provide an equation for subs, which depends on top.subs[^1]. However, this attribute isn't in the reference set of the reference taken in negOp, potentially leading to hidden, unexpected transitive dependencies when accessing subs on a reference. This could lead to a missing equation error in the following (a bit contrived) example of another extension providing an overload for negOp:

production negThing
top::Expr ::= e::Decorated! Expr with {env}
{
  undecorates to negOp(new(e));
  e.subs = [("a", intLit(42))];
  forwards to e.inline;
}

Here the equation for e.subs has no dependencies; if the forward flow type of Expr is {env}, then this production satisfies the MWDA. However, consider what happens if one computes decorate negOp(var("a")) with {env = [("a", thingType())]}.type, demanding the forward for negThing without supplying top.subs. The equation for e.subs in negOp, supplied before forwarding, takes precedence over the equation in negThing. This would lead to a missing equation error at runtime for subs.

To prevent this sort of problem, the flow analysis currently[^2] restricts the dependencies of any inherited equation supplied to a tree for which a unique reference is taken, when the inherited attribute is not in the reference set. The permitted dependencies for the inherited equation are the dependencies of taking the reference ({env} in this example), plus when the reference has a decoration site flow vertex ("we know where this reference is being decorated"), we also permit the projected dependencies of the attribute being supplied to the reference at its decoration site. This ensures that any access of an inherited attribute on a reference will never lead to more dependencies than are locally known.

This check would flag the equation for subs in negOp would as a flow error, since the reference taken in negOp depends on top.env, and the equation depends on top.subs. This is a rather significant restriction, as there is no way to define inline on implies without re-decorating a. Doing so would lead to exponential forwarding as e.g. inline on negOp is computed through forwarding based on the type of the operand, which could be computed through the forward of implies.

Note that the problem really lies with the negThing production - if it is always the target of forwarding by negOp production, the equation it supplies for e.subs has no effect and should really be flagged as a duplicate equation. To make this work the negThing extension should really define its own version of subs/inline, or re-decorate e, but there is no way of enforcing this at the moment.

Proposal

First of all, we can eliminate unique references as types; the @ operator can be changed to just require its operand to have a vertex type, i.e. be a reference to a child, local, or translation attribute instance, with some ordinary Decorated type. We also can simply require every use of tree sharing to have a decoration site flow vertex type, which greatly simplifies things. This is actually more in line with the simplified analysis we presented in the paper.

A different approach is needed for tree sharing with dispatch productions:

dispatch UnaryOp = Expr ::= @e::Expr;

production negOp
top::Expr ::= e::Expr
{
  e.env = top.env;
  forwards to e.typerep.negProd(e);
}
production negInt implements UnaryOp
top::Expr ::= e::Expr
{
  top.errors = e.errors;
  top.type = e.type;
  ...
}
production negBool implements UnaryOp
top::Expr ::= e::Expr
{
  top.errors = e.errors;
  top.type = e.type;
  ...
}

synthesized attribute negProd::UnaryOp;
nonterminal Type with negProd;
production intType
top::Type ::=
{
  top.negProd = negInt;
}
production boolType
top::Type ::=
{
  top.negProd = negBool;
}

We can declare a new "dispatch" type for productions that are specifically the forward of some particular production(s). UnaryOp is declared with signature Expr ::= @e::Expr, where the operand e is marked as shared. The negInt, negBool, etc. productions explicitly implement UnaryOp, meaning that these productions have type UnaryOp rather than the function type from their signature. Note that UnaryOp only has one shared child, but there are some cases where we may only want to supply attributes and dispatch on some children before forwarding, and have other children be only decorated after forwarding. For example in the Silver compiler, we would have

dispatch AttributeDef = ProductionStmt ::= @dl::DefLHS @attr::QNameAttrOccur e::Expr;

Function application is overloaded for dispatch types, to permit calling them like ordinary functions. Every shared child in the dispatch signature must be supplied as the Decorated nonterminal type, and every non-shared child as the normal type. The application of a dispatch type has decoration site flow vertices for the operands like a normal production, to allow for tree sharing. All applications of a dispatch must be exported by the declaration of the dispatch, and the application must be in a root position of a forward or forward production attribute equation.

This means that a production implementing a dispatch can make use of an inherited attribute on a shared child, and we can check for all applications of the dispatch, that an inherited attribute exists for the child in all these forwarding productions. We can also collect the lhs inherited attribute dependencies of any inherited attributes supplied to these shared children in the forwarding productions, and include these dependencies in all productions implementing the dispatch - this will require a new sort of stitch point.

Note that undecorates to is no longer needed in implementation productions, as these productions can only be constructed as the root of the forward tree. Thus the only way that one of these productions could be decorated twice is if the forward tree was explicitly un-decorated and re-decorated by the dispatching production. I don't know why someone would ever want to do this, but if they did then this shouldn't have an effect on the original forward tree, as shared children would have already been decorated with all the same equations.

Discussion

Using inherited equations from before forwarding

Extension inherited attributes supplied before forwarding in a dispatching production can now be utilized after forwarding, without needing to supply equations in the implementation production. One could propagate subs in negOp, and define inline in negInt, negBool, etc. without defining subs in these productions. In fact, we might want to include a duplicate equation check if an attribute supplied in all forwarding productions for a dispatch is also supplied in a production implementing the dispatch. This would flag the negThing production as doing something problematic:

production negThing implements UnaryOp
top::Expr ::= e::Expr
{
  e.subs = [("a", intLit(42))];
  forwards to e.inline;
}

This can be fixed if the negThing extension introduced its own version of the subs and inline attributes.

Hidden transitive dependencies and overloading

With these changes, the "hidden transitive dependencies" check would still need be needed when an inherited equation is defined on a child/local shared with the @ operator. A similar problem can now also arise if an implementation production attempts to share a child that was shared from the original dispatch production:

production negFoo implements UnaryOp
top::Expr ::= e::Expr
{
  forwards to call(name("neg_foo"), consExpr(@e, nilExpr()));
}

Consider if the inlining extension didn't want to inline inside function calls for some reason:

aspect production call
top::Expr ::= f::Name a::Exprs
{
  top.inline = call(f, a.inline);
}
aspect production consExpr
top::Exprs ::= h::Expr t::Exprs
{
  top.inline = consExpr(h.inline, t.inline);
  h.subs = [];
}

The inlining extension assumes that a.inline in call has no dependencies, and another extension might aspect call and rely on this, but in fact there is a hidden transitive dependency on a.subs. To prevent this, we need to restrict that children of an implementation production that were shared in the dispatch signature cannot generally be shared; they can only be shared as the corresponding child of another production that also implements the same dispatch. For example, one could still write this:

abstract production implicitSynAttributeDef implements AttributeDef
top::ProductionStmt ::= dl::DefLHS attr::QNameAttrOccur e::Expr
{
  e.env = top.env;
  e.downSubst = top.downSubst;
  ...
  forwards to synthesizedAttributeDef(dl, attr, e.monadRewriting);
}

However, there isn't a way to implement negFoo without re-decorating e. This problem is avoided by the alternative version of operator overloading that binds the operands as let expressions:

dispatch UnaryOp = Expr ::= @e::Expr;

production negOp
top::Expr ::= e::Expr
{
  e.env = top.env;
  local eVar::Name = freshName();
  forwards to
    case e.typerep.negProd, e.typerep.negOverload of
    | just(prod), _ -> prod(e)
    | _, just(prod) -> let_(eVar, @e, prod(eVar))
    | _, _ -> errorExpr("Can't negate " ++ showType(e.typerep))
    end;
}

synthesized attribute negProd::Maybe<UnaryOp>;
synthesized attribute negOverload::Maybe<(Expr ::= Name)>;
nonterminal Type with negProd, negOverload;
aspect default production
top::Type ::=
{
  top.negProd = nothing();
  top.negOverload = nothing();
}
production intType
top::Type ::=
{
  top.negProd = just(negInt);
}
production boolType
top::Type ::=
{
  top.negProd = just(negBool);
}

One can then implement negFoo as

production negFoo
top::Expr ::= n::Name
{
  forwards to call(name("neg_foo"), consExpr(var(n), nilExpr()));
}
production fooType
top::Type ::=
{
  top.negOverload = just(negFoo);
}

With this approach, one can write extensions that provide either sort of overload - an extension that overloads negation to do some arbitrary transformation (like negThing), or an extension that only cares about the value of the operand, and shares the original expression. One cannot however write an overload that shares the original expression in an arbitrary context, such as under a lambda expression, at least not without re-decorating the tree.

Silver development plans

Implementing this will be slightly painful as unique references are used in the Silver compiler; we can't remove them until dispatches are implemented and used in Silver.

751 is mostly unaffected, aside from the removal of uniqueness in decorated types. The Dec type constructor would now have kind InhSet -> * -> *, although there is also now less motivation to refactor things to have Dec as a proper type constructor. Making this change would still be in a future 0.6.0 release, after making the above-mentioned changes in 0.5.1, presumably.

796 would be unchanged, I think - this is still a dependency of #751 that could go in an earlier 0.5.1 release.

797 is probably obsoleted by this change, since Decorated types will no longer commonly appear in production signatures, so there is little reason to abbreviate this.

[^1]: Note that implies could have instead added an equation for a.subs - this would lead to the same problem.

[^2]: Well, mostly. I found a couple of bugs in the course of investigating this, where the analysis was being too permissive.

krame505 commented 5 months ago

One slight wrinkle is that there are some cases where we want the implementation productions to have different signatures. For example:

concrete production attributeDef
top::ProductionStmt ::= dl::DefLHS '.' attr::QNameAttrOccur '=' e::Expr ';'
{
  local problems :: [Message] = dl.errors ++ attr.errors ++ ...;
  forwards to
    if !dl.found || !attr.found || !null(problems)
    then errorAttributeDef(problems, dl, attr, @e)
    else attr.attrDcl.attrDefDispatcher(dl, attr, @e);
}

abstract production errorAttributeDef
top::ProductionStmt ::= msg::[Message] dl::Decorated! DefLHS  attr::Decorated! QNameAttrOccur  e::Expr
{ ... }

abstract production synthesizedAttributeDef
top::ProductionStmt ::= dl::Decorated! DefLHS  attr::Decorated! QNameAttrOccur  e::Expr
{ ... }

We can define dispatch AttributeDef = ProductionStmt ::= @dl::DefLHS @attr::QNameAttrOccur e::Expr. However errorAttributeDef also has a list of error messages, and thus can't implement AttributeDef. One could make ErrorAttributeDef its own dispatch signature, but that is a bit awkward.

I think the real answer is that there are two separate features we want here:

  1. The ability to use inherited attributes known to be supplied by the production that forwarded to this one, and
  2. The ability to dispatch to a production that does this sort of sharing in an extensible fashion.

So in cases where we are forwarding to a specific production, but want (1), we really want the ability to mark individual children of a production as shared, without needing to implement some dispatch signature. This actually shakes out fairly naturally in the implementation - all the checks needed for a standalone production with shared children (need better terminology here...) are the same as if it is implementing a dispatch signature.

concrete production attributeDef
top::ProductionStmt ::= dl::DefLHS '.' attr::QNameAttrOccur '=' e::Expr ';'
{
  local problems :: [Message] = dl.errors ++ attr.errors  ++ ...;
  forwards to
    if !dl.found || !attr.found || !null(problems)
    then errorAttributeDef(problems, dl, attr, @e)
    else attr.attrDcl.attrDefDispatcher(dl, attr, @e);
}

abstract production errorAttributeDef
top::ProductionStmt ::= msg::[Message] @dl::DefLHS  @attr::QNameAttrOccur  e::Expr
{ ... }

dispatch AttributeDef = ProductionStmt ::= @dl::DefLHS @attr::QNameAttrOccur e::Expr;
synthesized attribute attrDefDispatcher::AttributeDef occurs on AttrDclInfo;

abstract production synthesizedAttributeDef implements AttributeDef
top::ProductionStmt ::= @dl::DefLHS  @attr::QNameAttrOccur  e::Expr
{ ... }

abstract production inheritedAttributeDef implements AttributeDef
top::ProductionStmt ::= @dl::DefLHS  @attr::QNameAttrOccur  e::Expr
{ ... }

Something else I'm still not totally sure about is how regular decoration-site flow projections could work with dispatching. In the above example, this could allow errorAttributeDef to be eliminated, as this production only exists to decorate e and get its errors:

concrete production attributeDef
top::ProductionStmt ::= dl::DefLHS '.' attr::QNameAttrOccur '=' e::Expr ';'
{
  local problems :: [Message] = dl.errors ++ attr.errors ++ e.errors ++ ...;
  forward fwrd = attr.attrDcl.attrDefDispatcher(dl, attr, @e);
  forwards to
    if !dl.found || !attr.found || !null(problems)
    then errorProductionStmt(problems)
    else @fwrd;
}

Supporting this in the MWDA would be rather annoying, though - we would need to require that any projected inherited attributes depended on in the forwarding production are supplied in all implementation productions, and that the dependencies for these inherited equations in extension implementation productions don't exceed the dependencies for the same in any host-language implementation productions. This would also only work for host-language inherited attributes, as there is no way to enforce that independent extension implementation productions for some dispatch all supply some extension inherited attribute to a child.

This seems like more trouble than it's worth, IMO, as the projected attributes can't be used to determine the dispatch production (to avoid a circularity), so one can always re-write this in terms of two levels of forwarding, with an intermediate production like errorAttributeDef.

krame505 commented 5 months ago

Another issue - sometimes dispatch productions have additional children, and are partially applied where the overload is defined. For example

dispatch Reference = Expr ::= @q::QName;
synthesized attribute refDispatcher :: Reference occurs on ValueDclInfo;

abstract production lexicalLocalDcl
top::ValueDclInfo ::= fn::String ty::Type fi::Maybe<VertexType> fd::[FlowVertex] rs::[(String, UniqueRefSite)]
{
  top.refDispatcher = lexicalLocalReference(_, fi, fd, rs);  -- ???
}

abstract production lexicalLocalReference
top::Expr ::= @q::QName  fi::Maybe<VertexType>  fd::[FlowVertex]  rs::[(String, UniqueRefSite)]
{ ... }

Here it isn't possible for lexicalLocalReference to implement Reference, because the signature doesn't match.

It's a bit inelegant, but the best solution I can think of is to allow dispatch implementation productions to have extra children after the ones in the dispatch signature. Referring to the implementation production would then yield a function with parameters corresponding to the extra children. This would allow one to write

abstract production lexicalLocalDcl
top::ValueDclInfo ::= fn::String ty::Type fi::Maybe<VertexType> fd::[FlowVertex] rs::[(String, UniqueRefSite)]
{
  top.refDispatcher = lexicalLocalReference(fi, fd, rs);
}

abstract production lexicalLocalReference implements Reference
top::Expr ::= @q::QName  fi::Maybe<VertexType>  fd::[FlowVertex]  rs::[(String, UniqueRefSite)]
{ ... }

Here lexicalLocalReference now has type (Reference ::= Maybe<VertexType> [FlowVertex] [(String, UniqueRefSite)].

krame505 commented 5 months ago

Above I stated

Note that undecorates to is no longer needed in implementation productions, as these productions can only be constructed as the root of the forward tree. Thus the only way that one of these productions could be decorated twice is if the forward tree was explicitly un-decorated and re-decorated by the dispatching production. I don't know why someone would ever want to do this, but if they did then this shouldn't have an effect on the original forward tree, as shared children would have already been decorated with all the same equations.

It turns out there is another way that implementation productions can get undecorated - someone can write something like

production negThing implements UnaryOp
top::Expr ::= @e::Expr
{
  top.transform = new(top);
  forwards to ...;
}

Instead, we want to make them write top.transform = negOp(new(e));. To enforce this, I think we want to forbid taking a reference to the LHS of production with shared children.

krame505 commented 5 months ago

To enforce this, I think we want to forbid taking a reference to the LHS of production with shared children.

This raises an issue with default productions. Consider something like

aspect default production
top::Expr ::=
{
  top.transform = new(top);
}

abstract production negInt implements UnaryOp
top::Expr ::= @e::Expr
{
  -- No transform equation
}

Here we still end up undecorating a production with shared children, which could lead to hidden transitive deps as in the original example. So I think we need to check for any default equation that takes a reference to the LHS, that there is an explicit equation supplied for the attribute in every non-forwarding production that has a shared child.

Note that this still doesn't work for closed nonterminals, as e.g. transform and negThing could be introduced by independent extensions. Some more thought is needed here on the interaction between signature sharing and closed nonterminals; we might just want to forbid this form of sharing in closed NTs, since one does not typically use forwarding in these cases anyway.

krame505 commented 4 months ago

I was just discussing with @ericvanwyk the issue of how dispatch productions like negOp (and any productions forwarding to it with sharing) cannot make use of an extension inherited attribute bar supplied to its operand only in the implementation productions, even if all implementation productions supply bar. This is because independently-introduced implementation productions, like negThing in the above example, might not supply the attribute:

production negThing implements UnaryOp
top::Expr ::= @e::Expr
{
  e.subs = [("a", intLit(42))];
  forwards to e.inline;
}

After further consideration, I think the fix is actually to require that any extension implementation productions must forward to a host implementation production, sharing all shared children. This means that if all host-language productions implementing UnaryOp supply bar to e, then all extensions implementation productions would as well, so negOp and anything forwarding to it can rely on the bar equation always being present.

This does disallow productions like negThing, where we don't include the shared child in the forward. If we want to allow this sort of overload, we could add a host language production that permits this:

production unaryTransform implements UnaryOp
top::Expr ::= @e::Expr trans::Expr
{
  forwards to @trans;
}

If we want to make use of bar in negOp, then we also need to define bar on e in unaryTransform. This comes with corresponding trade-offs in what sorts of extension analyses one can write for expressions, as in some cases there may not be a sensible default to supply for some inherited attribute on e in unaryTransform.

A similar approach also makes achieving the "let binding overload" pattern a bit more elegant than described above. One could write a host production

production unaryBind implements UnaryOp
top::Expr ::= @e::Expr impl::(Expr ::= Name)
{
  local var::Name = freshName();
  forwards to let_(var, @e, impl(var));
}

This considerably simplifies negOp, as we don't need seperate overload attributes if we want to support both sorts of overloading. negOp only needs to supply whatever inherited attributes are needed to break the cycle of deciding what to forward to.

production negOp
top::Expr ::= e::Expr
{
  e.env = top.env;
  forwards to e.typerep.negProd(e);
}

An extension that wishes to overload negOp, like negFoo from an earlier example, can just forward to unaryBind:

production negFoo implements UnaryOp
top::Expr ::= @e::Expr
{
  forwards to unaryBind(e, \ n -> call(name("neg_foo"), consExpr(var(n), nilExpr())));
}
krame505 commented 4 months ago

To enforce this, I think we want to forbid taking a reference to the LHS of production with shared children.

Actually, there's a much simpler solution to preventing dispatch forward trees with shared children from being duplicated - just make undecorating an implementation production return (the undecoration of) the tree that forwarded to it. This is essentially the same semantics that we had with undecorates to previously, except that now we always know that a tree rooted in an implementation production is the forward of something, so there is no need to specify this explicitly.

This also means that the above restrictions related to default equations and closed nonterminals are not needed.