Open LPeter1997 opened 3 years ago
I am not entirely convinced by this, and this is because implicitly traversing the children only works well in some situations, but not all of them. This means the user needs to be capable to choose when it wants this behavior with enough expressivity to be able to express any visitor without more limitations than it would have while using plain C# (ideally).
To be more specific, two important and common situations make it undesirable for the user to desire the visitor to implicitly visit the children of the visited nodes, or make the implementation impossible if we consider the desired expressivity:
1) The visitor methods do not return void
. ln this case, the user will want to write an expression that gathers the visit values (why specificy a return value otherwise?), and this necessarily implies making the calls inside the function controlled by the user, as part of the behavior. Example:
public partial class PrettyPrinterVisitor {
//...
public string Visit(Expression.Binary expr) => $"(binary: {Visit(expr.Left)} {expr.Op} {Visit(expr.Right)})";
}
Here, traversing implicitly after (or before) this function is not adequate: the user already indicates what children to visit and how, in a natural way that allows for efficient combination of the results.
2) I feel like in many passes the user will be searching for some subtrees in particular. If we're in a function node, maybe we only need the name and then we want to bail. In how many cases does the user wish to always visit all the children of every node visited?
Furthermore, we risk introducing silent performance degradations, as if the user naturally and intuitively specificies they desire for the children to be visited (as is natural in a visitor and expected in a recursive function) some children (or all of them) would be visited twice.
So, one thing clear, we need the user to specific explicitly they want us to visit the children of a node for them, like this:
// Function definitions are order-independent, define them
protected void Phase1(FunctionDef def)
{
// TODO: Define function symbol
// We need to manually visit every subnode of FunctionDef here
VisitChildren();
}
This need already solves the order and other details, so no need for VisitChildren.After
or anything similar. I feel like this is overengineered. In my opinion, the only alternatives that don't harm expressivity and are intuitive are the following ones:
Base class: intuitive, but as you said, it introduces a super-class. I don't think it is that horrible, personally. It could be stated in the docs that you can optionally use an empty abstract base class to preserve the default functions alongside your custom ones, with the option to call the default from your one, like this:
protected void Phase1(FunctionDef def)
{
// TODO: Define function symbol
base.Visit(def); // Very intuitive: delegate behavior to `base` = follow the default behavior = visit the tree for me
}
This has the advantage of using the same implementation rules (implement the ones not written already), as the abstract base class is empty.
Question: what if the visit methods do not return void
? What does base.Visit(...)
return then?
DefaultBehavior
is better?), this introduces the problem that you will be always generating a copy of all method even if it is not necessary, unless you add an annotation (GenerateDefaultBehaviorMethodsAttribute
?) to control this, introducing even more complexity. But you would avoid inheritance, if that is important.In summary, I like the base class alternative as it adapts elegantly into the streamlined generation rules we already have: the methods not written are provided. A mechanism of the language itself, which is already implemented in CSC for us, allows the user to delegate control to the default behavior taking care of all the details, such as order and expressivity. However, this proposed solution is only well defined for the case where the return type is void
.
Is your feature request related to a problem? Please describe. I'm trying to formulate in my head a feature that would make it implicit to visit sub-nodes in the visitor. For example, let's say I'm writing a symbolic resolver, I'd need 2 phases:
Right now, I could write something like so:
The thing is, we duplicated the default visitation behavior 2 times (once in
Phase1(FunctionDef def)
and once inPhase2(VariableDef def)
)! And this could be worse for more complex trees. Currently, we could solve this by having a separate visitor base-class and just callbase.PhaseX(y)
at the appropriate places. I have some concerns with this (with various degree of importance):Describe the solution you'd like I'd somehow want to annotate, that I need the default behavior injected, even when I don't have a base class.
Describe alternatives you've considered I have thought a few alternatives.
Generate a base implicitly, if there isn't one. This only partially solves the problem. It will make it possible to fall back to the default behavior, but the call would still be explicit.
Generate a
[MethodName]_Fallback
for all methods with the default behavior. This slightly improves on the previous alternative by eliminating the need of base-classes at all. The explicitness still remains.Tell in
[SyntaxTreeVisitor(...)]
, that the default behavior should be called. We could even make multiple options, likeVisitChildren.Before
,VisitChildren.After
andVisitChildren.Explicit
that would annotate when to visit children (before the custom visitation, after it, or the user will call it explicitly). These could be overriden case-by-case, just like theReturnType
property. This would eliminate all explicitness. Note, that implementing this might be hard, error-prone, or straight up impossible. (see next section)Have an option to generate method calls to simple leaf functions, that are called inside the default-generated visitor methods. Ordering could be maintained with the above idea. An example usage would be:
For custom positioning of the leaf-call, we could still override the visitor methods themselves. For the trivial beginning and end cases, the idea from alternative 3 could be used.
Problem with the third alternative While the third alternative - at least to me - looks like the neatest and most comfortable option, I am not yet sure, if it's implementable. We would need to essentially inject an extra call in the user-defined functions, but Source Generators can't modify existing code. I have not yet found any sensible way to do this yet.
A short summary I feel like what I'm trying to do is trying to factor out the structural walk from the action we want to perform on the node. There might be a better way to do this, I might have just not figured it out yet. Many times I need to walk the tree, but I don't want to write the tree-walking part, as that seems like another responsibility, that does not belong to the thing I'm doing.