do we need parse tree modification routines?

parrt commented 10 years ago

If we allow patterns, we can also allow tree construction from input snippets. Should we add routines to replace parse tree nodes? Ter

sharwell commented 10 years ago

I believe this would be very valuable, but I'm not sure how to approach it yet (at the API level). Any ideas?

parrt commented 10 years ago

A simple fundamental replace child or replace node maybe then build on top with replace based upon xpath and/or tree pattern findall? basic replace node is:

t.getParent().setChild( t.getChildIndex(), u );

parrt commented 10 years ago

maybe we could use:

Trees.replace(t, u); // t.getParent().setChild( t.getChildIndex(), u );
Trees.replaceAll(t, u, "xpath", parser);
Trees.replaceAll(t, u, "xpath", "int <ID> = 0;", pattern, parser); // with tree pattern

We need a method to create a parse tree from concrete syntax like Parser.compileParseTreePattern() but more like

Parser.createParseTree(RULE_decl, "int x = 0;"); // not a pattern but creates parse tree.

parrt commented 10 years ago

Ok, we'll leave for 4.3 after we study. We have to discuss how replacement affects token/text offsets etc...

athafoud commented 8 years ago

Any update on this?

Simply altering the children of the context does not seem to work. The internal Start / Stop Indexes need to be changed also.

I am trying something like this

// Swap children
IParseTree oldClild_2 = context.children[2];
IParseTree oldClild_4 = context.children[4];

context.children[2] = oldClild_4;
context.children[4] = oldClild_2;

OR

context.children.RemoveAt(2);
context.children.Insert(2, oldClild_4);
context.children.RemoveAt(4);
context.children.Insert(4, oldClild_2);

However when I am re-creating the input using a TokenStreamRewriter, the change in the tree is ignored.

parrt commented 8 years ago

Hi. no progress. sorry. yes, all relevant properties of nodes would need to change. we just didn't design them to be easily reorganized. we tend to create new trees as needed.

parrt commented 7 years ago

Sam and I discussed Oct 16. @sharwell noted:

Another interesting thing in Roslyn: the parse trees created by the compiler don't have pointers to their parent nodes. When the consumer calls the equivalent of node.getChild(3), a wrapper node is created and returned to the caller that uses the original node for most operations but does have a pointer to the parent. That way the API has getChild and getParent, but you don't lose the ability to share subtrees following a transformation (if you don't do this, then attempting to share subtrees in a transformation will result in some nodes taking to an old root node, and other nodes taking you to a new root node when you follow the chain of getParent calls)

I replied:

yeah, i’m thinking immutability and something like this reduces headaches and bugs like cycles. reeeeally hard to find those bugs. hmm...otoh, we already have that linked kinda baked in.

Also interesting link to roslyn SyntaxEditor.cs

millergarym commented 7 years ago

@parrt here is an alternative.

I've always found recursively defined trees much harder to use than trees where the structure and nodes are separated. I.e. implement a tree using an adjacency list representation. This way when rewriting a tree you just construct a new tree with the existing nodes.

Below is the Go implementation I've been using. The only curly bit with Go is the map key needs to be a pointer (as there is no hashcode and equals on object), and this can't be expressed in the type system (hence the reflection).

A company I worked at used a java implementation for years, but were never been able to convince outsiders of the benefits. People seems to be dissuaded by the extra argument in the API.

// empty interface aka any type
type INode interface { }

type Tree interface {
    Root() INode

    Add(parent INode, child INode) (success bool, otherParent INode)

    Children(parent INode) []INode

    Parent(child INode) (parent INode)

    Walk(CallbackFunc)
    ...
    TreeString() string
}

// The internal structure of the centralised tree
type ctree struct {
    // Map of parent to slice of children
    p2 map[INode][]INode
    // Map of child to parent
    c2 map[INode]INode
    // Hidden root Inode of the tree
    root    INode
}


func (t *ctree) Add(p INode, c INode) (bool, INode) {
    if reflect.TypeOf(c).Kind() != reflect.Ptr {
        panic("Child Inode is not a pointer")
    }
    if reflect.TypeOf(p).Kind() != reflect.Ptr {
        panic("Parent Inode is not a pointer")
    }
    if orginal, exist := t.c2[c]; exist {
        return false, orginal
    }
    t.p2[p] = append(t.p2[p], c)
    t.c2[c] = p
    return true, nil
}

parrt commented 7 years ago

@millergarym Hiya Gary. Yeah, the unfortunate reality is that the existing trees have hardcoded the topology in the nodes themselves. I looked at extracting the parent pointer but it is too baked into the parsing algorithm. Remember that the parse tree nodes created are not just an artifact of parsing; they are in fact an integral part of the parse for ANTLR, since we do grammar analysis at runtime. The immutable tree mechanism I'm proposing at least does not disturb other trees and it shares maximally. A change at a node forces a copy up the spine to the root but shares everything else.

ghost commented 7 years ago

A change at a node forces a copy up the spine to the root but shares everything else.

Wouldn't a persistent tree solve these issues? Maybe you can yank one out of Clojure.

parrt commented 7 years ago

I think that is how immutable trees work :) Maybe Clojure has fat nodes instead but same idea.

ymxdgyx commented 6 years ago

exsit interface remove or add child node?

ofdata commented 2 years ago

Any update on this ?

parrt commented 2 years ago

Hi. No update I am afraid. Probably not something that will be coming in the near future.

antlr / antlr4

do we need parse tree modification routines? #369