Support for trees and traversals

gissuebot commented 10 years ago

Original issue created by jvdneste on 2009-05-25 at 02:34 PM

As a suggestion; I find it quite useful!

/* * returns a depth-first iterator for a tree of T * @param f a function that returns the children of node of the tree * @param root The root element of the tree * @return / public static <T> Iterable<T> walk(final T root, final Function<? super T, ? extends Iterable<? extends T>> getChildren) { final Function<T, Iterable<? extends T>> walkFunc = new Function<T, Iterable<? extends T>>() { public Iterable<? extends T> apply(final T from) { return walk(from, getChildren, this); } }; return walk(root, getChildren, walkFunc); }

private static <T> Iterable<T> walk(final T root, final Function<? super T, ? extends Iterable<? extends T>> getChildren, final Function<T, Iterable<? extends T>> walkFunction) { return Iterables.concat( Collections.singleton(root), Iterables.concat( Iterables.transform( getChildren.apply(root), walkFunction))); }

And a shortcut in case the nodes implement Iterable:

@SuppressWarnings("unchecked") public static <T extends Iterable<T>> Iterable<T> walk(final T root) { return walk(root, Functions.identity(), WalkIterableFunction.INSTANCE); }

@SuppressWarnings("unchecked") private static enum WalkIterableFunction implements Function { INSTANCE;

public Object apply(final Object from) { return walk(from, Functions.identity(), this); } };

gissuebot commented 10 years ago

Original comment posted by zeitlinger on 2009-06-10 at 07:46 AM

Maybe it would be useful to have a Tree interface

Tree<T> { T getParent(); Collection<T> getChildren();

}

also see issue 187

gissuebot commented 10 years ago

Original comment posted by gpampara on 2009-07-22 at 04:02 PM

Tree's are inherently recursive by definition. Each child of a node a Tree itself, with or without children nodes. As a result, the above interface wouldn't really be expressive enough (especially as the "key" value for the is not considered).

The next point to consider is that tree traversal can consist of pre-order, in-order and post-order traversals, not to mention breadth-first.

gissuebot commented 10 years ago

Original comment posted by kevinb9n on 2009-09-17 at 05:41 PM

I have always very much wanted to have a standard TreeNode<T> interface. I will give it more thought when I can. I think that is a far better approach than the Function- based one.

The interface might end up needing to be something like this

interface TreeNode<T extends TreeNode<T, C>, C extends Iterable<T>> { // optional; returns null if root TreeNode<T, C> parent();

C children();

}

This way, different types of trees can use different types of collections (list, set, multiset, what-have-you). It's a little confusing to grasp at first though. We'll see what shakes out.

Status: Accepted Labels: post-1.0

gissuebot commented 10 years ago

Original comment posted by kevinb9n on 2009-09-17 at 05:45 PM

(No comment entered for this change.)

Labels: -post-1.0, Milestone-Post1.0

gissuebot commented 10 years ago

Original comment posted by kevinb9n on 2009-09-17 at 05:57 PM

(No comment entered for this change.)

Labels: Type-Enhancement

gissuebot commented 10 years ago

Original comment posted by ray.a.conner on 2009-10-02 at 11:35 PM

You don't really need a defined data structure to implement the most common traversals (breadth, pre/post- order, and full depth-first - tracking start/finish times), as the original poster points out. I would implement it differently (see below), but the principle holds.

That's not to say there isn't a good use case for a tree data structure, just that those traversals don't require it. If adding both things, it would be a good idea for some static utils methods that simplify the common use case when you do want to traverse an explicit tree structure instead of the functional way. There's no reason the API can't be expressive enough for both strategies.

Additionally, if going down that data structure road, you may seriously want to consider keeping it consistent with the Collections model of data structures being containers for user-supplied objects. In that light, I think TreeNode is inappropriate for the same reason that ListNode would be. You don't ask the elements of a List for previous/next elements, you ask the List data structure itself.

One drawback to this is that it can be very difficult to maintain the invariant of being a Tree while building one. My mutable structures tend to be either general graphs or forests, with an ability to get tree views (tree rooted at a given forest node, e.g.).

As an aside, I'm in the process of a complete overhaul of my own graph library, having been inspired by your excellent work with GCL - in particular the realization of how much readability can improve an API (and having written it pre-generics). This is a working concept for how traversals would look (not compilable, poorly formatted, no method bodies - keeping it terse). For everyone out there, feel free to use any ideas you like.

public interface Step< E > { E from(); E to(); }

public interface PruningIterator< E > extends Iterator< E > { // Don't explore beyond the last object returned by next() public void prune(); }

public class Steps { // convenience methods, so users don't have to explicitly create Steps

public static Iterator< Step< E > > wrap ( E from, Iterator< E > toIterator )

public static Function< E, Iterator< Step< E > > > wrap ( Function< E, Iterator< E > > adjacencyFunction ) }

public class Traversals { public static PruningIterator< Step< E > > preOrder ( E start, Function< E, Iterator< Step< E > > > adjacencyFunction )

// can't prune() a post-order traversal public static Iterator< Step< E > > postOrder ( E start, Function< E, Iterator< Step< E > > > adjacencyFunction )

public static Iterator< Step< E > > topologicalSort ( Iterator< E > nodeIterator, Function< E, Iterator< Step< E > > > adjacencyFunction )

// etc. plus similar methods taking just Function< E, Iterator< E > > and doing the wrapping for you. }

All of these can be implemented using stacks, no recursion, without too much difficulty. The primary reason for the Step interface is that you often want to know from which node you traversed, and this information would not otherwise be available.

I can use this same API to traverse the file system, up a tree instead of down, or any other graph-like structure, without creating an explicit data structure to do it. All I need to define is that function which takes a node and returns an iterator of adjacent nodes. Not creating an actual data structure (like a Collection) is useful when the backing data is huge, and perhaps not even queryable in a global way. Having to implement things like size() can really be a pain sometimes, and at work we've often just decided to throw an UnsupportedOperationException instead.

At any rate, a work in progress, but that's the general flavor. The edges in my data structures also contain user-defined objects, so I need to allow for that without polluting the rest of the API. Still working through the implications of that.

gissuebot commented 10 years ago

Original comment posted by jvdneste on 2009-12-10 at 05:34 PM

I recently needed a version of my walk iterable from above with a possibility to get the path of a node to the root. So I needed a TreeNode interface, without needing the children (although it could easily be added).

I'm attaching the relevant bits (thrown together so it would compile, but i did not spend too much time on it so please excuse for errors). Scroll down for an example in printWalkWithParentsTest()

gissuebot commented 10 years ago

Original comment posted by kevinb@google.com on 2010-07-30 at 03:53 AM

(No comment entered for this change.)

Labels: -Milestone-Post1.0

gissuebot commented 10 years ago

Original comment posted by kevinb@google.com on 2011-07-13 at 06:18 PM

(No comment entered for this change.)

Status: Triaged

gissuebot commented 10 years ago

Original comment posted by kevinb@google.com on 2011-07-16 at 08:37 PM

(No comment entered for this change.)

Status: Accepted

gissuebot commented 10 years ago

Original comment posted by wasserman.louis on 2011-09-01 at 05:35 PM

(No comment entered for this change.)

Owner: wasserman.louis

gissuebot commented 10 years ago

Original comment posted by kevinb@google.com on 2011-09-02 at 11:27 PM

Not sure if I pointed out that we have stopped thinking about the "TreeNode interface" approach.

I believe it is common to come into contact with objects that are tree-structured, but which you don't control, so you can't get to implement the TreeNode interface you want them to. Examples: java.io.File, or the containment tree of Java program elements (i.e. AnnotatedElement).

If we provided a TreeNode interface you'd have to create an adapter instance for every single node. And your adapter would, every time it grabs its list of children, have to wrap each of them, and then you'd start to want to hold on to these adapters and in the end, you've just created a whole entire parallel tree is all you've done.

This sucks. What Louis and I propose is this:

interface TreeViewer<T> { Iterable<T> children(T node); }

abstract class BinaryTreeViewer implements TreeViewer<T> { Optional<T> leftChild(); Optional<T> rightChild(); // implements children() to return those that are "present" }

class Trees { static <T> Iterable<T> breadthFirstView(T root, TreeViewer<T> viewer) {...} static <T> Iterable<T> preOrderView(T root, TreeViewer<T> viewer) {...} static <T> Iterable<T> postOrderView(T root, TreeViewer<T> viewer) {...} static <T> Iterable<T> inOrderView(T root, BinaryTreeViewer<T> viewer) {...}

// perhaps also with void "visit" versions as well

}

Any thoughts out there?

gissuebot commented 10 years ago

Original comment posted by ray.a.conner on 2011-09-03 at 09:06 PM

I believe this is the right direction. I still maintain that a more general interface will be more useful. I don't mean more general than TreeViewer<T> in syntax, but in implied semantics. The language "children" and "TreeViewer" have clear implications that I think are unnecessary. It's just a name change from that to this:

interface AdjacencyFunction<T> { Iterable<T> adjacent( T node ); }

And that interface works just as well for breadthFirst/preOrder/postOrder traversals. The util class then becomes "Traversals". The only difference is that you would have to clearly document the traversal behavior when given a Adjacency that produces cycles, although you should probably do that anyway (think soft links in the file system).

gissuebot commented 10 years ago

Original comment posted by ray.a.conner on 2011-09-03 at 09:32 PM

Wasn't quite done, hit the "Save changes" button by accident.

With class name of Traversals or similar, the method names can be shortened. It just becomes Traversals.breadthFirst( T, Function< T, Iterable< T > > ).

Note that both "children" and "parent" are just specific adjacency relationships. Pre-order for the parent relationship goes from the given node to the root, post-order goes from the root to the given node. You get a lot of flexibility like this for free.

Given that this gets more graph-like, static factory methods for walks (follows the first node returned from the adjacency Iterable, ideal for parent traversal) and a topological sort are appropriate.

This doesn't directly address the special case of a binary tree, and in-order traversal. We've never had that use case where I work, so I honestly haven't given much thought to how that fits in to this strategy.

I haven't found a clean way to handle cycles within the library. I ended up with a factory method which turns a given adjacency function into an acyclic one by keeping track of what it has hit, but I really don't like that solution because it turns a stateless function into a stateful one that is not reusable. There must be a better way without resorting to adding distinct acyclic traversal factory methods, but I haven't found it.

gissuebot commented 10 years ago

Original comment posted by ray.a.conner on 2011-09-03 at 09:47 PM

Oh yeah, a couple more things. Sorry, should have composed my thoughts first. Using just Function<T, Iterable<T>> allows using the rest of Guava's Functions, Predicates, etc. with these adjacency functions.

We've also found it useful to think of traversals as, themselves, adjacency functions. You can then get some interesting behavior by composing them. For example (forgive the names, they need changing):

class SomeUtilClass { static <T> Function<T, Iterable<T>> breadthFirst( Function<T, Iterable<T>> ); // etc. }

That method accepts one adjacency function and produces another. The resulting function will accept a node and return, wrt the argument function, all reachable nodes in a breadth-first order.

That and some filtering allows you to simply do things like "Find all references to all declared variables" when working with a linked AST. Or complex traversals of social graphs.

gissuebot commented 10 years ago

Original comment posted by wasserman.louis on 2011-09-05 at 11:01 PM

I'm pretty strongly against supporting graphs or anything that is not strictly a tree. That's trying to "do too much."

At the moment, we have a nice, simple abstraction that matches a clearly defined set of use cases, and does it extremely well. Guava isn't currently in the business of manipulating graphs, and changing that is beyond the scope of this issue.

gissuebot commented 10 years ago

Original comment posted by ray.a.conner on 2011-09-06 at 01:44 AM

Point taken, and I don't really disagree. Just saying that there isn't a logical code change involved, only names. The code for breadth- and depth-first traversals is exactly the same. But again, binary trees are another thing entirely.

Sorry for going off the deep end there. I just got a little excited that the brain trust behind Guava would be brought to bear on abstractions I deal with daily. It would be nice to have something 100x better tested than what I've been able to cobble together during the calm moments between putting out fires. I never seem to have enough time to go back and fix the abstractions, no sellable business case for it.

gissuebot commented 10 years ago

Original comment posted by jim.andreou on 2011-09-06 at 02:13 AM

There is such a thing as "trying to do too much" and a thing as "too simple", I'm not sure which is which here. Even a filesystem is a tree only if you ignore symlinks/treat as leaves.

I'm surprized that we have a "clearly defined set of use cases" here (that's a rarity!), is it in copy-pastable form?

PS: Instead of Iterable<T>, it is richer to return Iterable<Stack<T>> (for some persistent Stack), for basically the same cost, and which can be viewed as Iterable<T>, but that would indeed be in the "too much" direction.

gissuebot commented 10 years ago

Original comment posted by wasserman.louis on 2011-09-06 at 02:25 AM

My goal with this approach was to be dumb, simple, and utterly straightforward. I think I've accomplished this goal.

(And while filesystems are convoluted and complicated, XML documents might be a good example of genuine trees.)

gissuebot commented 10 years ago

Original comment posted by jim.andreou on 2011-09-06 at 04:06 AM

Heh, I considered mentioning XML but thought that you wouldn't be interested.

An XML document is only a tree if you ignore ID/IDREF(S) or treat them as leaves.

gissuebot commented 10 years ago

Original comment posted by Maaartinus on 2011-09-14 at 04:00 PM

Maybe this is irrelevant because of some previous posting, but for me the following worked well:

public interface TreeClassAdapter<T> { Iterable<T> children(T node); T parent(T node); Class<T> nodeClass(); boolean isLeaf(T node); // useful to distinguish between nodes which could possible have children but happen to have none (like empty directories) and proper leaves (like files), cf. javax.swing.tree.TreeModel }

This way one can easily organize arbitrary objects in trees, without having to create a wrapper for each object (which is quite a PITA, since you need to switch back and forth often). Adapting objects to this interface is easy, for example for java.io.File it takes just a few lines.

gissuebot commented 10 years ago

Original comment posted by wasserman.louis on 2011-09-14 at 04:10 PM

I have to admit that my reaction to that is, ewwwwwwwwwww.

I think the current interface serves most conceivable needs without overspecifying the tree format.

gissuebot commented 10 years ago

Original comment posted by jvdneste on 2011-09-14 at 05:10 PM

Interface TreeViewer is essentially just a Function<T,Iterable<T>>

gissuebot commented 10 years ago

Original comment posted by wasserman.louis on 2011-09-14 at 05:49 PM

Correct! But calling it that would make the API much less clear and more awkward to use. Doing it properly, like this, makes it more natural and extensible -- for example, the BinaryTreeViewer abstract class would be difficult or impossible to provide.

gissuebot commented 10 years ago

Original comment posted by jvdneste on 2011-09-14 at 10:38 PM

Fair enough. I understand the benefits of a specific interface. I'm just easily tempted to reduce plumbing because I know I might need a Function that does the same sooner or later anyhow. (For instance to do a concat(transform(...)) in a different context)

Also, occasionally, I find it useful/necessary to build a parallel tree.

gissuebot commented 10 years ago

Original comment posted by wasserman.louis on 2011-09-15 at 02:07 PM

Explain please what you mean by "parallel tree"? (I can imagine several different meanings, so I'm interested in specific use cases.)

gissuebot commented 10 years ago

Original comment posted by jvdneste on 2011-09-15 at 03:56 PM

For instance, you have a tree of nodes in which each node only has references to its children. What if you need to process this tree in a way you need parent references? You could map every node of the tree to a new object with a reference to a parent of the same type and a reference to the original node. That would be a parallel tree.

I have needed this in the past and wrote the following code (which you might reasonably call ugly/bad code):

clarification: walk is preorder traversal; map is transform; mapcat is concat(transform(...))

public static <T> Iterable<T> walk(final T root,
      final Function<? super T, ? extends Iterable<? extends T>> fnGetChildren) {
   final Function<T, Iterable<? extends T>> fnWalk = new Function<T, Iterable<? extends T>>() {
      @Override
      public Iterable<? extends T> apply(final T from) {
         return walk(from, fnGetChildren, this);
      }
   };
   return walk(root, fnGetChildren, fnWalk);
}

private static <T> Iterable<T> walk(final T root,
      final Function<? super T, ? extends Iterable<? extends T>> fnGetChildren,
      final Function<? super T, ? extends Iterable<? extends T>> fnWalk) {
   return concat(singleton(root), mapcat(fnWalk, fnGetChildren.apply(root)));
}

public interface TreeNode<T> {
   public TreeNode<T> parent();
   public T item();
}

public static <T, TreeNodeType extends TreeNode<T>> Iterable<TreeNodeType> walkWithParents(final T root,
      final Function<? super T, ? extends Iterable<? extends T>> fnGetChildren,
      final Function2<? super TreeNodeType, ? super T, ? extends TreeNodeType> nodeFactory) {
   final Function<TreeNodeType, Iterable<TreeNodeType>> fnPairChildren = new Function<TreeNodeType, Iterable<TreeNodeType>>() {
      @Override
      public Iterable<TreeNodeType> apply(final TreeNodeType parent) {
         return map(Functions2.bind1st(nodeFactory, parent), fnGetChildren.apply(parent.item()));
      }
   };
   final Function<TreeNodeType, Iterable<TreeNodeType>> fnWalk = new Function<TreeNodeType, Iterable<TreeNodeType>>() {
      @Override
      public Iterable<TreeNodeType> apply(final TreeNodeType from) {
         return walk(from, fnPairChildren, this);
      }
   };
   return walk(nodeFactory.apply(null, root), fnPairChildren, fnWalk);
}

gissuebot commented 10 years ago

Original comment posted by christoph.hoesler on 2011-10-05 at 04:27 PM

Implementing a traversal based on the comments here I stumbled upon something you might want to consider. I needed to keep track of the current depth during the traversal, so I used an Iterator based (rather that Iterable based) approach using a special TreeIterator:

public static <T> TreeIterator<T> postOrderView(final T root, final Function<? super T, ? extends Iterator<? extends T>> childrenFunction);

public interface TreeIterator<T> extends Iterator<T> { int depth(); }

gissuebot commented 10 years ago

Original comment posted by wasserman.louis on 2011-10-05 at 05:26 PM

Interesting. I'm not sure that's a sufficiently common use case to get special treatment, but I'll look into it.

gissuebot commented 10 years ago

Original comment posted by jim.andreou on 2011-10-06 at 12:19 AM

The most general approach is providing an iterator of paths (from root to each node), which could be viewed as an iterator of values (much like Jared's LinkedNodeList does). The best representation for these paths are persistent stacks, the whole cost is one object per node, and they are immutable. Just in case anyone really wants to generalize

gissuebot commented 10 years ago

Original comment posted by wasserman.louis on 2011-10-06 at 12:52 AM

That could work.

gissuebot commented 10 years ago

Original comment posted by wasserman.louis on 2011-11-15 at 03:50 PM

Issue #703 has been merged into this issue.

gissuebot commented 10 years ago

Original comment posted by fry@google.com on 2011-12-10 at 03:38 PM

(No comment entered for this change.)

Labels: Package-Collect

gissuebot commented 10 years ago

Original comment posted by anjan.bacchu on 2012-01-18 at 04:30 AM

When can we expect TreeViewer and other Tree related Interfaces/Classes to be available ? I was browsing the 11.0.1 javadocs and could not locate these. Thank you,

gissuebot commented 10 years ago

Original comment posted by wasserman.louis on 2012-01-19 at 01:38 AM

I believe we're testing some ideas internally, but this is just such a broadly applicable and common need that it's worth taking lots of time to experiment with different designs.

gissuebot commented 10 years ago

Original comment posted by kevinb@google.com on 2012-05-30 at 07:43 PM

(No comment entered for this change.)

Labels: -Type-Enhancement, Type-Addition

gissuebot commented 10 years ago

Original comment posted by se.solovyev on 2013-02-02 at 11:36 AM

If you want to use Tree data structure and don't want to wait until it will be added to Guava I suggest using my implementation which can be found on github: https://github.com/serso/common/tree/master/collections-tree/src/main/java/org/solovyev/common/collections/tree

The interfaces are self explaining. One can start from the Trees.java class.

gissuebot commented 10 years ago

Original comment posted by lowasser@google.com on 2013-03-09 at 06:13 PM

Initial versions of TreeTraverser and BinaryTreeTraverser are slated for 15.0: https://github.com/google/guava/blob/master/guava/src/com/google/common/collect/TreeTraverser.java

Status: Started Owner: lowasser@google.com Labels: Milestone-Release15

gissuebot commented 10 years ago

Original comment posted by knut.wannheden on 2013-03-09 at 06:46 PM

FWIW, I think something like the EMF TreeIterator (see http://download.eclipse.org/modeling/emf/emf/javadoc/2.4.3/org/eclipse/emf/common/util/TreeIterator.html) which is basically an Iterator with an additional prune() method is indeed very useful.

gissuebot commented 10 years ago

Original comment posted by ray.a.conner on 2013-03-12 at 10:14 PM

If adding an Iterator subinterface with a prune() method, then I suggest naming the interface more generically than "TreeIterator". Pruning is simply saying "I don't want to traverse beyond the last node any further". I haven't delved into the committed code, but I suspect that the traversal will work for any DAG just fine. Actually, it looks like pre-order and breadth-first will work for any graph, cyclic or not, provided you don't continue to call next() ad infinitum.

gissuebot commented 10 years ago

Original comment posted by ray.a.conner on 2013-03-13 at 04:10 AM

My model has changed from the last time I responded here. This isn't a suggestion, merely throwing it out there. I have edges (user-defined, weights e.g.) as well, but I've elided that here and changed a few things as a result. This is minimal pseudo-code, for terseness.

interface Walk<T> {
    T getFrom();  // convenience, first element of path
    T getTo()     // convenience, last element of path
    Iterable<T> getPath();
}

interface Traverser<T> extends Function<T, Iterable<Walk<T>>>

class Traversers {
    static <T> Traverser<T> empty()
    static <T> Traverser<T> preOrder( Traverser<T> adjacency )
    // similar postOrder, breadthFirst, leaves, a few others
}

In my use case, I need the entire path taken to reach each visited vertex. Storage isn't bad if you're clever (similar to lisp cons cell, as Dimitris Andreou mentioned). If you don't need that, replace Walk<T> with just T.

Traverser<T> is both how one defines an adjacency function (typically user-defined), and how pre/post/... traversals are defined. An adjacency function simply returns very short walks. In the language of your implementation, both children(T) and preOrderTraveral(T) return Iterable<T>.

I'm pretty sure your reaction will be that this over-generalizes, and I can't dispute that. It's in my nature. I could certainly see not using Function here, and using a specific method name for Traverser instead.

gissuebot commented 10 years ago

Original comment posted by wasserman.louis on 2013-03-13 at 07:11 PM

@ray: have you looked at the current implementation linked above?

(I claim that it's general enough to let you optionally keep track of the paths in the tree: you just write a TreeTraverser<Walk<T>> instead of a TreeTraverser<T>.)

I'm not sure what, if any, features we'll include to prune the traversal, but we might do something along the lines of doNotExpand(Predicate<T>).

gissuebot commented 10 years ago

Original comment posted by ray.a.conner on 2013-03-14 at 02:26 AM

@Louis

Thanks for pointing out the potential to use a TreeTraverser<Walk<T>>, I had not seen that possibility. That would also work with my actual Walk including edge objects. I'll have to take a closer look in light of that to see where the gaps are between the two, as I'd rather use Guava than my own. One difference is that a user would need to supply the root node as a trivial Walk, which is slightly cumbersome. I still like the fact that adjacency (children) and complex traversals are the same kind of thing, but I have to reexamine my use cases to see if that feature is actually useful.

The other kinds of traversals I have needed in the past are: depth-first (both out and back, each node visited twice), topological sort (taking an Iterable of nodes, not just a single root node), and leaves. To be really useful, a depth-first iterator has to be able to tell you which direction you're traversing.

Re: pruning. In my experience, whether or not to prune usually depends on your program state at that point in the traversal. For example, you might prune if exploration beyond that node is pointless because you've already found a lower cost result.

gissuebot commented 10 years ago

Original comment posted by lowasser@google.com on 2013-03-14 at 04:03 PM

That is true, but we're not trying to be the be all and end all of tree traversals: if we make a wide variety of use cases simpler with a minimum of API complexity, we've done our job.

In particular, I don't want to track the full path of the traversal in every case because that would incur prohibitive memory costs in cases where the tree is implicit or not held in memory all at once, and I am uncertain that adding a whole new iteration API to deal with pruning helps enough additional users to merit the added complexity.

gissuebot commented 10 years ago

Original comment posted by cgdecker on 2013-03-14 at 06:40 PM

To me, the useful thing about TreeTraverser is being able to iterate over the elements of the tree as if it were flat, when that's what you want to do. If you need to deal with the tree in a more tree-like way, do that. But I don't really like the idea of trying to combine treating the tree as flat and as a tree by introducing something cumbersome like prune() and requiring you to iterate with a special kind of Iterator, etc. The idea of using a Predicate to determine whether or not the subtree of a node should be iterated seems ok to me, but nothing more than that.

gissuebot commented 10 years ago

Original comment posted by ray.a.conner on 2013-03-15 at 08:02 PM

re: pruning...

Extending Iterator is ugly. I've done that in the past, and it's not pretty, and even uglier with Iterable. Returning an Iterable<Something<T>>, where Something has a prune() method, would be better. Not good, just better. But then iteration isn't nearly as clean for the common use case. That argues for more methods, pruning vs. normal pre-order and breadth-first. And then your API is uglier. I need pruning, and I'm not even sure that's worth it. I can always write my own using a given TreeTraverser.

If using a stateless Predicate for limited pruning, I would suggest creating a new TreeTraverser, similar to Joiner.skipNulls(). I think that's what you suggested with doNotExpand(Predicate), but I'm not sure.

gissuebot commented 10 years ago

Original comment posted by cgdecker on 2013-03-15 at 08:12 PM

If using a stateless Predicate for limited pruning, I would suggest creating a new TreeTraverser, similar to Joiner.skipNulls(). I think that's what you suggested with doNotExpand(Predicate), but I'm not sure.

Yes, exactly.

gissuebot commented 10 years ago

Original comment posted by ray.a.conner on 2013-03-15 at 08:19 PM

re: stuff other than pruning...

I didn't mean to imply that TreeTraverser should keep track of paths from root. I agree that TreeTraverser<Walk<T>> would do that, and it's just my problem to define children(Walk<T>) correctly.

I wasn't really pushing those other use cases (topo sort, ...), just bringing them up because your project has a good history of trying to solve other people's problems when it makes sense for Guava. Consider it a disclosure of how one of your users uses traversals. Any of these can be added by users in their own code easily enough.

gissuebot commented 10 years ago

Original comment posted by Lichtenberger.Johannes on 2013-03-20 at 11:26 PM

I think it could also be something along the lines:

https://github.com/JohannesLichtenberger/sirix/blob/master/bundles/sirix-core/src/main/java/org/sirix/axis/visitor/VisitorDescendantAxis.java

and the package:

https://github.com/JohannesLichtenberger/sirix/tree/master/bundles/sirix-core/src/main/java/org/sirix/api/visitor

which is modeled after the File walker Java7 API and might be really useful (however I don't know why the github sourcecode formatting, namely the tab-size is so ugly).

Probably the whole package (plus sub-packages) https://github.com/JohannesLichtenberger/sirix/tree/master/bundles/sirix-core/src/main/java/org/sirix/axis

is also interesting. The filters implement Predicate since some time ago and I've used some code from AbstractIterator of Guava to implement "AbstractAxis". Besides other things I'm currently thinking about how to integrate custom user-specified objects (currently it's used to store/query XML documents (and soon JSON) based on different versioning strategies). However, I think it's very simple to add custom types, probably implementing a simple Write- and Read-transaction which modifies/reads custom node-types. The user has to provide it's own serialization/deserialization if the in-memory store isn't selected. I'll likely implement insertion/update/delete/move-methods for a generic type parameter (I think the workaround with JAXB annotations is rather ugly).

gissuebot commented 10 years ago

Original comment posted by cgdecker@google.com on 2013-07-03 at 05:55 PM

Since TreeTraverser is in for 15.0, I'm gonna mark this as fixed. I think any specific enhancements or additions should be requested in separate issues.

Status: Fixed

google / guava

Support for trees and traversals #174