Support for trees and traversals

GoogleCodeExporter commented 9 years ago


As a suggestion; I find it quite useful!

/**
 * returns a depth-first iterator for a tree of T
 * @param f a function that returns the children of node of the tree
 * @param root The root element of the tree
 * @return
 */
public static <T> Iterable<T> walk(final T root,
    final Function<? super T, ? extends Iterable<? extends T>> getChildren) 
{
  final Function<T, Iterable<? extends T>> walkFunc =
      new Function<T, Iterable<? extends T>>() {
    public Iterable<? extends T> apply(final T from) {
      return walk(from, getChildren, this);
    }
  };
  return walk(root, getChildren, walkFunc);
}

private static <T> Iterable<T> walk(final T root,
  final Function<? super T, ? extends Iterable<? extends T>> getChildren,
    final Function<T, Iterable<? extends T>> walkFunction) {
  return Iterables.concat(
    Collections.singleton(root),
    Iterables.concat(
      Iterables.transform(
        getChildren.apply(root),
        walkFunction)));
}

And a shortcut in case the nodes implement Iterable:

@SuppressWarnings("unchecked")
public static <T extends Iterable<T>> Iterable<T> walk(final T root) {
  return walk(root, Functions.identity(), WalkIterableFunction.INSTANCE);
}

@SuppressWarnings("unchecked")
private static enum WalkIterableFunction implements Function {
  INSTANCE;

  public Object apply(final Object from) {
    return walk(from, Functions.identity(), this);
  }
};

Original issue reported on code.google.com by jvdne...@gmail.com on 25 May 2009 at 2:34

GoogleCodeExporter commented 9 years ago

Maybe it would be useful to have a Tree interface 

Tree<T> {
  T getParent();
  Collection<T> getChildren();

}

also see issue 187

Original comment by zeitlin...@gmail.com on 10 Jun 2009 at 7:46

GoogleCodeExporter commented 9 years ago

Tree's are inherently recursive by definition. Each child of a node a Tree 
itself,
with or without children nodes. As a result, the above interface wouldn't 
really be
expressive enough (especially as the "key" value for the is not considered).

The next point to consider is that tree traversal can consist of pre-order, 
in-order
and post-order traversals, not to mention breadth-first.

Original comment by gpampara@gmail.com on 22 Jul 2009 at 4:02

GoogleCodeExporter commented 9 years ago

I have always very much wanted to have a standard TreeNode<T> interface. I will 
give 
it more thought when I can. I think that is a far better approach than the 
Function-
based one.

The interface _might_ end up needing to be something like this

  interface TreeNode<T extends TreeNode<T, C>, C extends Iterable<T>> {
    // optional; returns null if root
    TreeNode<T, C> parent();

    C children();
  }

This way, different types of trees can use different types of collections 
(list, set, 
multiset, what-have-you).  It's a little confusing to grasp at first though.  
We'll 
see what shakes out.

Original comment by kevin...@gmail.com on 17 Sep 2009 at 5:41

Changed title: Simple, standard TreeNode interface and traversals
Changed state: Accepted
Added labels: post-1.0

GoogleCodeExporter commented 9 years ago

Original comment by kevin...@gmail.com on 17 Sep 2009 at 5:45

Added labels: Milestone-Post1.0
Removed labels: post-1.0

GoogleCodeExporter commented 9 years ago

Original comment by kevin...@gmail.com on 17 Sep 2009 at 5:57

Added labels: Type-Enhancement

GoogleCodeExporter commented 9 years ago

You don't really need a defined data structure to implement the most common 
traversals (breadth, pre/post-
order, and full depth-first - tracking start/finish times), as the original 
poster points out. I would implement 
it differently (see below), but the principle holds.

That's not to say there isn't a good use case for a tree data structure, just 
that those traversals don't require 
it. If adding both things, it would be a good idea for some static utils 
methods that simplify the common use 
case when you do want to traverse an explicit tree structure instead of the 
functional way. There's no reason 
the API can't be expressive enough for both strategies.

Additionally, if going down that data structure road, you may seriously want to 
consider keeping it consistent 
with the Collections model of data structures being containers for 
user-supplied objects. In that light, I think 
TreeNode is inappropriate for the same reason that ListNode would be. You don't 
ask the elements of a List 
for previous/next elements, you ask the List data structure itself.

One drawback to this is that it can be very difficult to maintain the invariant 
of being a Tree while building 
one. My mutable structures tend to be either general graphs or forests, with an 
ability to get tree views (tree 
rooted at a given forest node, e.g.).

As an aside, I'm in the process of a complete overhaul of my own graph library, 
having been inspired by your 
excellent work with GCL - in particular the realization of how much readability 
can improve an API (and 
having written it pre-generics). This is a working concept for how traversals 
would look (not compilable, 
poorly formatted, no method bodies - keeping it terse). For everyone out there, 
feel free to use any ideas you 
like.

public interface Step< E >
{
  E from();
  E to();
}

public interface PruningIterator< E > extends Iterator< E >
{
  // Don't explore beyond the last object returned by next()
  public void prune();
}

public class Steps
{
  // convenience methods, so users don't have to explicitly create Steps

  public static Iterator< Step< E > > wrap
    ( E from,
      Iterator< E > toIterator )

  public static Function< E, Iterator< Step< E > > > wrap
    ( Function< E, Iterator< E > > adjacencyFunction )
}

public class Traversals
{
  public static PruningIterator< Step< E > > preOrder
    ( E start,
      Function< E, Iterator< Step< E > > > adjacencyFunction )

  // can't prune() a post-order traversal
  public static Iterator< Step< E > > postOrder
    ( E start,
      Function< E, Iterator< Step< E > > > adjacencyFunction )

  public static Iterator< Step< E > > topologicalSort
    ( Iterator< E > nodeIterator,
      Function< E, Iterator< Step< E > > > adjacencyFunction )

  // etc. plus similar methods taking just Function< E, Iterator< E > > and doing the wrapping for you.
}

All of these can be implemented using stacks, no recursion, without too much 
difficulty. The primary reason 
for the Step interface is that you often want to know from which node you 
traversed, and this information
would not otherwise be available.

I can use this same API to traverse the file system, up a tree instead of down, 
or any other graph-like 
structure, without creating an explicit data structure to do it. All I need to 
define is that function which takes 
a node and returns an iterator of adjacent nodes. Not creating an actual data 
structure (like a Collection) is 
useful when the backing data is huge, and perhaps not even queryable in a 
global way. Having to implement 
things like size() can really be a pain sometimes, and at work we've often just 
decided to throw an 
UnsupportedOperationException instead.

At any rate, a work in progress, but that's the general flavor. The edges in my 
data structures also contain 
user-defined objects, so I need to allow for that without polluting the rest of 
the API. Still working through 
the implications of that.

Original comment by ray.a.co...@gmail.com on 2 Oct 2009 at 11:35

GoogleCodeExporter commented 9 years ago

I recently needed a version of my walk iterable from above with a possibility 
to get 
the path of a node to the root. So I needed a TreeNode interface, without 
needing the 
children (although it could easily be added).

I'm attaching the relevant bits (thrown together so it would compile, but i did 
not 
spend too much time on it so please excuse for errors). Scroll down for an 
example in 
printWalkWithParentsTest()

Original comment by jvdne...@gmail.com on 10 Dec 2009 at 5:34

Attachments:

Ita.java

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 30 Jul 2010 at 3:53

Removed labels: Milestone-Post1.0

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 13 Jul 2011 at 6:18

Changed state: Triaged

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 16 Jul 2011 at 8:37

Changed state: Accepted

GoogleCodeExporter commented 9 years ago

Original comment by wasserman.louis on 1 Sep 2011 at 5:35

GoogleCodeExporter commented 9 years ago

Not sure if I pointed out that we have stopped thinking about the "TreeNode 
interface" approach.

I believe it is common to come into contact with objects that are 
tree-structured, but which you don't control, so you can't get to implement the 
TreeNode interface you want them to. Examples: java.io.File, or the containment 
tree of Java program elements (i.e. AnnotatedElement).

If we provided a TreeNode interface you'd have to create an adapter instance 
for every single node. And your adapter would, every time it grabs its list of 
children, have to wrap each of them, and then you'd start to want to hold on to 
these adapters and in the end, you've just created a whole entire parallel tree 
is all you've done.

This sucks.  What Louis and I propose is this:

  interface TreeViewer<T> { 
    Iterable<T> children(T node);
  }

  abstract class BinaryTreeViewer implements TreeViewer<T> {
    Optional<T> leftChild();
    Optional<T> rightChild();
    // implements children() to return those that are "present"
  }

  class Trees {
    static <T> Iterable<T> breadthFirstView(T root, TreeViewer<T> viewer) {...}
    static <T> Iterable<T> preOrderView(T root, TreeViewer<T> viewer) {...}
    static <T> Iterable<T> postOrderView(T root, TreeViewer<T> viewer) {...}
    static <T> Iterable<T> inOrderView(T root, BinaryTreeViewer<T> viewer) {...}

    // perhaps also with void "visit" versions as well
  }

Any thoughts out there?

Original comment by kevinb@google.com on 2 Sep 2011 at 11:27

Changed title: Support for trees and traversals

GoogleCodeExporter commented 9 years ago

I believe this is the right direction. I still maintain that a more general 
interface will be more useful. I don't mean more general than TreeViewer<T> in 
syntax, but in implied semantics. The language "children" and "TreeViewer" have 
clear implications that I think are unnecessary. It's just a name change from 
that to this:

  interface AdjacencyFunction<T> {
    Iterable<T> adjacent( T node );
  }

And that interface works just as well for breadthFirst/preOrder/postOrder 
traversals. The util class then becomes "Traversals". The only difference is 
that you would have to clearly document the traversal behavior when given a 
Adjacency that produces cycles, although you should probably do that anyway 
(think soft links in the file system).

Original comment by ray.a.co...@gmail.com on 3 Sep 2011 at 9:06

GoogleCodeExporter commented 9 years ago

Wasn't quite done, hit the "Save changes" button by accident.

With class name of Traversals or similar, the method names can be shortened. It 
just becomes Traversals.breadthFirst( T, Function< T, Iterable< T > > ).

Note that both "children" and "parent" are just specific adjacency 
relationships. Pre-order for the parent relationship goes from the given node 
to the root, post-order goes from the root to the given node. You get a lot of 
flexibility like this for free.

Given that this gets more graph-like, static factory methods for walks (follows 
the first node returned from the adjacency Iterable, ideal for parent 
traversal) and a topological sort are appropriate.

This doesn't directly address the special case of a binary tree, and in-order 
traversal. We've never had that use case where I work, so I honestly haven't 
given much thought to how that fits in to this strategy.

I haven't found a clean way to handle cycles within the library. I ended up 
with a factory method which turns a given adjacency function into an acyclic 
one by keeping track of what it has hit, but I really don't like that solution 
because it turns a stateless function into a stateful one that is not reusable. 
There must be a better way without resorting to adding distinct acyclic 
traversal factory methods, but I haven't found it.

Original comment by ray.a.co...@gmail.com on 3 Sep 2011 at 9:32

GoogleCodeExporter commented 9 years ago

Oh yeah, a couple more things. Sorry, should have composed my thoughts first. 
Using just Function<T, Iterable<T>> allows using the rest of Guava's Functions, 
Predicates, etc. with these adjacency functions.

We've also found it useful to think of traversals as, themselves, adjacency 
functions. You can then get some interesting behavior by composing them. For 
example (forgive the names, they need changing):

  class SomeUtilClass {
    static <T> Function<T, Iterable<T>> breadthFirst( Function<T, Iterable<T>> );
    // etc.
  }

That method accepts one adjacency function and produces another. The resulting 
function will accept a node and return, wrt the argument function, all 
reachable nodes in a breadth-first order.

That and some filtering allows you to simply do things like "Find all 
references to all declared variables" when working with a linked AST. Or 
complex traversals of social graphs.

Original comment by ray.a.co...@gmail.com on 3 Sep 2011 at 9:47

GoogleCodeExporter commented 9 years ago

I'm pretty strongly against supporting graphs or anything that is not strictly 
a tree.  That's trying to "do too much."

At the moment, we have a nice, simple abstraction that matches a clearly 
defined set of use cases, and does it extremely well.  Guava isn't currently in 
the business of manipulating graphs, and changing that is beyond the scope of 
this issue.

Original comment by wasserman.louis on 5 Sep 2011 at 11:01

GoogleCodeExporter commented 9 years ago

Point taken, and I don't really disagree. Just saying that there isn't a 
logical code change involved, only names. The code for breadth- and depth-first 
traversals is exactly the same. But again, binary trees are another thing 
entirely.

Sorry for going off the deep end there. I just got a little excited that the 
brain trust behind Guava would be brought to bear on abstractions I deal with 
daily. It would be nice to have something 100x better tested than what I've 
been able to cobble together during the calm moments between putting out fires. 
I never seem to have enough time to go back and fix the abstractions, no 
sellable business case for it.

Original comment by ray.a.co...@gmail.com on 6 Sep 2011 at 1:44

GoogleCodeExporter commented 9 years ago

There is such a thing as "trying to do too much" and a thing as "too simple", 
I'm not sure which is which here. Even a filesystem is a tree only if you 
ignore symlinks/treat as leaves. 

I'm surprized that we have a "clearly defined set of use cases" here (that's a 
rarity!), is it in copy-pastable form?

PS: Instead of Iterable<T>, it is richer to return Iterable<Stack<T>> (for some 
persistent Stack), for basically the same cost, and which can be viewed as 
Iterable<T>, but that would indeed be in the "too much" direction.

Original comment by jim.andreou on 6 Sep 2011 at 2:13

GoogleCodeExporter commented 9 years ago

My goal with this approach was to be dumb, simple, and utterly straightforward. 
 I think I've accomplished this goal.  

(And while filesystems are convoluted and complicated, XML documents might be a 
good example of genuine trees.)

Original comment by wasserman.louis on 6 Sep 2011 at 2:25

GoogleCodeExporter commented 9 years ago

Heh, I considered mentioning XML but thought that you wouldn't be interested. 

An XML document is only a tree if you ignore ID/IDREF(S) or treat them as 
leaves.

Original comment by jim.andreou on 6 Sep 2011 at 4:06

GoogleCodeExporter commented 9 years ago

Maybe this is irrelevant because of some previous posting, but for me the 
following worked well:

public interface TreeClassAdapter<T> {
    Iterable<T> children(T node);
    T parent(T node);
    Class<T> nodeClass();
    boolean isLeaf(T node); // useful to distinguish between nodes which could possible have children but happen to have none (like empty directories) and proper leaves (like files), cf. javax.swing.tree.TreeModel
}

This way one can easily organize arbitrary objects in trees, without having to 
create a wrapper for each object (which is quite a PITA, since you need to 
switch back and forth often). Adapting objects to this interface is easy, for 
example for java.io.File it takes just a few lines.

Original comment by Maaarti...@gmail.com on 14 Sep 2011 at 4:00

GoogleCodeExporter commented 9 years ago

I have to admit that my reaction to that is, ewwwwwwwwwww.

I think the current interface serves most conceivable needs without 
overspecifying the tree format.

Original comment by wasserman.louis on 14 Sep 2011 at 4:10

GoogleCodeExporter commented 9 years ago

Interface TreeViewer is essentially just a Function<T,Iterable<T>>

Original comment by jvdne...@gmail.com on 14 Sep 2011 at 5:10

GoogleCodeExporter commented 9 years ago

Correct!  But calling it that would make the API much less clear and more 
awkward to use.  Doing it *properly,* like this, makes it more natural and 
extensible -- for example, the BinaryTreeViewer abstract class would be 
difficult or impossible to provide.

Original comment by wasserman.louis on 14 Sep 2011 at 5:49

GoogleCodeExporter commented 9 years ago

Fair enough. I understand the benefits of a specific interface. I'm just easily 
tempted to reduce plumbing because I know I might need a Function that does the 
same sooner or later anyhow. (For instance to do a concat(transform(...)) in a 
different context)

Also, occasionally, I find it useful/necessary to build a parallel tree.

Original comment by jvdne...@gmail.com on 14 Sep 2011 at 10:38

GoogleCodeExporter commented 9 years ago

Explain please what you mean by "parallel tree"?  (I can imagine several 
different meanings, so I'm interested in specific use cases.)

Original comment by wasserman.louis on 15 Sep 2011 at 2:07

GoogleCodeExporter commented 9 years ago

For instance, you have a tree of nodes in which each node only has references 
to its children. What if you need to process this tree in a way you need parent 
references? You could map every node of the tree to a new object with a 
reference to a parent of the same type and a reference to the original node. 
That would be a parallel tree.

I have needed this in the past and wrote the following code (which you might 
reasonably call ugly/bad code):

clarification: walk is preorder traversal; map is transform; mapcat is 
concat(transform(...))

{{{
public static <T> Iterable<T> walk(final T root,
      final Function<? super T, ? extends Iterable<? extends T>> fnGetChildren) {
   final Function<T, Iterable<? extends T>> fnWalk = new Function<T, Iterable<? extends T>>() {
      @Override
      public Iterable<? extends T> apply(final T from) {
         return walk(from, fnGetChildren, this);
      }
   };
   return walk(root, fnGetChildren, fnWalk);
}

private static <T> Iterable<T> walk(final T root,
      final Function<? super T, ? extends Iterable<? extends T>> fnGetChildren,
      final Function<? super T, ? extends Iterable<? extends T>> fnWalk) {
   return concat(singleton(root), mapcat(fnWalk, fnGetChildren.apply(root)));
}

public interface TreeNode<T> {
   public TreeNode<T> parent();
   public T item();
}

public static <T, TreeNodeType extends TreeNode<T>> Iterable<TreeNodeType> 
walkWithParents(final T root,
      final Function<? super T, ? extends Iterable<? extends T>> fnGetChildren,
      final Function2<? super TreeNodeType, ? super T, ? extends TreeNodeType> nodeFactory) {
   final Function<TreeNodeType, Iterable<TreeNodeType>> fnPairChildren = new Function<TreeNodeType, Iterable<TreeNodeType>>() {
      @Override
      public Iterable<TreeNodeType> apply(final TreeNodeType parent) {
         return map(Functions2.bind1st(nodeFactory, parent), fnGetChildren.apply(parent.item()));
      }
   };
   final Function<TreeNodeType, Iterable<TreeNodeType>> fnWalk = new Function<TreeNodeType, Iterable<TreeNodeType>>() {
      @Override
      public Iterable<TreeNodeType> apply(final TreeNodeType from) {
         return walk(from, fnPairChildren, this);
      }
   };
   return walk(nodeFactory.apply(null, root), fnPairChildren, fnWalk);
}
}}}

Original comment by jvdne...@gmail.com on 15 Sep 2011 at 3:56

GoogleCodeExporter commented 9 years ago

Implementing a traversal based on the comments here I stumbled upon something 
you might want to consider. I needed to keep track of the current depth during 
the traversal, so I used an Iterator based (rather that Iterable based) 
approach using a special TreeIterator:

public static <T> TreeIterator<T> postOrderView(final T root, final Function<? 
super T, ? extends Iterator<? extends T>> childrenFunction);

public interface TreeIterator<T> extends Iterator<T> {
    int depth();
}

Original comment by christop...@gmail.com on 5 Oct 2011 at 4:27

GoogleCodeExporter commented 9 years ago

Interesting.  I'm not sure that's a sufficiently common use case to get special 
treatment, but I'll look into it.

Original comment by wasserman.louis on 5 Oct 2011 at 5:26

GoogleCodeExporter commented 9 years ago

The most general approach is providing an iterator of paths (from root to each 
node), which could be viewed as an iterator of values (much like Jared's 
LinkedNodeList does). The best representation for these paths are persistent 
stacks, the whole cost is one object per node, and they are immutable. Just in 
case anyone really wants to generalize

Original comment by jim.andreou on 6 Oct 2011 at 12:19

GoogleCodeExporter commented 9 years ago

That could work.

Original comment by wasserman.louis on 6 Oct 2011 at 12:52

GoogleCodeExporter commented 9 years ago

Issue 703 has been merged into this issue.

Original comment by wasserman.louis on 15 Nov 2011 at 3:50

GoogleCodeExporter commented 9 years ago

Original comment by fry@google.com on 10 Dec 2011 at 3:38

Added labels: Package-Collect

GoogleCodeExporter commented 9 years ago

When can we expect TreeViewer and other Tree related Interfaces/Classes to be 
available ? I was browsing the 11.0.1 javadocs and could not locate these.
Thank you,

Original comment by anjan.ba...@gmail.com on 18 Jan 2012 at 4:30

GoogleCodeExporter commented 9 years ago

I believe we're testing some ideas internally, but this is just such a broadly 
applicable and common need that it's worth taking lots of time to experiment 
with different designs.

Original comment by wasserman.louis on 19 Jan 2012 at 1:38

GoogleCodeExporter commented 9 years ago

[deleted comment]

GoogleCodeExporter commented 9 years ago

Original comment by kevinb@google.com on 30 May 2012 at 7:43

Added labels: Type-Addition
Removed labels: Type-Enhancement

GoogleCodeExporter commented 9 years ago

[deleted comment]

GoogleCodeExporter commented 9 years ago

If you want to use Tree data structure and don't want to wait until it will be 
added to Guava I suggest using my implementation which can be found on github:
https://github.com/serso/common/tree/master/collections-tree/src/main/java/org/s
olovyev/common/collections/tree

The interfaces are self explaining. One can start from the Trees.java class.

Original comment by se.solov...@gmail.com on 2 Feb 2013 at 11:36

GoogleCodeExporter commented 9 years ago

Initial versions of TreeTraverser and BinaryTreeTraverser are slated for 15.0: 
https://code.google.com/p/guava-libraries/source/browse/guava/src/com/google/com
mon/collect/TreeTraverser.java

Original comment by lowas...@google.com on 9 Mar 2013 at 6:13

Changed state: Started
Added labels: Milestone-Release15

GoogleCodeExporter commented 9 years ago

FWIW, I think something like the EMF TreeIterator (see 
http://download.eclipse.org/modeling/emf/emf/javadoc/2.4.3/org/eclipse/emf/commo
n/util/TreeIterator.html) which is basically an Iterator with an additional 
prune() method is indeed very useful.

Original comment by knut.wan...@gmail.com on 9 Mar 2013 at 6:46

GoogleCodeExporter commented 9 years ago

If adding an Iterator subinterface with a prune() method, then I suggest naming 
the interface more generically than "TreeIterator". Pruning is simply saying "I 
don't want to traverse beyond the last node any further". I haven't delved into 
the committed code, but I suspect that the traversal will work for any DAG just 
fine. Actually, it looks like pre-order and breadth-first will work for any 
graph, cyclic or not, provided you don't continue to call next() ad infinitum.

Original comment by ray.a.co...@gmail.com on 12 Mar 2013 at 10:14

GoogleCodeExporter commented 9 years ago

My model has changed from the last time I responded here. This isn't a 
suggestion, merely throwing it out there. I have edges (user-defined, weights 
e.g.) as well, but I've elided that here and changed a few things as a result. 
This is minimal pseudo-code, for terseness.

    interface Walk<T> {
        T getFrom();  // convenience, first element of path
        T getTo()     // convenience, last element of path
        Iterable<T> getPath();
    }

    interface Traverser<T> extends Function<T, Iterable<Walk<T>>>

    class Traversers {
        static <T> Traverser<T> empty()
        static <T> Traverser<T> preOrder( Traverser<T> adjacency )
        // similar postOrder, breadthFirst, leaves, a few others
    }

In my use case, I need the entire path taken to reach each visited vertex. 
Storage isn't bad if you're clever (similar to lisp cons cell, as Dimitris 
Andreou mentioned). If you don't need that, replace Walk<T> with just T.

Traverser<T> is both how one defines an adjacency function (typically 
user-defined), and how pre/post/... traversals are defined. An adjacency 
function simply returns very short walks. In the language of your 
implementation, both children(T) and preOrderTraveral(T) return Iterable<T>.

I'm pretty sure your reaction will be that this over-generalizes, and I can't 
dispute that. It's in my nature. I could certainly see not using Function here, 
and using a specific method name for Traverser instead.

Original comment by ray.a.co...@gmail.com on 13 Mar 2013 at 4:10

GoogleCodeExporter commented 9 years ago

@ray: have you looked at the current implementation linked above?

(I claim that it's general enough to let you optionally keep track of the paths 
in the tree: you just write a TreeTraverser<Walk<T>> instead of a 
TreeTraverser<T>.)

I'm not sure what, if any, features we'll include to prune the traversal, but 
we might do something along the lines of doNotExpand(Predicate<T>).

Original comment by wasserman.louis on 13 Mar 2013 at 7:11

GoogleCodeExporter commented 9 years ago

@Louis

Thanks for pointing out the potential to use a TreeTraverser<Walk<T>>, I had 
not seen that possibility. That would also work with my actual Walk including 
edge objects. I'll have to take a closer look in light of that to see where the 
gaps are between the two, as I'd rather use Guava than my own. One difference 
is that a user would need to supply the root node as a trivial Walk, which is 
slightly cumbersome. I still like the fact that adjacency (children) and 
complex traversals are the same kind of thing, but I have to reexamine my use 
cases to see if that feature is actually useful.

The other kinds of traversals I have needed in the past are: depth-first (both 
out and back, each node visited twice), topological sort (taking an Iterable of 
nodes, not just a single root node), and leaves. To be really useful, a 
depth-first iterator has to be able to tell you which direction you're 
traversing.

Re: pruning. In my experience, whether or not to prune usually depends on your 
program state at that point in the traversal. For example, you might prune if 
exploration beyond that node is pointless because you've already found a lower 
cost result.

Original comment by ray.a.co...@gmail.com on 14 Mar 2013 at 2:26

GoogleCodeExporter commented 9 years ago

That is true, but we're not trying to be the be all and end all of tree 
traversals: if we make a wide variety of use cases simpler with a minimum of 
API complexity, we've done our job.

In particular, I don't want to track the full path of the traversal in every 
case because that would incur prohibitive memory costs in cases where the tree 
is implicit or not held in memory all at once, and I am uncertain that adding a 
whole new iteration API to deal with pruning helps enough additional users to 
merit the added complexity.

Original comment by lowas...@google.com on 14 Mar 2013 at 4:03

GoogleCodeExporter commented 9 years ago

To me, the useful thing about TreeTraverser is being able to iterate over the 
elements of the tree as if it were flat, when that's what you want to do. If 
you need to deal with the tree in a more tree-like way, do that. But I don't 
really like the idea of trying to combine treating the tree as flat and as a 
tree by introducing something cumbersome like prune() and requiring you to 
iterate with a special kind of Iterator, etc. The idea of using a Predicate to 
determine whether or not the subtree of a node should be iterated seems ok to 
me, but nothing more than that.

Original comment by cgdec...@gmail.com on 14 Mar 2013 at 6:40

GoogleCodeExporter commented 9 years ago

re: pruning...

Extending Iterator is ugly. I've done that in the past, and it's not pretty, 
and even uglier with Iterable. Returning an Iterable<Something<T>>, where 
Something has a prune() method, would be better. Not good, just better. But 
then iteration isn't nearly as clean for the common use case. That argues for 
more methods, pruning vs. normal pre-order and breadth-first. And then your API 
is uglier. I need pruning, and I'm not even sure that's worth it. I can always 
write my own using a given TreeTraverser.

If using a stateless Predicate for limited pruning, I would suggest creating a 
new TreeTraverser, similar to Joiner.skipNulls(). I think that's what you 
suggested with doNotExpand(Predicate), but I'm not sure.

Original comment by ray.a.co...@gmail.com on 15 Mar 2013 at 8:02

GoogleCodeExporter commented 9 years ago

> If using a stateless Predicate for limited pruning, I would suggest creating 
a new TreeTraverser, similar to Joiner.skipNulls(). I think that's what you 
suggested with doNotExpand(Predicate), but I'm not sure.

Yes, exactly.

Original comment by cgdec...@gmail.com on 15 Mar 2013 at 8:12

GoogleCodeExporter commented 9 years ago

re: stuff other than pruning...

I didn't mean to imply that TreeTraverser should keep track of paths from root. 
I agree that TreeTraverser<Walk<T>> would do that, and it's just my problem to 
define children(Walk<T>) correctly.

I wasn't really pushing those other use cases (topo sort, ...), just bringing 
them up because your project has a good history of trying to solve other 
people's problems when it makes sense for Guava. Consider it a disclosure of 
how one of your users uses traversals. Any of these can be added by users in 
their own code easily enough.

Original comment by ray.a.co...@gmail.com on 15 Mar 2013 at 8:19

maidh91 / guava-libraries

Support for trees and traversals #174