google / guava

Google core libraries for Java
Apache License 2.0
49.93k stars 10.82k forks source link

Feature Request: Tree implementation in com.google.common.graph #2411

Open jbduncan opened 8 years ago

jbduncan commented 8 years ago

I'd like to request the addition of a Tree data structure, which would be similar to a DirectedGraph but which has a single root, disallows cycles and self-looping edges, and is fully connected.

My use case for such a feature would be to model a sequence of actions, which may branch depending on some criteria.

I'm aware that such a data structure already exists in JUNG 2 (and this is probably the solution I'll use in the meantime), but I believe it would be something that users of the com.google.common.graph package, in a future stable version of Guava, would appreciate.

lowasser commented 8 years ago

Can you give some more details about how modeling this action sequence with a graph helps you? What graph utilities would you use with this data structure?

On Sat, Mar 5, 2016, 2:26 PM Jonathan Bluett-Duncan < notifications@github.com> wrote:

I'd like to request the addition of a Tree http://mathworld.wolfram.com/Tree.html data structure, which would be similar to a DirectedGraph but which has a single root, disallows cycles and self-looping edges, and is fully connected.

My use case for such a feature would be to model a sequence of actions, which may branch depending on some criteria.

I'm aware that such a data structure already exists in JUNG 2 (and this is probably the solution I'll use in the meantime), but I believe it would be something that users of the com.google.common.graph package, in a future stable version of Guava, would appreciate.

— Reply to this email directly or view it on GitHub https://github.com/google/guava/issues/2411.

lowasser commented 8 years ago

Even better - can you show us a sample of what your code would look like if you were using the API you have in mind?

On Sat, Mar 5, 2016, 2:28 PM Louis Wasserman wasserman.louis@gmail.com wrote:

Can you give some more details about how modeling this action sequence with a graph helps you? What graph utilities would you use with this data structure?

On Sat, Mar 5, 2016, 2:26 PM Jonathan Bluett-Duncan < notifications@github.com> wrote:

I'd like to request the addition of a Tree http://mathworld.wolfram.com/Tree.html data structure, which would be similar to a DirectedGraph but which has a single root, disallows cycles and self-looping edges, and is fully connected.

My use case for such a feature would be to model a sequence of actions, which may branch depending on some criteria.

I'm aware that such a data structure already exists in JUNG 2 (and this is probably the solution I'll use in the meantime), but I believe it would be something that users of the com.google.common.graph package, in a future stable version of Guava, would appreciate.

— Reply to this email directly or view it on GitHub https://github.com/google/guava/issues/2411.

kluever commented 8 years ago

I think @jrtom has some plans around this, but he can clarify...

jrtom commented 8 years ago

Yes, I do have plans to support tree topologies in common.graph. Right now it's an open question whether it would be represented as its own type, or as a specialized implementation class, or solely in terms of constraints that will be specified as part of the graph's GraphConfig. (We're trying to keep the type system from exploding, but there are certain capabilities, like asking for the root node, that wouldn't make sense to have on the Graph interface but that seem useful for trees.)

Since you bring up JUNG, which I also own: FYI, in parallel with the common.graph work, I'm in the process of releasing JUNG 2.1 (it's on GitHub already as http://github.com/jrtom/jung), which replaces commons-collections with Guava. I plan to release JUNG 3.0 in a few months with the core graph API replaced with that of common.graph.

jbduncan commented 8 years ago

That sounds great @jrtom! Many thanks for your response. I realise trees are probably not going to appear in common.graph any time soon, but nonetheless it's exciting to hear what you have in mind for common.graph and JUNG.

@lowasser, I hope to get back to you soon with an answer, I'm just waiting for a confirmation that I can talk about my use case in detail here.

jbduncan commented 8 years ago

Also, thanks for fixing the typo in the issue title. :)

jbduncan commented 8 years ago

Hi @lowasser. I'm struggling to explain my use case very well and succintly, so I want to apologise in advance if you struggle with the wall-of-text below and/or if I've not described things well enough for you.

I'm doing my undergraduate dissertation, where I'm writing a tool which does logical & syntax-related checks on a subset of SBVR, a software modelling standard like UML which uses "Structured English" statements instead of diagrams.

For my dissertation, these statements describe the ordering in which "entities" in a hypothetical concurrent software system receive "messages" from each other. The most basic statement has the format

entity1 receives msg1 precedes entity2 receives msg2

which describes how, in this system, entity1 must receive msg1 before entity2 receives msg2. For my work, it doesn't matter where msg1 and msg2 come from.

I'm thinking of using a Tree to model the ordering between these messages rather than, say, a List. This is because in a more complex system, an entity (e.g. entity1) may expect to receive one message out of a choice of 2 or more (e.g. msg1 | msg2), and the message(s) it expects afterwards may change depending on whether it received msg1 or msg2. In SBVR, this might be modelled as

entity1 receives msg1 or entity1 receives msg2

To give a code example, if I had an SBVR statement like this:

John receives msg1 precedes George receives msg2 or George receives msg3

then I might model it as a Tree like so.

Tree messageOrderingTree = new Tree<>();
EntityWithMessage johnMsg1 = new EntityWithMessage("John", "msg1");
EntityWithMessage georgeMsg2 = new EntityWithMessage("George", "msg2");
EntityWithMessage georgeMsg3 = new EntityWithMessage("George", "msg3");
messageOrderingTree.addEdge(new Object(), johnMsg1, georgeMsg2);
messageOrderingTree.addEdge(new Object(), johnMsg1, georgeMsg3);
System.out.println(messageOrderingTree); 
// prints the following:
//           John -> msg1
//              (root)
//               /  \
//              /    \
// George -> msg2    George -> msg3

I hope this gives you a better understanding of what I'm trying to do.

jrtom commented 8 years ago

@jbduncan, a couple of things about your last post.

First, I'm not actually sure that using a graph (or tree) would actually help you much here. It doesn't sound like you're using, or interested in, much (if any) of the topological properties of the graph. (I am speaking as someone who likes using graph models for everything. ;) )

Second, you can already use common.graph to build your trees; we just don't (yet) have built-in support for guaranteeing that the Graph you build is actually tree-shaped. When we do, it would most likely be in the form of throwing run-time exceptions if you tried to populate your graph in such a way that it's not tree-shaped.

Third, I consider it extremely unlikely that we'd provide a toString() representation for trees that looked anything like that. The current toString() representation tells you how things are connected, but it's not meant to be visual, and that wouldn't change for trees or other topologies. For that part, you're on your own. :)

lowasser commented 8 years ago

That was pretty much the direction I was going in -- I was going to argue for rolling a very simple tree type of your own, pretty much just class Tree { private Foo data; private List<Tree> children; }. Guava does have some limited facilities for working with this sort of tree in TreeTraverser.

jbduncan commented 8 years ago

@jrtom and @lowasser, thank you both very much for your constructive feedback.

@jrtom, I understand your argument for the toString() representation for trees, and I completely agree! I only meant to show a visual example of what my tree would've looked like in case I hadn't made it clear from my makeshift API. :)

Thanks for your simple suggestion @lowasser, I'll see how it goes with using it in my dissertation.

I'm not sure what's the proper thing to do regarding leaving this issue open or closed. I presume it'd be useful to keep open as a reminder of sorts for the Tree impl, but I'd be more than happy to let @jrtom decide the usefulness of it. :)

jbduncan commented 8 years ago

I'm currently using JUNG 2.1, but I recently found that Durian has tree utilities like TreeDef and TreeNode which seem to fit my use case better.

Just thought I'd bring it up, since Durian's API may prove to be a source of inspiration for a tree implementation in common.graph (or indeed the next version of JUNG).

liach commented 7 years ago

@jbduncan I do not think trees are directed. They are undirected graph having n vertexes and n-1 edges.

jrtom commented 7 years ago

@liach It depends on which definition you use for "tree"; it's not a settled question in graph theory.

Yes, some sources refer to trees more or less as you do (although "connected" is part of the definition), and refer to a directed graph with similar topological properties as an arborescence.

However, it seems pretty clear that what many people mean by "tree" is what you might call a "directed rooted tree", i.e., they think of a tree as a directed (weakly) connected acyclic graph in which all edges are both directed away from [1], and reachable from, the root. E.g., "(rooted) binary tree". In such cases people often refer to nodes of such a structure as being "leaves" or having "children", which implies directionality.

We'll certainly give careful thought to what terminology we use when we add capabilities like this, but ultimately we may have to use terms for which there is no strong consensus definition; that's just the way graph theory is in my experience.

[1] A term I've heard for such a graph in which edges are directed towards the root is "uptree".

liach commented 7 years ago

For the tree, will we have some sort of TreeBuilder which take undirected and directed cases into different accounts?

jrtom commented 7 years ago

@liach If we were going to support trees at all, yes, we probably would.

liach commented 7 years ago

@jbduncan

which describes how, in this system, entity1 must receive msg1 before entity2 receives msg2. For my work, it doesn't matter where msg1 and msg2 come from.

Let's say that entityX receive msgX is called eventX. Here I will list some possibilities:

  1. one event unlocks multiple events;
  2. one event needs multiple events as precedent.

If both are true, then you need a DAG. If only one of them is true, then you need a tree indeed. If both are false, then it is simply a chain in which you do not really need graphs.

jbduncan commented 7 years ago

@liach For my project, only 1. is true. Each and every "event" is preceded by one or zero other "events", but each "event" leads onto potentially many other branching "events". Therefore I believe a tree models my problem best, rather than a chain (e.g. list or iterable) or a DAG. :)

Edit: Also, only one "event" can be preceded by zero events. This would make it the 'root' of my 'tree'.

DuaneNielsen commented 7 years ago

Something like below would be what most people would think of as a "ordered tree" I think.

Viewed as a whole, a tree data structure is an ordered tree, generally with values attached to each node. Concretely, it is (if required to be non-empty):

A rooted tree with the "away from root" direction (a more narrow term is an "arborescence"), meaning: A directed graph, whose underlying undirected graph is a tree (any two vertices are connected by exactly one simple path), with a distinguished root (one vertex is designated as the root), which determines the direction on the edges (arrows point away from the root; given an edge, the node that the edge points from is called the parent and the node that the edge points to is called the child), together with:

an ordering on the child nodes of a given node, and a value (of some data type) at each node.

vorburger commented 3 months ago

I was looking for something like this today, and came up with this... I am using it to create a "tree structure" for my https://docs.enola.dev/models/, but haven't really tested it much - so just posting here in case this is ever useful to anyone else stumbling over this issue.