Closed weiT1993 closed 2 months ago
That sounds pretty bad. I haven’t reproduced the issue yet, but to give you a reasoning:
Until we fix it in 0.14.3 or 0.15.0 please use https://www.rustworkx.org/apiref/rustworkx.strongly_connected_components.html and pick a component with more than one node. That method is correct and should always give you a cycle. I will probably fix find_cycle
using the SCC implementation
I verified the problem. It is with the arbitrarily chosen starting node. If you pick this one (I plotted the graph to see the cycle):
cycle = rx.digraph_find_cycle(graph, graph.find_node_by_weight(81))
print(cycle)
You get:
EdgeList[(81, 82), (82, 83), (83, 69), (69, 70), (70, 76), (76, 77), (77, 78), (78, 80), (80, 81)]
I'll try to change the behaviour of rx.is_directed_acyclic_graph(graph)
. It should always return a cycle if it exists and node is specified. I agree it is a bit misleading. But probably the original intention was to pass a node you already knew was in a cycle as an argument
Also, completely unrelated to the bug but you would probably like add_edges_from_no_data
. You don't need to create the new edge list you created, just call graph.add_edges_from_no_data(edges)
instead of graph.add_edges(edges)
if you don't have weights.
Thanks for looking into the bug. I am confused about your reasoning for the bug. The DFS cycle detection algorithm should find a cycle regardless of which arbitrary node was picked when the given graph is connected and only has one component. In my example, the given graph is a single connected graph.
If you are saying the DFS algorithm attempts to find a cycle that has to contain the random starting node if none was given, that would explain the bug. But it is a misleading design choice. For example, if the input is a huge graph with a tiny cycle, this algorithm would have a very small chance of returning the cycle unless the user already knows where the tiny cycle is (in which case, what's the point of running this function anyway?). Was this intentional?
I think the expected behavior for rx.digraph_find_cycle(graph)
should be that it always finds a cycle (if there is one) when no starting node is specified. It should find a cycle containing the starting node when one is given.
And I do not think there is anything wrong with rx.is_directed_acyclic_graph(graph)
. It seems to correctly decide whether the graph is a DAG.
Thanks for looking into the bug. I am confused about your reasoning for the bug. The DFS cycle detection algorithm should find a cycle regardless of which arbitrary node was picked when the given graph is connected and only has one component. In my example, the given graph is a single connected graph.
If you are saying the DFS algorithm attempts to find a cycle that has to contain the random starting node if none was given, that would explain the bug. But it is a misleading design choice. For example, if the input is a huge graph with a tiny cycle, this algorithm would have a very small chance of returning the cycle unless the user already knows where the tiny cycle is (in which case, what's the point of running this function anyway?). Was this intentional?
I think the expected behavior for
rx.digraph_find_cycle(graph)
should be that it always finds a cycle (if there is one) when no starting node is specified. It should find a cycle containing the starting node when one is given.And I do not think there is anything wrong with
rx.is_directed_acyclic_graph(graph)
. It seems to correctly decide whether the graph is a DAG.
In short, yes, it is a misnomer. A better name for the function would be digraph_find_cycle_containing_node
.
Also, is-directed_acyclic_graph
has an independent implementation that behaves as you expected.
So currently, if I want to find a cycle in the graph without knowing which node is in it, I need to write a wrapper to make sure I visit all the nodes? Do you plan to support it?
So currently, if I want to find a cycle in the graph without knowing which node is in it, I need to write a wrapper to make sure I visit all the nodes? Do you plan to support it?
We will fix it in 0.15 to make it smarter and always return a cycle if it exists.
If you need an urgent workaround I suggest checking the strongly connected components. Any component with more than one element is a cycle
Information
What is the current behavior?
rx.is_directed_acyclic_graph(graph)
returns False butrx.digraph_find_cycle(graph)
fails to find the cycle in the graph.What is the expected behavior?
When
rx.is_directed_acyclic_graph(graph)
returns False,rx.digraph_find_cycle(graph)
should find a cycle in the graph.Steps to reproduce the problem
Running the following script several times. It quite often reproduces the bug. The graph is always correctly identified as not a DAG, but quite often
rx.digraph_find_cycle(graph)
does not find a cycle.Quite often the script produces: