Open tlarock opened 8 months ago
I checked https://github.com/JuliaGraphs/Graphs.jl/blob/d1081b02c2a2d50ac0e4e95b1f84f17259b719d5/src/Experimental/vf2.jl#L85 but I didn't see a recursive call, so I'm a bit surprised. Can you spot one?
Graph isomorphism is a difficult problem with no known polynomial algorithm. I don't know the details of this implementation but I wouldn't be surprised if it crashed on large graphs. However, for lack of recursion, I would rather expect an OutOfMemoryError or something similar, so this puzzles me
Thanks for taking a look. Checking again I think actually the stack showing vf2check_feasibility
last might be a red herring. The algorithm from the cited paper is recursive and it seems the recursion is implemented in the function vf2match
(here for example). vf2check_feasibility
is called before each recursive call so it could be a coincidence that it tends to show up last on the stack, or a quirk of StackOverflowError
reporting in Julia (total speculation).
The helper function vf2match!
is recursive.
vf2
does nothing more than trying out all permutations of vertices with some pruning rules to speed up things. So the general complexity is roughly in the order of O(num_vertices!)
. And the recursion depth is indeed O(num_vertices)
Usually that algorithm would time out much earlier, so a StackOverflow is something one would rarely see. But because you are using the same graph twice, it will find matches v1 ~ w1, v2 ~ w2~, ...
right from the start and will indeed exhaust the stack.
I am not sure if we need to do anything here, as the case is indeed very rare - we could convert it to an iterative implementation, but I would suggest, we would only do if we need that for something else like a parallel version of the algorithm, as otherwise it will mostly serve to make the code harder to understand.
What we could do, is to add a quick note to the docstring, saying that it is recursive.
If you indeed wanna try using this algorithm for such large graphs. you could see if there is something in the Julia documentation on increasing the stack size. You could also try to use a smaller eltype than Int
for the graphs - I think UInt16
in your case, but I am not sure how much stack memory you will save by this, as most of the allocations are on the heap anyways.
Here they describe a way to set the stack size for a Task, and run the computation in said Task. Perhaps you could also try that? If it works, I wonder, if we could simply wrap this and all other recursive algorithms in tasks, where we estimate the stack size from the input.
Thanks for this @simonschoelly. After a quick test I can confirm that setting the stack explicitly as in the stackoverflow solution you sent can work, I am able to get output for ER graphs up to 100k nodes.
It would definitely be helpful to add a note to the docstring that highlights the issue and points to this potential solution. And perhaps the docstrings of any other recursive functions that could run into this situation.
The task wrapper solution would be great from the user perspective since it would widen the range of cases for which the function just works. My worry would be that doing this in the background could lead to some rather opaque memory explosion issues based on the "as long as there is enough memory available" parenthetical from the stackoverflow answer. But that is a bit beyond my expertise.
Description of bug May be more of a limitation than a bug, especially given the experimental status of this piece of code.
I am running into
StackOverflowError
s when comparing graphs usingGraphs.Experimental.has_isomorph
. The last call on the stack isvf2check_feasibility
.The error seems to be related to the number of nodes in the graphs (see next section). So I am wondering if there is an expected limitation to the number of nodes (or some other quantity) that this implementation can handle?
How to reproduce The code leading up to the errors in my case is a bit complex, but I have been able to reproduce by running the function on Erdos-Renyi graphs with increasing number of nodes:
Note that I chose the values around 15k because that is where it is failing in my use case (see additional context). However in the ER case it fails at 30k nodes:
I checked with constant number of edges as well and see the same pattern, so the issue appears to be related to the number of nodes specifically.
Expected behavior I expect the function to return a Bool as normal.
Actual behavior The function does not finish and a
StackOverflowError
is thrown.Code demonstrating bug See "How to reproduce"
Version information
versioninfo()
] status Graphs
Additional context As a bit of background, the graphs I am comparing represent pairwise projections of two hypergraphs, basically node co-occurence graphs. It is quite likely that the graphs are indeed isomorphic. I am also using an
edge_relation
function to match edge weights.Here are node/edge counts for some passing and failing cases in my code. Note that in the two passing cases (first and last), the number of nodes is smaller than 15k, and in all of the failing cases the number of nodes is larger than 15k. This is different than the 30k found for erdos-renyi, but qualitatively the issue appears to be the same. And since it seems to be exactly half, perhaps it has to do with symmetry, although in both cases I expect the graphs to be undirected.