ciaranm / glasgow-subgraph-solver

A solver for subgraph isomorphism problems, based upon a series of papers by subsets of McCreesh, Prosser, and Trimble.
MIT License
62 stars 23 forks source link

std::bad_array_new_length errors need to convey more information #20

Open arungiridhar opened 9 months ago

arungiridhar commented 9 months ago

Thank you for writing this useful program.

For graphs with more than 40K vertices, the subgraph solver and the clique solver both give Error: std::bad_array_new_length at random times.

E.g, searching for triangles in a 205x205 grid graph:

$ path/to/glasgow_subgraph_solver --format lad c3.txt trial.txt 
pattern_file = c3.txt
target_file = trial.txt
pattern_properties =
pattern_vertices = 3
pattern_directed_edges = 6
target_properties =
target_vertices = 42025
target_directed_edges = 168100
Error: std::bad_array_new_length

But searching for quadrilaterals on the same graph works, so it's not a simple problem with insufficient memory. In any case, memory usage was very reasonable and there was no danger of running out of RAM.

$ path/to/glasgow_subgraph_solver --format lad c4.txt trial.txt 
pattern_file = c4.txt
target_file = trial.txt
pattern_properties =
pattern_vertices = 4
pattern_directed_edges = 8
target_properties =
target_vertices = 42025
target_directed_edges = 168100
status = true
nodes = 4
propagations = 4
mapping = (0 -> 34238) (1 -> 34239) (2 -> 34444) (3 -> 34443) 
runtime = 10300

The file trial.txt is attached. It's just a 205x205 grid graph that wraps. trial.txt

ciaranm commented 9 months ago

The immediate issue is inside the clique solver, src/clique.cc:118:

space = new int[size * (size + 1) * 2];

I am guessing that this might be an int overflow issue. If I change this to:

space = new int[static_cast<unsigned long long>(size) * (size + 1) * 2];

then it seems to work. However, this is going to allocate a loooot of memory for large graphs. The actual amount of memory needed here is more like "graph size * however deep the search will go", which in your case will be at most four recursive calls. It's probably worth implementing some way of reducing this, or switching to not doing static memory allocation if the graphs are large.

I think the only place affected is the clique solver. However, by default the subgraph solver will notice that your pattern graph is a clique, and will switch to clique-solving if it is. As a short-term workaround, I believe that if you do --no-clique-detection then it should probably run, albeit slower.

arungiridhar commented 9 months ago

Thank you. Casting it to 64-bit unsigned worked for me.