scheinerman / SimpleGraphAlgorithms.jl

Additional algorithms for the `SimpleGraphs` module that rely on integer programming
MIT License
11 stars 0 forks source link

`is_iso` is noisy and slow on small graphs #12

Closed LilithHafner closed 1 year ago

LilithHafner commented 1 year ago

I'm running this benchmarking workflow:

function g(n, m)
    isos = 0
    graphs = SimpleGraphs.UndirectedGraph{Int}[]
    for i in 1:m
        x = rand(Bool, n,n); x .⊻= transpose(x);
        a = SimpleGraphs.UndirectedGraph(x)
        for b in graphs
            if redirect_stdout(() -> SimpleGraphAlgorithms.is_iso(a, b), devnull)
                isos += 1
                @goto next
            end
        end
        push!(graphs, a)
        @label next
    end
    isos
end

Without the redirect_stdout, it produces quite a lot of output. With, it is still 100x slower than an equivalent workflow using Graphs:

function f(n, m)
    isos = 0
    graphs = SimpleGraph{Int}[]
    for i in 1:m
        a = erdos_renyi(n, rand(Distributions.Binomial(n*(n-1)÷2, .5)))
        for b in graphs
            if Graphs.Experimental.has_isomorph(a, b)
                isos += 1
                @goto next
            end
        end
        push!(graphs, a)
        @label next
    end
    isos
end
julia> @time g(6,300)
  3.900605 seconds (944.82 k allocations: 199.779 MiB, 1.32% gc time)
193

julia> @time f(6,300)
  0.034219 seconds (45.25 k allocations: 3.477 MiB, 59.84% gc time)
191
scheinerman commented 1 year ago

@LilithHafner : The "noise" is coming from the solver. Using the Gurobi solver on larger graphs, the difference is narrowed:

julia> @time g(1000,100)
  8.096933 seconds (1.39 M allocations: 2.966 GiB, 7.65% gc time)
0

julia> @time f(1000,100)
  6.768485 seconds (525.15 k allocations: 823.368 MiB, 9.97% gc time)
0
LilithHafner commented 1 year ago

Is there an option for the user to get a quiet solve without redirecting stdout?

The reason the difference appears to be narrower for larger graphs is because the vast majority of the runtime is spent constructing graphs rather than checking for isomorphisms (verified with profiler). Constructing a 1000x1000 graph is slow, while it is very easy to conclude that two random 1000x1000 graphs are non-isomorphic (checking their maximum 3 degrees is typically sufficient).

scheinerman commented 1 year ago

@LilithHafner Thanks for the feedback. Let's think about devising a better test. It may well still be the case that the method in Graphs is better, but a bit more testing may show us what's up. Quick tests of simple invariants is a good way to rule out isomorphisms, but they're not definitive.

As for the "noise" from the solvers, there are ways to reduce their output. In my ChooseOptimizer package, I can use set_solver_verbose(false) to suppress the output. Unfortunately, different solvers do this in different ways. For Cbc one sets the LogLevel parameter to 0. For Gurobi, it's OutputFlag set to 0. For GLPK, it's msg_lev set to false. It's frustrating that there isn't a consistent way to do this.

scheinerman commented 1 year ago

Hello again @LilithHafner . I wrote some testing software for this:

using Graphs, SimpleGraphs, SimpleGraphAlgorithms, SimpleGraphConverter
using ChooseOptimizer, Random, Gurobi

"""
    random_relabel(G::UndirectedGraph{T})::UndirectedGraph where {T}

Create a copy of `G` but rename the vertices using the labels `1` through `n`
placed at random. The resulting graph is isomorphic to `G`.
"""
function random_relabel(G::UndirectedGraph{T})::UndirectedGraph where {T}
    n = NV(G)
    new_names = randperm(n)
    d = Dict{T,Int}()
    v_list = vlist(G)

    for i = 1:n
        v = v_list[i]
        d[v] = new_names[i]
    end
    GG = relabel(G, d)
    return GG
end

set_solver(Gurobi)          # Change to another solver if you wish, e.g., GLPK
set_solver_verbose(false)

"""
    example_yes(n::Int)

Create a pair of 3-regular graphs on `n` vertices 
that are isomorphic.
"""
function example_yes(n::Int)
    G1 = RandomRegular(n, 3)
    G2 = random_relabel(G1)
    return G1, G2
end

"""
    example_no(n::Int)

Create a pair of 3-regular graphs on `n` vertices 
that are (with high probability) not isomorphic
"""
function example_no(n::Int)
    G1 = RandomRegular(n, 3)
    G2 = RandomRegular(n, 3)
    return G1, G2
end

"""
    tester(n::Int, yesno::Bool = true)

Create a pair of 3-regular graphs on `n` vertices and test if they
are isomorphic using methods from `SimpleGraphAlgorithms`
and `Graphs.Experimental`. 

If `yesno` is `true`, then create isomorphic graphs. If `false`,
create two different graphs. 
"""
function tester(n::Int, yesno::Bool = true)
    G1 = IntGraph()
    G2 = IntGraph()
    if yesno
        G1, G2 = example_yes(n)
    else
        G1, G2 = example_no(n)
    end

    g1 = SimpleGraph(G1)
    g2 = SimpleGraph(G2)

    @info "SimpleGraphAlgorithms.is_iso"
    @time a = is_iso(G1, G2)
    @info "Graphs.Experimental.has_isomorph"
    @time b = Graphs.Experimental.has_isomorph(g1, g2)
    return a, b
end

For modest sized graphs, the function in Graphs.Experimental is faster. For example:

julia> tester(50,true)
[ Info: SimpleGraphAlgorithms.is_iso
Set parameter Username
Academic license - for non-commercial use only - expires 2023-03-31
  0.066436 seconds (136.30 k allocations: 15.669 MiB, 31.31% gc time, 14.72% compilation time)
[ Info: Graphs.Experimental.has_isomorph
  0.003820 seconds (11 allocations: 4.016 KiB)
(true, true)

julia> tester(50,false)
[ Info: SimpleGraphAlgorithms.is_iso
  0.010417 seconds (6.05 k allocations: 6.790 MiB)
[ Info: Graphs.Experimental.has_isomorph
  0.002626 seconds (11 allocations: 4.016 KiB)
(false, false)

For larger graphs, though, the situation changes:

julia> tester(250,true)
[ Info: SimpleGraphAlgorithms.is_iso
Set parameter Username
Academic license - for non-commercial use only - expires 2023-03-31
  1.632661 seconds (2.62 M allocations: 296.553 MiB, 2.21% gc time)
[ Info: Graphs.Experimental.has_isomorph
 78.410639 seconds (11 allocations: 16.641 KiB)
(true, true)

julia> tester(250,false)
[ Info: SimpleGraphAlgorithms.is_iso
  0.452340 seconds (54.42 k allocations: 93.334 MiB, 3.02% gc time, 1.66% compilation time)
[ Info: Graphs.Experimental.has_isomorph
  9.426613 seconds (11 allocations: 16.641 KiB)
(false, false)

Note that there's a good deal of variation in the run times, and I'm using Gurobi which is one of the best solvers around.

scheinerman commented 1 year ago

@LilithHafner I think we can definitely say that has_isomorph is faster than is_iso for small graphs. The noise is "not my fault" ... it comes from the solvers.

I did some more testing. I created a list of 100 random three-regular graphs with 50 vertices each. These were in ten groups of ten. The graphs in each group were isomorphic to each other, but graphs in different groups were not. I then compared every graph in the list to every other one using the two methods. They were pretty close but is_iso ran faster: 7.5 seconds vs 12 seconds for has_isomorph. I can send you the code if you're interested.

One of the reasons is that whenever I test a pair of graphs for isomorphism, I save information with those graphs that will make subsequent testing faster. If we did a race between the methods on a single pair of 50-vertex graphs, I think has_isomorph would be speedier.

I don't think that any further resolution of this item is warranted. Different methods have different performance. So I'm going to close this Issue. Thanks for an interesting observation.