Computing the matching polynomial

ca0b67cc-9d10-44c6-be51-5d9e2cdee96a commented 9 years ago

Disclaimer: implementing a cache whose keys are (isomorphism type of) graphs and values are some invariant is a bad idea. See comment:1 below.

Original suggestion:

At this point I am not in the right state to include this improvement in Sage but I am leaving it here for the future or if someone else is bored.

Much in the same way as with the chromatic polynomial (as discussed on #14529) one can improve the computation of the matching polynomial with use of caching.

sage: G = graphs.DodecahedralGraph()
sage: %timeit G.matching_polynomial()
100 loops, best of 3: 1.97 ms per loop
sage: %timeit matchpoly(G)
1 loops, best of 3: 723 µs per loop

The implementation is the usual one though I suppose it makes sense to use this directly in the Cython file.

def matchpoly(G):

    global cache 

    s = tuple(sorted(G.canonical_label().edges(labels=False)))

    if s in cache:
        return cache[s]
    if not G.is_connected():
        return prod([matchpoly(C) for C in G.connected_components_subgraphs()])

    R = ZZ['x']
    x = R.gen()

    if G.size() == 0:
        return x^G.order()

    u,v = G.edge_iterator(labels=False).next()

    H = G.copy()

    H.delete_edge(u,v)
    p = matchpoly(H)

    H.delete_vertices([u,v])

    ret = p - matchpoly(H)

    cache[s] = ret 

    return ret

CC: @nathanncohen @videlec

Component: graph theory

Reviewer: Vincent Delecroix

Issue created by migration from https://trac.sagemath.org/ticket/18604

videlec commented 9 years ago

comment:1

Hi,

Your proposition is far from being a computational improvement. Caching the values of all possible graphs results in using a lot of memory. Moreover, for most graphs your proposition would be slower (because you call for canonical labels).

For very symmetric graph of a reasonable size your function is already slower

sage: G = graphs.GridGraph((5,7))
sage: %time p = G.matching_polynomial()
CPU times: user 15.5 s, sys: 0 ns, total: 15.5 s
sage: %time p = matchpoly(G)
CPU times: user 18.6 s, sys: 4 ms, total: 18.6 s
Wall time: 18.6 s

And it is infinitely slower on random graphs

sage: G = graphs.RandomGNP(28, 5/28)
sage: %time p = G.matching_polynomial()
CPU times: user 3.36 s, sys: 0 ns, total: 3.36 s
Wall time: 3.36 s
sage: %runfile test.py
sage: %time p = matchpoly(G)
CPU times: user 2min 14s, sys: 224 ms, total: 2min 14s
Wall time: 2min 14s

Of course the next call would be faster but I do not know anybody who will call twice an expensive function.

I propose to just close this ticket as a won't fix and emphasize in the description that caching is of no help here! If you care about the computation of matching polynomial you can have a look at #17921.

Vincent

ca0b67cc-9d10-44c6-be51-5d9e2cdee96a commented 9 years ago

comment:2

Hey there,

thanks for having a look! I am really surprised about this outcome. The same idea is used for the Tutte polynomial and is a solid improvement. As tested on the Dodecahderal graph it seemed like it makes sense. I guess cache efficiency comes from edge contractions not vertex deletions?

That said I suggest we close this down (can't find the option on my side?)

Jernej

videlec commented 9 years ago

Reviewer: Vincent Delecroix

videlec commented 9 years ago

Description changed:

--- 
+++ 
@@ -1,8 +1,10 @@
-Hey there!
+**Disclaimer**: implementing a cache whose keys are (isomorphism type of) graphs and values are some invariant is a *bad* idea. See [comment:1](#comment%3A1) below.

-At this point I am not in the right state to include this improvement in Sage but I am leaving it here for the future or if someone else is bored.
+Original suggestion:

-Much in the same way as with the chromatic polynomial (as discussed on #14529) one can improve the computation of the matching polynomial with use of caching.
+    At this point I am not in the right state to include this improvement in Sage but I am leaving it here for the future or if someone else is bored.
+
+    Much in the same way as with the chromatic polynomial (as discussed on #14529) one can improve the computation of the matching polynomial with use of caching.

sage: G = graphs.DodecahedralGraph() @@ -11,8 +13,7 @@ sage: %timeit matchpoly(G) 1 loops, best of 3: 723 µs per loop

-
-The implementation is the usual one though I suppose it makes sense to use this directly in the Cython file.
+    The implementation is the usual one though I suppose it makes sense to use this directly in the Cython file.

def matchpoly(G): @@ -47,5 +48,3 @@

 return ret

-
-

videlec commented 9 years ago

comment:3

Hello,

Replying to @sagetrac-azi:

thanks for having a look! I am really surprised about this outcome. The same idea is used for the Tutte polynomial and is a solid improvement. As tested on the Dodecahderal graph it seemed like it makes sense. I guess cache efficiency comes from edge contractions not vertex deletions?

What do you mean by "solid improvement"? Are you sure it is faster to use caching there? Did you do serious benchmarking with and without? I am very curious about that.

That said I suggest we close this down (can't find the option on my side?)

All right.

Vincent

videlec commented 9 years ago

comment:4

More precisely this is what I get (with caching):

sage: %time p = G.tutte_polynomial()
CPU times: user 184 ms, sys: 8 ms, total: 192 ms
Wall time: 186 ms
sage: G = graphs.RandomGNP(14, 0.2)
sage: %time p = G.tutte_polynomial()
CPU times: user 16 ms, sys: 0 ns, total: 16 ms
Wall time: 16 ms
sage: G = graphs.RandomGNP(14, 0.2)
sage: %time p = G.tutte_polynomial()
CPU times: user 80 ms, sys: 4 ms, total: 84 ms
Wall time: 76.7 ms
sage: G = graphs.RandomGNP(14, 0.2)
sage: %time p = G.tutte_polynomial()
CPU times: user 84 ms, sys: 8 ms, total: 92 ms
Wall time: 83.8 ms
sage: G = graphs.RandomGNP(14, 0.2)
sage: %time p = G.tutte_polynomial()
CPU times: user 656 ms, sys: 4 ms, total: 660 ms
Wall time: 652 ms
sage: %time p = G.tutte_polynomial()
CPU times: user 740 ms, sys: 0 ns, total: 740 ms
Wall time: 734 ms

without caching (i.e. redefining def _cached(func) as return func):

sage: G = graphs.RandomGNP(14, 0.2)
sage: %time p = G.tutte_polynomial()
CPU times: user 180 ms, sys: 24 ms, total: 204 ms
Wall time: 194 ms
sage: G = graphs.RandomGNP(14, 0.2)
sage: %time p = G.tutte_polynomial()
CPU times: user 12 ms, sys: 0 ns, total: 12 ms
Wall time: 13.5 ms
sage: G = graphs.RandomGNP(14, 0.2)
sage: %time p = G.tutte_polynomial()
CPU times: user 16 ms, sys: 4 ms, total: 20 ms
Wall time: 19.5 ms
sage: G = graphs.RandomGNP(14, 0.2)
sage: %time p = G.tutte_polynomial()
CPU times: user 28 ms, sys: 0 ns, total: 28 ms
Wall time: 27.1 ms
sage: G = graphs.RandomGNP(14, 0.2)
sage: %time p = G.tutte_polynomial()
CPU times: user 80 ms, sys: 4 ms, total: 84 ms
Wall time: 75.8 ms
sage: G = graphs.RandomGNP(14, 0.2)
sage: %time p = G.tutte_polynomial()
CPU times: user 8 ms, sys: 0 ns, total: 8 ms
Wall time: 7.01 ms
sage: G = graphs.RandomGNP(14, 0.2)
sage: %time p = G.tutte_polynomial()
CPU times: user 44 ms, sys: 0 ns, total: 44 ms
Wall time: 39.1 ms

So I wonder where is your "solid improvement"!?

Vincent

ca0b67cc-9d10-44c6-be51-5d9e2cdee96a commented 9 years ago

comment:5

Replying to @videlec:

Hello,

Replying to @sagetrac-azi:

thanks for having a look! I am really surprised about this outcome. The same idea is used for the Tutte polynomial and is a solid improvement. As tested on the Dodecahderal graph it seemed like it makes sense. I guess cache efficiency comes from edge contractions not vertex deletions?

What do you mean by "solid improvement"? Are you sure it is faster to use caching there? Did you do serious benchmarking with and without? I am very curious about that.

Yep the improvement in such cases is well know. There is actually even a paper that describes this idea. See the paper cited here http://homepages.ecs.vuw.ac.nz/~djp/tutte/

That said I suggest we close this down (can't find the option on my side?)

All right.

Vincent

videlec commented 9 years ago

comment:6

I see... Thanks for the link. Apparently, Sage is not smart enough to make a difference here ;-)

Vincent

ca0b67cc-9d10-44c6-be51-5d9e2cdee96a commented 9 years ago

comment:7

Yep. Perhaps one of the bottlenecks is that canonical forms in Sage is (currently) written in Python and the algorithm itself is not the most efficient (as far as I've compared with say nauty)

videlec commented 9 years ago

comment:8

So perhaps this ticket would still makes sense?

ca0b67cc-9d10-44c6-be51-5d9e2cdee96a commented 9 years ago

comment:9

In some sense it does but one would really have to be careful in designing this properly.

You argued that on random graphs on 14 vertices is less efficient and if we take that as our base benchmark then it may not be worth it.

In general I think it can be made in a worthwhile improvement especially for larger graphs and provided that we use bliss or nauty.

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 9 years ago

comment:10

Does this ticket still needs a review? #17921 also claims to make this routine faster.

rwst commented 9 years ago

comment:11

The algorithm in #17921 has a speed factor of >1000x for the bigger examples given here, so this is a complete different game. There is the original Sage algorithm if one needs a check.

sagemath / sage

Computing the matching polynomial #18604