The Push-Relabel method for Maximum Flow

46c4a3ed-0524-4a7d-a465-68e95557bd76 commented 10 years ago

The Push-Relabel method is a method to compute the Maximum Flow in a graph. Research has shown that it is both asymptotically and practically faster than several other methods, including Ford-Fulkerson, and its much used implementation, Edmonds-Karp.

I have been busy with implementing this method for Sage and will be uploading it shortly to this ticket. With it, I will explain some design decisions that I made and provide the (in my opinion) extensive testcases I ran, and later some profiling results (after all tests have ran again, since I lost my profiling info of earlier runs).

Depends on #16467 Depends on #16470

CC: @sagetrac-Rudi @nathanncohen @dcoudert @sagetrac-borassi

Component: graph theory

Keywords: maximum flow push relabel

Branch/Commit: u/foosterhof/ticket/16471 @ 01276c0

Issue created by migration from https://trac.sagemath.org/ticket/16471

46c4a3ed-0524-4a7d-a465-68e95557bd76 commented 10 years ago

Branch: u/foosterhof/ticket/16471

46c4a3ed-0524-4a7d-a465-68e95557bd76 commented 10 years ago

Commit: 7ad579b

46c4a3ed-0524-4a7d-a465-68e95557bd76 commented 10 years ago

comment:2

The push implements the push_relabel method, which has an interface much like the flow and _ford_fulkerson methods, except that it has 2 extra optional arguments:

use_global_relabeling: Whether or not to use the global relabeling heuristic. Though theoretically very nice, as well as in practice, as stated in some papers, it creates a huge overhead by needing to initialize the Residual Graph. Not using it gave in my own testcases a generally better result, meaning much faster.
use_gap_relabeling: A nice heuristic with little overhead. It checks for gaps in the labels that are used by the vertices. If label D is not used, then any vertex with label d for D < d < n can never become active again, so it just relabels them to n then.

For those interested: This was my testsuite used:

def test_small_graphs(rangeList = xrange(2, 5)):
    use_labels = False
    value = True
    total = 0
    set_random_seed(5)

    def test_graph(A):
        count = 0
    for s in xrange(0, n-1):
            for t in xrange(0, n-1):
                count += 1
                if s != t and A.shortest_path(s, t) != []:
                    f1, G1 = A._ford_fulkerson(s, t, value_only=value, use_edge_labels= use_labels)
                    f2, G2 = A.push_relabel(s, t, value_only=value, use_edge_labels = use_labels)
                    if abs(f1-f2) > 10**(-14)*f1:
                        print 'Unequal Flows in graph; Ford-Fulkerson:', f1, ' - Push-Relabel:', f2
                        return False, A, s, t
        return True, None, None, count

    for n in rangeList:
        G = graphs.CompleteGraph(n)
        E = G.edges()

        for EP in powerset(E):
            A = Graph(n, loops=False)
            A.add_edges(EP)
            if use_labels:
                for e in A.edge_iterator():
                    A.set_edge_label(e[0],e[1], 1+random()*99)
            good, A, s, t = test_graph(A)
            if good:
                total += t
            else:
                return False, A, s, t

        E = G.to_directed().edges()
        for EP in powerset(E):
            A = DiGraph(n, loops=False)
            A.add_edges(EP)
            if use_labels:
                for e in A.edge_iterator():
                    A.set_edge_label(e[0],e[1], 1+random()*99)
            good, A, s, t = test_graph(A)
            if good:
                total += t
            else:
                return False, A, s, t
        print 'Done with', n
    print 'Done, total number of instances done:', total
    return True, None, None, None

def test_random_graphs(size, probability, count):
    for i in xrange(count):
        G = digraphs.RandomDirectedGNP(size, probability)
        if G.shortest_path(0, size-1) != []:
            f1 = G._ford_fulkerson(0, size-1, use_edge_labels=True, value_only=True)
            f2 = G.push_relabel(0, size-1, use_edge_labels=True, value_only=True)
            if abs(f1-f2) > 10**(-14)*f1:
                print 'Unequal Flows in graph; Ford-Fulkerson:', f1, ' - Push-Relabel:', f2, ' - Absolute Error:', abs(f1-f2), ' - Relative Error:', abs(f1-f2)/f1

def test(cmd, sort, num):
    import cProfile, pstats, StringIO
    pr = cProfile.Profile()
    pr.run(cmd)
    ps = pstats.Stats(pr).strip_dirs().sort_stats(sort).print_stats(num)

with the collowing calling code:

test("good, A, s, t = test_small_graphs([2, 3, 4])", 'cumulative', int(10))

# NOTE: This test can run extremely long!!
#test("good, A, s, t = test_small_graphs([5])", 'cumulative', int(10))

# NOTE: This test can also run extremely long when all values (now commented) are used.
for size in [10, 20, 50]: #, 100, 200, 500, 1000, 2000]:
    for probability in [0.2, 0.4, 0.6, 0.8]:
        count = 10000/size
        test("test_random_graphs(" + str(size) + "," + str(probability) + "," + str(count) + ")", 'cumulative', int(10))

The testsuite could sure use some revamping, as for instance isomorphism is now neglected, meaning alot of double work, but that is not relevant for its results.

There are a few questions I have for people out there:

Currently, there are quite some subroutines to push_relabel, such as push, relabel, discharge, etc. I am not sure what is customary in Sage, but since they are property of push_relabel, so to speak, I put them inside the push_relabel code, rather than outside.
As already stated in the ticket, this depends on two other tickets I have submitted. Commits I made there are used in this code. I do have some experience with git, but not with Trac, and therefor I had no idea how to properly make sure that I could use them on my local computer, but that they would not be committed to this ticket. I am sorry if this push created duplicate code of the commits on the other tickets. Can anyone explain to me how to properly do this? For now, I called

sage: dev.checkout(16471)
sage: dev.pull(16467)
sage: dev.pull(16470)

Is this correct, or is there perhaps a cleaner way to do this?

With kind regards,

Florian Oosterhof

New commits:

`cf35184`	`Added push_relabel and defaulted all methods that used "FF" to "PR"`
`60c0c0f`	`Added insert and remove methods, as well as redid internal storage to become actual Doubly Linked List`
`50a8248`	`Merge branch 'u/foosterhof/ticket/16467' of git://trac.sagemath.org/sage into ticket/16471`
`c0dd8ff`	`Added optional argument (default False) to report distance along with vertex`
`c2915be`	`Merge branch 'u/foosterhof/ticket/16470' of git://trac.sagemath.org/sage into ticket/16471`
`ea9e03f`	`Fixed small errors`
`f8a86c0`	`Added is_empty method`
`ee43e8f`	`Merge branch 'u/foosterhof/ticket/16467' of git://trac.sagemath.org/sage into ticket/16471`
`7ad579b`	`Fixed small error in usage of DoublyLinkedList`

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

f3f29a1 Added import for DiGraph

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 7ad579b to f3f29a1

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 10 years ago

comment:6

Hello !!

Please provide timings if you want to make your function the default one. Also, could you rename it to _push_relabel, as it is the case already for Ford-Fulkerson ?

Nathann

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

900e255 Renamed to _push_relabel and fixed flow graph generation

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from f3f29a1 to 900e255

46c4a3ed-0524-4a7d-a465-68e95557bd76 commented 10 years ago

comment:8

Profiling Code:

def test_small_graphs(rangeList = xrange(2, 5)):
    use_labels = False
    value = True
    total = 0
    set_random_seed(5)

    def test_graph(A):
        count = 0
    for s in xrange(0, n-1):
            for t in xrange(0, n-1):
                count += 1
                if s != t and A.shortest_path(s, t) != []:
                    f1 = A._ford_fulkerson(s, t, value_only=value, use_edge_labels= use_labels)
                    f2 = A._push_relabel(s, t, value_only=value, use_edge_labels = use_labels)
                    if abs(f1-f2) > 10**(-14)*f1:
                        print 'Unequal Flows in graph; Ford-Fulkerson:', f1, ' - Push-Relabel:', f2, ' - Absolute Error:', abs(f1-f2), ' - Relative Error:', abs(f1-f2)/f1
                        return False, A, s, t
        return True, None, None, count

    for n in rangeList:
        for A in graphs(n):
            if use_labels:
                for e in A.edge_iterator():
                    A.set_edge_label(e[0],e[1], 1+random()*99)
            good, A, s, t = test_graph(A)
            if good:
                total += t
            else:
                return False, A, s, t

        for A in digraphs(n):
            if use_labels:
                for e in A.edge_iterator():
                    A.set_edge_label(e[0],e[1], 1+random()*99)
            good, A, s, t = test_graph(A)
            if good:
                total += t
            else:
                return False, A, s, t
        print 'Done with', n
    print 'Done, total number of instances done:', total
    return True, None, None, None

def test_random_graphs(size, probability, count):
    for i in xrange(count):
        G = digraphs.RandomDirectedGNP(size, probability)
        if G.shortest_path(0, size-1) != []:
            f1, F1 = G._ford_fulkerson(0, size-1, use_edge_labels=True, value_only=False)
            f2, F2 = G._push_relabel(0, size-1, use_edge_labels=True, value_only=False)
            if abs(f1-f2) > 10**(-14)*f1:
                print 'Unequal Flows in graph; Ford-Fulkerson:', f1, ' - Push-Relabel:', f2, ' - Absolute Error:', abs(f1-f2), ' - Relative Error:', abs(f1-f2)/f1

def test(cmd, sort, num):
    import cProfile, pstats, StringIO
    pr = cProfile.Profile()
    pr.run(cmd)
    ps = pstats.Stats(pr).strip_dirs().sort_stats(sort).print_stats(num)

Calling Code:

test("test_small_graphs([2, 3, 4])", 'cumulative', int(100))
test("test_small_graphs([5])", 'cumulative', int(15))

for size in [10, 20, 50, 100, 200, 500, 1000, 2000]:
    for probability in [0.2, 0.4, 0.6, 0.8]:
        count = 10000/size
        test("test_random_graphs(" + str(size) + "," + str(probability) + "," + str(count) + ")", 'cumulative', int(8))

Profiling Data:

First column after data represents the relative difference.

The second column represents the relative speedup.

In general, I think I can safely say that this method is faster, except for smaller graphs perhaps, though it seems to be at most two times as slow.

Those 2 questions from before still stand:

Is the way I nested those functions such as discharge, relabel and push good practice?
Is the way I merged the dependency on the other tickets correct?

Florian

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Changed commit from 900e255 to 01276c0

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 10 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

01276c0 Fixed error when using use_global_relabeling = True

6bdad4c1-1e26-4f2f-a442-a01a2292c181 commented 9 years ago

comment:11

This ticket need to be rebased, and in the meantime we have a new flow function through igraph. Set to needs_work

sagemath / sage

The Push-Relabel method for Maximum Flow #16471