Closed a8f3419b-6383-42be-b93c-ecaa87929754 closed 9 years ago
Dependencies: #18929
Description changed:
---
+++
@@ -1 +1 @@
-
+Include the igraph algorithm for max flow / min cut.
Author: Michele Borassi
Changed keywords from none to igraph, maximum flow, minimum cut
Branch: u/borassi/igraph_max_flow
Commit: f2f97d5
Hellooooo!
I have implemented max flow/min cut through igraph, which is faster than the old Python-based implmentation of Ford-Fulkerson algorithm. I hope you like it!
Michele
The usual benchmarks:
sage: g = graphs.RandomGNM(2000,20000)
sage: for u,v in g.edges(labels=False):
....: g.set_edge_label(u,v,random())
....:
sage: g.flow(0,1,method='LP')
8.742625820571515
sage: g.flow(0,1,method='FF')
8.742625820571511
sage: g.flow(0,1,method='igraph')
8.742625820571515
sage: %timeit g.flow(0,1, method='FF')
1 loops, best of 3: 256 ms per loop
sage: %timeit g.flow(0,1, method='LP')
1 loops, best of 3: 1.8 s per loop
sage: %timeit g.flow(0,1, method='igraph')
10 loops, best of 3: 66.8 ms per loop
sage: g = digraphs.RandomDirectedGNM(5000,60000)
sage: for u,v in g.edges(labels=False):
g.set_edge_label(u,v,random())
....:
sage: g.flow(0,1, method='FF')
4.122947644386753
sage: g.flow(0,1, method='LP')
4.122947644386753
sage: g.flow(0,1, method='igraph')
4.122947644386753
sage: %timeit g.flow(0,1, method='FF')
1 loops, best of 3: 404 ms per loop
sage: %timeit g.flow(0,1, method='LP')
1 loops, best of 3: 2.86 s per loop
sage: %timeit g.flow(0,1, method='igraph')
1 loops, best of 3: 226 ms per loop
sage: g.flow(0,1,use_edge_labels=False, method='FF')
10
sage: g.flow(0,1,use_edge_labels=False, method='LP')
10.0
sage: g.flow(0,1,use_edge_labels=False, method='igraph')
10.0
sage: %timeit g.flow(0,1,use_edge_labels=False, method='FF')
1 loops, best of 3: 324 ms per loop
sage: %timeit g.flow(0,1,use_edge_labels=False, method='LP')
1 loops, best of 3: 2.49 s per loop
sage: %timeit g.flow(0,1,use_edge_labels=False, method='igraph')
10 loops, best of 3: 42.6 ms per loop
In conclusion, igraph is always the fastest, and the improvement is maximum when all edges have capacity 1.
Helloooooo Michele,
Thank you very much for those speedups :-P
Here are the results of my review:
They run a max flow algorithm and silently ignore the weight if they do not
understand it? Crazy O_o
Sometimes, Sage integer/floats are not compatible with igraph::
P.S.: I just noticed that we do exactly the same: capacity=lambda x: x if x in RR else 1
. This is out of the scope of this ticket, but what would you
think of getting rid of this 'default' and use 'x' at all times (which would
raise an exception when the weights are obviously wrong)?
"None"
? Or None
?
if method in ["FF", "igraph", "None"]:
The code imports 'igraph' even when not necessary, i.e. when method=None
. I
tried to figure out how to rewrite it better, but many patterns lead me to
write the import/except
block twice, which is a bit unpleasant. Instead of
importing igraph
in order to guess if it is available, what would you think
of using is_package_installed("python_igraph")
? This way, you do not have to
define igraph_available
at all, and you can use the output of
is_package_installed
to decide whether method should be set to
igraph
. Then, in the elif (method == 'igraph'):
block, you can actually
import igraph and raise the exception otherwise. What do you think?
Is there a reason why you import floor
from sage.functions.other
and not
from math
(i.e. from python)? Usually 'our' functions can handle symbolics
and are much slower as a result.
Instead of differentiating between 'if integer' and 'else' in the igraph code, wouldn't it be easier to replace the weight function (in all cases) by 'floor'?
It worries me everytime I see a line like
g_igraph = self.igraph_graph(edge_attrs={'capacity':[float(capacity(e[2])) for e in self.edge_iterator()]})
I always think: are we even sure that the right
weight will be associated with the right edge? O_o
Wow. That's even scarier:
for e in self.edge_iterator(): f =next(igraph_flow)
Could you iterate on the igraph graph's edges instead?
If the order of edges changes in their data structure, we would output
gibberish, won't we? O_o
By the way, you may be interested by zip
and izip
:
sage: a=[1,2,3]
sage: b=[4,5,6]
sage: a=iter([1,2,3])
sage: b=iter([4,5,6])
sage: from itertools import izip
sage: for x,y in izip(a,b):
....: print x,y
1 4
2 5
3 6
I did not do any profiling so I may be wrong, but I expect that you are making
your life a bit too difficult, and the code a bit slower, by working with
those Sage edges. It seems that you "mix" Sage's edge objects with igraph flow
values in order to build a labelled graph while igraph is
integer-labelled. In particular, in the code which builds the flow graph of an
undirected graph, you relabel all the edges on the fly. As you use a
dictionary, this means 2m
dict queries, which can take some time (probably
negligible compared to everything else, of course). Instead, you could work to
build an integer Sage graph (which would simplify the relabelling) and
relabel it (using the relabel function) once that it is built. Relabelling a
graph is much faster than doing 2m
dict queries. Theoretically, it only
needs to query each value once, i.e. n
queries.
Sorry for being rather absent these days. The "main" excuse is that I am on
vacation (worse, on vacations with people around!), but the truth is that I
spend all the time I have alone trying to build in Sage all strongly regular
graphs promised by Brouwer's database
http://www.win.tue.nl/~aeb/graphs/srg/srgtab.html. Which requires a LOT of
digging work in old books ;-)
Good luuuck !
Nathann
Helloooooo!
Thank you very much for your review! The solutions/answers follow!
See you,
Michele
Replying to @nathanncohen:
Helloooooo Michele,
Thank you very much for those speedups
:-P
Here are the results of my review:
- They run a max flow algorithm and silently ignore the weight if they do not understand it? Crazy
O_o
Sometimes, Sage integer/floats are not compatible with igraph::
P.S.: I just noticed that we do exactly the same:
capacity=lambda x: x if x in RR else 1
. This is out of the scope of this ticket, but what would you think of getting rid of this 'default' and use 'x' at all times (which would raise an exception when the weights are obviously wrong)?
I agree: I have replaced that line with capacity=lambda x: x. However, this way, some tests were wrong, because they provided a graph with no edge label, implying that the capacity would be 1 for each edge. To solve this issue, I decided to set use_edge_labels
to False
by default (as it is already done in the min-cut algorithm).
"None"
? OrNone
?if method in ["FF", "igraph", "None"]:
Done!
- The code imports 'igraph' even when not necessary, i.e. when
method=None
. I tried to figure out how to rewrite it better, but many patterns lead me to write theimport/except
block twice, which is a bit unpleasant. Instead of importingigraph
in order to guess if it is available, what would you think of usingis_package_installed("python_igraph")
? This way, you do not have to defineigraph_available
at all, and you can use the output ofis_package_installed
to decide whether method should be set toigraph
. Then, in theelif (method == 'igraph'):
block, you can actually import igraph and raise the exception otherwise. What do you think?
I like it! I was also looking for a better solution, but I didn't know the routine is_package_installed
.
- Is there a reason why you import
floor
fromsage.functions.other
and not frommath
(i.e. from python)? Usually 'our' functions can handle symbolics and are much slower as a result.
Yes, there is: I didn't know the function floor :-P... Corrected!
- Instead of differentiating between 'if integer' and 'else' in the igraph code, wouldn't it be easier to replace the weight function (in all cases) by 'floor'?
Hmmm, could you explain a bit more what you mean? In my opininon, the user should decide whether (s)he wants integer-valued flow or real-valued flow. For instance:
sage: g = Graph([(0,1,0.5)])
sage: g.flow(0,1,use_edge_labels=True)
0.5
sage: g.flow(0,1,use_edge_labels=True,integer=True)
0.0
sage: g = Graph([(0,1,0.5),(0,2,1.5),(1,2,1.5)])
sage: g.flow(0,1,use_edge_labels=True)
2.0
sage: g.flow(0,1,use_edge_labels=True,integer=True)
1.0
- It worries me everytime I see a line like
g_igraph = self.igraph_graph(edge_attrs={'capacity':[float(capacity(e[2])) for e in self.edge_iterator()]})
I always think: are we even sure that the right weight will be associated with the right edge?O_o
In this case, yes: the edges are added in order, and they are assigned an integer, as "explained" by the igraph tutorial [1]. I tried to stick as much as possible with the examples in the tutorial, but if you have a better idea, please let me know! To me, it seems that this is the standard in igraph.
- Wow. That's even scarier:
for e in self.edge_iterator(): f =next(igraph_flow)
Could you iterate on the igraph graph's edges instead? If the order of edges changes in their data structure, we would output gibberish, won't we?O_o
By the way, you may be interested byzip
andizip
:sage: a=[1,2,3] sage: b=[4,5,6] sage: a=iter([1,2,3]) sage: b=iter([4,5,6]) sage: from itertools import izip sage: for x,y in izip(a,b): ....: print x,y 1 4 2 5 3 6
- I did not do any profiling so I may be wrong, but I expect that you are making your life a bit too difficult, and the code a bit slower, by working with those Sage edges. It seems that you "mix" Sage's edge objects with igraph flow values in order to build a labelled graph while igraph is integer-labelled. In particular, in the code which builds the flow graph of an undirected graph, you relabel all the edges on the fly. As you use a dictionary, this means
2m
dict queries, which can take some time (probably negligible compared to everything else, of course). Instead, you could work to build an integer Sage graph (which would simplify the relabelling) and relabel it (using the relabel function) once that it is built. Relabelling a graph is much faster than doing2m
dict queries. Theoretically, it only needs to query each value once, i.e.n
queries.
I have added an attribute 'name' to igraph vertices, and now I iterate on igraph edges, using the attribute when I create the new graph. Do you like it as it is?
[1] !https://python-igraph.readthedocs.org/en/latest/tutorial.html
Branch pushed to git repo; I updated commit sha1. New commits:
6282386 | Reviewer's suggestions |
Hello,
P.S.: I just noticed that we do exactly the same:
capacity=lambda x: x if x in RR else 1
. This is out of the scope of this ticket, but what would you think of getting rid of this 'default' and use 'x' at all times (which would raise an exception when the weights are obviously wrong)?I agree: I have replaced that line with capacity=lambda x: x. However, this way, some tests were wrong, because they provided a graph with no edge label, implying that the capacity would be 1 for each edge. To solve this issue, I decided to set
use_edge_labels
toFalse
by default (as it is already done in the min-cut algorithm).
No, please. This is out of the scope of this ticket. This should not be done here. There are many functions that handle labels this way and we should not only change the behaviour of one function (i.e. flow), plus if the default behaviour changes we have to check all functions that call flow, and perhaps think twice whether this should be done. Perhaps only use '1' if the label is None
? But please, let us not touch it in this ticket.
- Instead of differentiating between 'if integer' and 'else' in the igraph code, wouldn't it be easier to replace the weight function (in all cases) by 'floor'?
Hmmm, could you explain a bit more what you mean? In my opininon, the user should decide whether (s)he wants integer-valued flow or real-valued flow.
Sorry, my sentence was highly confusing. When I said 'in all cases', I meant "for all values of 'method' so that there is no need for this block of code
if integer:
g_igraph = self.igraph_graph(vertex_attrs={'name':self.vertices()}, edge_attrs={'capacity':[int(floor(capacity(e[2]))) for e in self.edge_iterator()]})
else:
g_igraph = self.igraph_graph(vertex_attrs={'name':self.vertices()}, edge_attrs={'capacity':[float(capacity(e[2])) for e in self.edge_iterator()]})
In this case, yes: the edges are added in order, and they are assigned an integer, as "explained" by the igraph tutorial [1].
Crazy. I don't even want to know how it behaves when you add/remove edges, and the time complexity of it.
I have added an attribute 'name' to igraph vertices, and now I iterate on igraph edges, using the attribute when I create the new graph. Do you like it as it is?
It is not exactly what I had in mind, but do not worry about that: went the other points will be fixed I will add a commit, and we will either take it if you like it or keep with your code if you do not.
Nathann
Hello!
P.S.: I just noticed that we do exactly the same:
capacity=lambda x: x if x in RR else 1
. This is out of the scope of this ticket, but what would you think of getting rid of this 'default' and use 'x' at all times (which would raise an exception when the weights are obviously wrong)?I agree: I have replaced that line with capacity=lambda x: x. However, this way, some tests were wrong, because they provided a graph with no edge label, implying that the capacity would be 1 for each edge. To solve this issue, I decided to set
use_edge_labels
toFalse
by default (as it is already done in the min-cut algorithm).No, please. This is out of the scope of this ticket. This should not be done here. There are many functions that handle labels this way and we should not only change the behaviour of one function (i.e. flow), plus if the default behaviour changes we have to check all functions that call flow, and perhaps think twice whether this should be done. Perhaps only use '1' if the label is
None
? But please, let us not touch it in this ticket.
Oh, sorry, actually you wrote that it was out of the scope of this ticket. I reverted everything.
- Instead of differentiating between 'if integer' and 'else' in the igraph code, wouldn't it be easier to replace the weight function (in all cases) by 'floor'?
Hmmm, could you explain a bit more what you mean? In my opininon, the user should decide whether (s)he wants integer-valued flow or real-valued flow.
Sorry, my sentence was highly confusing. When I said 'in all cases', I meant "for all values of 'method' so that there is no need for this block of code
if integer: g_igraph = self.igraph_graph(vertex_attrs={'name':self.vertices()}, edge_attrs={'capacity':[int(floor(capacity(e[2]))) for e in self.edge_iterator()]}) else: g_igraph = self.igraph_graph(vertex_attrs={'name':self.vertices()}, edge_attrs={'capacity':[float(capacity(e[2])) for e in self.edge_iterator()]})
Done!
In this case, yes: the edges are added in order, and they are assigned an integer, as "explained" by the igraph tutorial [1].
Crazy. I don't even want to know how it behaves when you add/remove edges, and the time complexity of it.
I have added an attribute 'name' to igraph vertices, and now I iterate on igraph edges, using the attribute when I create the new graph. Do you like it as it is?
It is not exactly what I had in mind, but do not worry about that: went the other points will be fixed I will add a commit, and we will either take it if you like it or keep with your code if you do not.
Great! I will wait for your code, then!
Branch pushed to git repo; I updated commit sha1. New commits:
d363158 | Reviewer's commit |
Oh, sorry, actually you wrote that it was out of the scope of this ticket. I reverted everything.
Thanks. I know that it is unpleasant to do, sorry :-/
Done!
Thanks.
Great! I will wait for your code, then!
I just pushed a commit at public/19003. It does not do much, but I hope that it simplifies the code "a bit". Don't hesitate to tell me if you do not like it. In theory it should also "do less computations", though it probably cannot be measured.
Nathann
Branch pushed to git repo; I updated commit sha1. New commits:
68053d3 | trac #19003: build an integer-labelled graph, then relabel it |
Branch pushed to git repo; I updated commit sha1. New commits:
0fc03ce | Broken link in documentation |
Thank you Nathann for your work. I have included it, and I have corrected a small mistake in the documentation. All long tests on graphs are successful, documentation is ok, so I think this patch is good to go!
Michele
Arg. Thanks for the fix ^^;
Reviewer: Nathann Cohen
Hellooooooo,
Thank you Nathann for your work. I have included it, and I have corrected a small mistake in the documentation.
Yepyepyep, thank you for that.
All long tests on graphs are successful, documentation is ok, so I think this patch is good to go!
Same here. Good to go :-)
Nathann
Changed branch from u/borassi/igraph_max_flow to 0fc03ce
Include the igraph algorithm for max flow / min cut.
Depends on #18929
CC: @nathanncohen @dcoudert
Component: graph theory
Keywords: igraph, maximum flow, minimum cut
Author: Michele Borassi
Branch/Commit:
0fc03ce
Reviewer: Nathann Cohen
Issue created by migration from https://trac.sagemath.org/ticket/19003