Closed dcoudert closed 9 years ago
Author: David Coudert
Description changed:
---
+++
@@ -1 +1,3 @@
-
+This patch implements:
+- 2sweep and 4sweep lower bounds on the diameter of a graph
+- iFUB, an exact and fast algorithm for computing the diameter of a graph
Branch: u/dcoudert/help
Description changed:
---
+++
@@ -1,3 +1,53 @@
This patch implements:
- 2sweep and 4sweep lower bounds on the diameter of a graph
- iFUB, an exact and fast algorithm for computing the diameter of a graph
+
+```
+sage: G = graphs.RandomBarabasiAlbert(1000,2)
+sage: %time G.diameter('standard')
+CPU times: user 44 ms, sys: 0 ns, total: 44 ms
+Wall time: 42.7 ms
+7
+sage: %time G.diameter('iFUB')
+CPU times: user 24 ms, sys: 1 ms, total: 25 ms
+Wall time: 23.6 ms
+7
+
+sage: G = graphs.RandomBarabasiAlbert(10000,2)
+sage: %time G.diameter('standard')
+CPU times: user 3.49 s, sys: 1 ms, total: 3.49 s
+Wall time: 3.49 s
+9
+sage: %time G.diameter('iFUB')
+CPU times: user 336 ms, sys: 2 ms, total: 338 ms
+Wall time: 333 ms
+9
+```
+
+The running time of the iFUB method depends on many parameters but is in general faster than previous methods.
+
+```
+sage: G = graphs.Grid2dGraph(50,50)
+sage: %time G.diameter('standard')
+CPU times: user 138 ms, sys: 2 ms, total: 140 ms
+Wall time: 139 ms
+98
+sage: %time G.diameter('iFUB')
+CPU times: user 80 ms, sys: 2 ms, total: 82 ms
+Wall time: 80.5 ms
+98
+sage: V = G.vertices()
+sage: shuffle(V)
+sage: G.relabel(inplace=True, perm=V)
+sage: %time G.diameter('iFUB')
+CPU times: user 35 ms, sys: 1 ms, total: 36 ms
+Wall time: 33.4 ms
+98
+sage: shuffle(V)
+sage: G.relabel(inplace=True, perm=V)
+sage: %time G.diameter('iFUB')
+CPU times: user 35 ms, sys: 1 ms, total: 36 ms
+Wall time: 34.2 ms
+98
+```
+
Yo !!
A second pass on this code. With this I guess that the 'technical review' is done, but I haven't checked the correction yet, i.e. looked at the papers and what they prove. I don't really like to make a non-obvious algorithm the default one when we have a much simpler way to compute the same value, but well, if it is faster ...
seen
. "It must be initialized, but there is no need to initialize it".+ - ``seen`` -- bitset of size ``n`` that must be initialized before calling
+ this method. owever, there is no nead to initialize it.
this method check is all
--> checkS iF all
this helps saving
--> save
when we already known
--> know
About your check for connected graphs: I find your if test_is_connected and len(seen)<n: return big_number
a bit weird given that you just ran a BFS which is supposed to return the eccentricity of a vertex (which, in particular, tests if the graph is connected}}}. And actually, while you say that the simple_BFS
function returns the eccentricity of source
, it only does so when the graph is connected. If it is not, it returns the (finite) eccentricity of source
in its connected component. Is there any reason why you should not do that simple_BFS
?
-return distances[waiting_list[waiting_end]]
+return distances[waiting_list[waiting_end]] if waiting_end == waiting_list+n-1 else UINT32_MAX
It would make the simple_BFS
function more correct, and it also means that you do not need to define this is_connected
parameter as ... checking this is totally free !
This method search for
-> searches
diameter_lower_bound_4sweep_select_dichotomy
: really the dichotomy should be done through bsearch (http://www.cplusplus.com/reference/cstdlib/bsearch/) but I don't exactly know how to provide distances
as an argument. Too bad :-/
By the way, do you really need the if i==j-1
block ? Or would it be sufficient to just replace i=k
with i=k+1
?
In lower_bound_4sweep
: you do not need a improved
variable. Use a break
with a while True
. And you can even have a while tmp>LB
and no break
if you compute tmp
at the end of the loop :-P
Is it only me or is the 4sweep
function doing much more that 4 sweeps ?
diameter_iFUB
frees memory that it did not allocate itself. This feels wrong O_o
I do not think that the comments about the data structure in diameter()
are necessary. All that this function does is call others.
There are useless variables in this function
Nathann
Branch pushed to git repo; I updated commit sha1. New commits:
6eb88ea | multiple corrections and simplifications |
Replying to @nathanncohen:
Yo !!
A second pass on this code. With this I guess that the 'technical review' is done, but I haven't checked the correction yet, i.e. looked at the papers and what they prove. I don't really like to make a non-obvious algorithm the default one when we have a much simpler way to compute the same value, but well, if it is faster ...
- About
seen
. "It must be initialized, but there is no need to initialize it".+ - ``seen`` -- bitset of size ``n`` that must be initialized before calling + this method. owever, there is no nead to initialize it.
I have revised the text as follows:
+ - ``seen`` -- bitset of size ``n`` that must be initialized before calling
this method (i.e., bitset_init(seen, n)). However, there is no need to
clear it.
this method check is all
--> checkS iF all
this helps saving
--> save
when we already known
--> knowAbout your check for connected graphs: I find your
if test_is_connected and len(seen)<n: return big_number
a bit weird given that you just ran a BFS which is supposed to return the eccentricity of a vertex (which, in particular, tests if the graph is connected}}}. And actually, while you say that thesimple_BFS
function returns the eccentricity ofsource
, it only does so when the graph is connected. If it is not, it returns the (finite) eccentricity ofsource
in its connected component. Is there any reason why you should not do thatsimple_BFS
?-return distances[waiting_list[waiting_end]] +return distances[waiting_list[waiting_end]] if waiting_end == waiting_list+n-1 else UINT32_MAX
It would make the
simple_BFS
function more correct, and it also means that you do not need to define thisis_connected
parameter as ... checking this is totally free !
You are perfectly right! I have changed the code. I have removed the useless parameter and now return the eccentricity of the source in the graph:
return distances[waiting_list[waiting_end]] if waiting_end==n-1 else UINT32_MAX
This method search for
-> searches
ok
diameter_lower_bound_4sweep_select_dichotomy
: really the dichotomy should be done through bsearch (http://www.cplusplus.com/reference/cstdlib/bsearch/) but I don't exactly know how to providedistances
as an argument. Too bad:-/
By the way, do you really need the if
i==j-1
block ? Or would it be sufficient to just replacei=k
withi=k+1
?
- In
lower_bound_4sweep
: you do not need aimproved
variable. Use abreak
with awhile True
. And you can even have awhile tmp>LB
and nobreak
if you computetmp
at the end of the loop:-P
I agree that calling bsearch
would be better, but I don't know how to do it without introducing extra data structure (e.g., array of pairs (vertex, distance)). I have simplified the code and removed the diameter_lower_bound_4sweep_select_dichotomy
method.
- Is it only me or is the
4sweep
function doing much more that 4 sweeps ?
You are right. I don't know why they call it 4sweep
instead of multisweep
. I can change the name if you think it is more natural.
diameter_iFUB
frees memory that it did not allocate itself. This feels wrongO_o
I have removed these instructions.
I do not think that the comments about the data structure in
diameter()
are necessary. All that this function does is call others.There are useless variables in this function
I have shortened the comments and remove useless variables. It should be cleaner now.
Nathann
Thanks a lot for all the comments.
New commits:
6eb88ea | multiple corrections and simplifications |
You say in the documentation of the 4sweep method that the vertex w
should be at the same distance from u
and v
, but your dichotomy search does not do that. You only check that the vertex you take is at distance LB/2 from u
, but you do not know anything about v
. For all you know w
may be a leaf.
Nathann
Branch pushed to git repo; I updated commit sha1. New commits:
7b6bfd2 | #17135: use correct method for choosing vertex m in 4sweep |
You are perfectly right.
I found no other way than storing the predecessors and then following the path from a destination to the source to find that vertex. Let me know if you have a smarter approach.
Thanks.
Yo !
I found no other way than storing the predecessors and then following the path from a destination to the source to find that vertex. Let me know if you have a smarter approach.
No I don't see anything better :-/
Too bad for the fast search.
Nathann
Since most of the computation time is in iFUB, this is not so bad.
Branch pushed to git repo; I updated commit sha1. New commits:
299c323 | rename 4sweep into multi_sweep |
Hello again !
Okay, I finally understood how this thing works and it is quite simple and clear. Cool !
Aaaaaand so I added a paragraph of doc that explains how the algorithm works.
I also replace a couple of 4sweeps->multisweep, and remove the references from GenericGraph.diameter
as Sphinx complained of having them twice as they are in distances_all_pairs
too.
Now, while I updated the doc of distances_all_pairs.diameter
I did not update the doc of GenericGraph.diameter
because I have a question: why on earth do we define GenericGraph.diameter
instead of just importing into GenericGraph
the function from distances_all_pairs
? All the documentation is mostly a copy/paste !?
Also in multi-sweep/diameter_iFUB: the documentation says that the int *
store the result of a BFS from source
, while actually not much can be claimed about their content when the function returns. Just say that it is memory space used by the function while it runs unless you can say something more ?..
I added a commit with my changes at public/17135, tell me what you think !
Nathann
Hello,
thanks for the improvements. It is much better now.
Following your comments, I'm now importing the new method as GenericGraph.diameter
.
I have also removed some int *
parameters from iFUB and multi-sweep. It avoids confusion. Also, since we only need to allocate tables (no need to initialize them), the extra cost of allocating some tables in 2 different methods is negligible.
David.
PS: git is powerful but not intuitive/easy for non experts.
Hello !
Following your comments, I'm now importing the new method as
GenericGraph.diameter
. I have also removed someint *
parameters from iFUB and multi-sweep. It avoids confusion. Also, since we only need to allocate tables (no need to initialize them), the extra cost of allocating some tables in 2 different methods is negligible.
Well.
1) Looks ready to go 2) Good job 3) Thanks for this patch !
PS: git is powerful but not intuitive/easy for non experts.
Yep. Contributing to Sage is really hard at first :-/
Nathann
Reviewer: Nathann Cohen
Thank you Nathann.
Changed branch from u/dcoudert/help to 69cb552
This patch implements:
The running time of the iFUB method depends on many parameters but is in general faster than previous methods.
CC: @nathanncohen
Component: graph theory
Author: David Coudert
Branch/Commit:
69cb552
Reviewer: Nathann Cohen
Issue created by migration from https://trac.sagemath.org/ticket/17135