Closed a8f3419b-6383-42be-b93c-ecaa87929754 closed 9 years ago
Author: Michele Borassi
Changed keywords from none to Closeness centrality
Description changed:
---
+++
@@ -1 +1,9 @@
+Now, the closeness centrality algorithm simply calls the corresponding NetworkX routine (only for undirected graphs).
+This ticket will move this routine to file generic_graph (closeness centrality is defined also in this case ![1]), and it will modify input/output variables as in routine shortest_path. Then, it will call our shortest path new routines instead of NetworkX routines.
+
+This ticket is a first step towards the inclusion of the algorithm in ![2].
+
+![1] !http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6816651&tag=1
+
+![2] !http://arxiv.org/pdf/1507.01490v1.pdf
Dependencies: #18931, #18876, #18910
Hello!
This is the first version of the refactoring of closeness_centrality. Let me know if you like it!
Commit: 5c85793
I don't really understand what you are doing in this commit https://github.com/sagemath/sagetrac-mirror/commits/5c85793ccb8234ac668173620523f45dbae4b985
In the ticket description you say that you will move everything in generic_graph
, but it might be more readable/clean to create a new file for centrality measures, no?
Also, you moved only references to the methods, right? (or the modifications are hidden in the merge with #18931).
+ - :meth:`~sage.graphs.graph.Graph.centrality_degree`
+ - :meth:`~centrality_betweenness`
David.
Branch pushed to git repo; I updated commit sha1. New commits:
5853eb7 | Merged with beta2 |
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
a19addb | Rewrote closeness centrality according to new standards |
Hello!
Thank you for the comment, I will try to address the main issues raised.
I have put everything in a single commit, which simply removes the closeness centrality routine from graph.py
and rewrites it in generic_graph.py
. Now the ticket should be much easier to review. Since closeness centrality highly depends on shortest paths, the new routine follows all conventions in ticket #18931.
About the creation of a new file: we already have file centrality.pyx
, containing fast algorithms for betweenness centrality. However, these routines are called from a centrality_betweenness
routine in file generic_graph.py
, which should be used by the user. In this ticket, I just copied this policy.
In my opinion, if we want to create a new file, we should at least move also betweenness centrality and degree centrality: in my opinion, this goes beyond the scope of this ticket. Do you agree?
Best,
Michele
Replying to @dcoudert:
I don't really understand what you are doing in this commit https://github.com/sagemath/sagetrac-mirror/commits/5c85793ccb8234ac668173620523f45dbae4b985
In the ticket description you say that you will move everything in
generic_graph
, but it might be more readable/clean to create a new file for centrality measures, no? Also, you moved only references to the methods, right? (or the modifications are hidden in the merge with #18931).+ - :meth:`~sage.graphs.graph.Graph.centrality_degree` + - :meth:`~centrality_betweenness`
David.
Something is wrong here
+ try:
+ v_iter = iter(vert)
+ except TypeError:
+ v_iter = [v_iter]
+ v_iter = iter(vert)
I assume you want first to have `v_vert = iter(list(vert))
The default of calling distances = self.shortest_path_all_pairs(by_weight,algorithm, weight_function, check_weight)[0]
is that you get a dictionary of dictionary, that is a huge data structure for large graphs (lost of memory and long to create). At least the second part of this method should be cythonized to get the array of distances, as is done for centrality_betweenness
.
David.
Something is wrong here
+ try: + v_iter = iter(vert) + except TypeError: + v_iter = [v_iter] + v_iter = iter(vert)
+1.
In some cases a vertex can be a pair, and thus is iterable. This would react badly with this code.
sage: g=graphs.KneserGraph(5,2)
sage: v=g.vertices()[0]
sage: g.centrality_closeness(v)
<crash>
Nathann
Right. My previous proposal is not a good solution.
Something like that might also be unsafe if some labels in vert
are not in G
, but this can be checked when trying to associate computed values and vertices.
v_vert = iter([vert]) if vert in G else iter(vert)
Branch pushed to git repo; I updated commit sha1. New commits:
7dfa2d2 | Corrected v_iter, cythonized Johnson |
Hello!
I hope you spent a great Assumption day!
With this new commit, (hopefully) I solved the problems with v_vert
and I cythonized the Johnson algorithm. In my opinion, it is not worth cythonizing Floyd-Warshall, because it is much slower than Johnson, especially on sparse graphs.
Thank you very much!
Michele
Hello,
I'm trying to understand the behavior of the method on digraphs. You wrote that the centrality of vertices of degree 0 is not reported. Does it means that for digraphs you don't report the centrality of vertices with out-degree 0? Also, what's the expected behavior on strongly connected digraphs?
David.
Branch pushed to git repo; I updated commit sha1. New commits:
7a4059a | Improved documentation |
Branch pushed to git repo; I updated commit sha1. New commits:
5c67f79 | Small mistake |
Hello!
I tried to improve the documentation, answering your questions.
Michele
Replying to @dcoudert:
You wrote that the centrality of vertices of degree 0 is not reported. Does it means that for digraphs you don't report the centrality of vertices with out-degree 0?
Yes, I changed in the documentation "degree" with "(out)degree"
Also, what's the expected behavior on strongly connected digraphs?
The closeness centrality of a vertex v is 1/farn(v), where farn(v) is the average distance between v and a random vertex w different from v (as in the case of connected graphs). I added an example about this.
Hello,
This is much better now. The method works well, passes all tests and the doc build properly.
One possible modification is to move the method to file centrality.pyx
which is made for gathering centrality methods. Let me know.
Best, David.
Branch pushed to git repo; I updated commit sha1. New commits:
48514ad | Moved centrality_closeness to centrality.pyx |
Done!
However, I had to do some "magic" to avoid duplicate references... Hope it is ok!
Branch pushed to git repo; I updated commit sha1. New commits:
cf9cf87 | Trailing whitespaces |
Yep, I add to make doc-clean && make doc
to get it compiled.
I just saw in the doc of centrality_degree
in graph.py
, in section See also, that the link to centrality_closeness
is broken. In fact, the code is :meth:`centrality_closeness`
. Could you check that?
In fact we should also move this methode to centrality.pyx
, but in another ticket.
After that I think it will be ok.
David.
Branch pushed to git repo; I updated commit sha1. New commits:
90638aa | Corrected link |
Done!
Hello,
do you understand why the patchbot reports some errors? I have no error when testing this patch on my computer (install, long tests, trials, docbuild html et pdf,...).
Best, David.
What is the patchbot? How can I find these errors?
on the top of this page, you can see a blue disk (or red or green, etc.) with a link to http://patchbot.sagemath.org/ticket/19007/ It reports on automatic tests of your patch. Tests are performed on all modules.
Branch pushed to git repo; I updated commit sha1. New commits:
73dd6d5 | Solve patchbot - first attempt |
Branch pushed to git repo; I updated commit sha1. New commits:
f1c7fd5 | Solve patchbot - second attempt |
Note that the patchbot needs a lot of time: it is not tested instantaneously and it takes some hours of computations...
Ok, I will wait until it answers... In my opinion, the problem is that we need to import centrality.pyx in generic_graph at startup, and this slows down the startup time. I tried to change the imports, hoping the total time will decrease. If any of the two attempts work, we are fine, otherwise I probably have to mail sage-devel...
Well, generic graph is becoming really really big with lots of tests.
For instance, do we really need the tests with for i in range(100):
. Can't we have only one test we randomly chosen number of edges? That should be enough.
David.
I'm afraid that all the attempts I have made were unsuccessful. The point is that importing module centrality.pyx
at startup is expensive, and I need it to import it to add method centrality_closeness
to GenericGraph
. Hence, the startup time becomes 2 seconds (too much).
In my opinion, this might be a reason why it is better to leave method centrality_closeness
in generic_graph
, and put only Cython-related methods inside centrality.pyx
, in order to use lazy imports. May I move method centrality_closeness back?
If we still want to move centrality_closeness
and related methods to centrality.pyx
, I think we should open another ticket, in order to find a way to use lazy imports (I have no idea how, though).
ok, move it back to generic graph. David.
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
1551c6f | Undo commit moving closeness_centrality to centrality.pyx. Corrected links. |
Reviewer: David Coudert
Now, the closeness centrality algorithm simply calls the corresponding NetworkX routine (only for undirected graphs).
This ticket will move this routine to file generic_graph (closeness centrality is defined also in this case ![1]), and it will modify input/output variables as in routine shortest_path. Then, it will call our shortest path new routines instead of NetworkX routines.
This ticket is a first step towards the inclusion of the algorithm in ![2].
![1] !http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6816651&tag=1
![2] !http://arxiv.org/pdf/1507.01490v1.pdf
Depends on #18931 Depends on #18876 Depends on #18910
CC: @nathanncohen @dcoudert
Component: graph theory
Keywords: Closeness centrality
Author: Michele Borassi
Branch/Commit:
1551c6f
Reviewer: David Coudert
Issue created by migration from https://trac.sagemath.org/ticket/19007