Closed pelegm closed 4 years ago
After make doc-clean
you should be using make doc
.
Works, thanks (took 2.5 hours, though...).
Some minor suggestions:
S(X)
with Vol(X)
(it's a more common notation)Other than that, everything looks great. You may switch to positive review after you apply the suggestions, or if you decide not to apply them.
Branch pushed to git repo; I updated commit sha1. New commits:
10c362e | 21942: better documentation |
Done!
Reviewer name?
Sorry
Reviewer: Peleg Michaeli
Hi, I apologize for yet another late response, but just wanted to ask: for graphs without edges (and at least 2 vertices) you set the Cheeger constant to be 1.
It is correct that the constant is not well defined for graphs without edges, but still I would maybe set it to be 0, to be consistent with the general idea that larger Cheeger constant means higher connectivity, and the specific idea that non-connected graphs have Cheeger constant 0.
In fact, if you agree with this then we may add a quick heuristic at the beginning of the method, which checks whether the graph is connected, and returns 0 otherwise.
I also agree that you should test that the input graph is connected.
Some remarks:
Why not cythonizing method vertex_isoperimetric_number
?
in file index.rst
, I think that the first initial of a ref needs a \
, like in .. [ADKF1970] \V. Arlazarov
At the top of file isoperimetric_inequalities.pyx
, you should add some words on the kind of methods that are in this file.
in graph.py
, method vertex_isoperimetric_number
: and the sum is taken over the subsets
-> and the minimum is taken over the subsets
in method cheeger_constant
, please add definition of \partial X
similarly, in edge_isoperimetric_number
, please define \partial S
from sage.rings import infinity
-> from sage.rings.infinity import Infinity
or from sage.rings.infinity import infinity
, I don't know which is best.
For vertex_isoperimetric_number
, it is not realistic to try to compute this value for large graphs. For up to 31 vertices, you can do in Cython:
from sage.graphs.graph_decompositions.fast_digraph cimport FastDigraph, compute_out_neighborhood_cardinality, popcount32
from sage.rings.integer cimport Integer
def vertex_isoperimetric_number_fast(self):
"""
"""
cdef FastDigraph FG = FastDigraph(self)
cdef int card, boundary, S
cdef int mc = FG.n // 2
cdef int p = FG.n
cdef int q = 0
for S in range(1, 2**FG.n):
card = popcount32(S)
if card <= mc:
boundary = compute_out_neighborhood_cardinality(FG, S)
if boundary * q < p * card:
p = boundary
q = card
return Integer(p) / Integer(q)
it iterates over all the non-empty subsets of vertices. popcount32(S)
counts the number of 1
's and so of vertices in the current subset S
, and compute_out_neighborhood_cardinality
computes the number of vertices in N(S)\S
.
sage: %time vertex_isoperimetric_number(G) # your method
CPU times: user 4.36 ms, sys: 1.76 ms, total: 6.13 ms
Wall time: 9.11 ms
3/2
sage: G = graphs.CompleteGraph(5)
sage: %time vertex_isoperimetric_number_fast(G)
CPU times: user 83 µs, sys: 5 µs, total: 88 µs
Wall time: 88.9 µs
3/2
sage: G = graphs.CycleGraph(10)
sage: %time vertex_isoperimetric_number_fast(G)
CPU times: user 114 µs, sys: 17 µs, total: 131 µs
Wall time: 117 µs
2/5
sage: G = graphs.CycleGraph(20)
sage: %time vertex_isoperimetric_number_fast(G)
CPU times: user 34.7 ms, sys: 90 µs, total: 34.8 ms
Wall time: 34.8 ms
1/5
sage: G = graphs.CycleGraph(30)
sage: %time vertex_isoperimetric_number_fast(G)
CPU times: user 33.3 s, sys: 121 ms, total: 33.4 s
Wall time: 33.9 s
2/15
We can certainly do something similar for the other methods. We just need a smart way to compute |\partial X| / |Vol(X)|
and |E(X)|
(which is not properly defined in the method) and |\partial S| / |S|
.
It is of course possible to add a parameter k
to compute the value only over all subsets of order k
(see https://en.wikipedia.org/wiki/Isoperimetric_inequality)
I am not 100% convinced by FastDigraph
:
why FastDiagraph
is not merged with base/static_dense_graph
? Both uses a bitset matrix and binary matrices are not limited to 32 vertices. Of course, if you have one limb you can perform some optimization but it would be nicer to have the same API (and use the same directory).
Using int
might not be the best option. The library stdint proposes a uint_fast32_t
that is exactly what you want here.
Many processors offer a popcount
that popcount32
is just ignoring.
With dense representation and bit operations one can indeed remove the inner loops from the already cythonized cheeger_constant
and edge_isoperimetric_number
. On the other hand, these methods
rebased on 8.3.beta4 + extra commit taking care of some of the remarks of David.
I'm not claiming that FastDigraph
is perfect. It was useful for implementing vertex separation and cutwidth. It can certainly be cleaned/improved/etc. (e.g., using __builtin_popcount
and uint_fast32_t
). However, for such algorithms, the extra cost of using bitsets is not justified since we can hardly go above 32 nodes: huge computation time and memory usage.
Here, if you think it is possible to compute the vertex_isoperimetric_number
on larger graphs, then you can use binary_matrix
. Another idea could be to use short_digraph
and the same structure as for other methods, and maintain the vertices in the boundary (e.g., use an array to store the number of edges from S
to each vertex outside S
, and increase/decrease vol
each time a cell becomes non-zero/zero).
I did some tests using static_dense_graph
, and so binary_matrix
, for cheeger_constant
and edge_isoperimetric_number
. My code is slightly faster for dense graphs but slower for sparse graphs. Not so interesting here.
There are a small improvements you can do:
vol
in cheeger_constant
: use an array to store the degree of vertices, and then, instead of doing vol += 1
inside the for loop, do vol += degree[u]
outside.E(S)
. E(S)
?bint
? You don't use bitset operations here, and memory usage is clearly not an issue.Branch pushed to git repo; I updated commit sha1. New commits:
8bcb497 | 21942: typo in doc |
I agree with your suggestions. Though, a much more needed improvement would be to be able to iterate through subsets S
of vertices with S
and V \ S
connected.
Branch pushed to git repo; I updated commit sha1. New commits:
83f5c80 | 21942: small optimization |
An algorithm for iterating over connected subgraphs is described here https://stackoverflow.com/questions/15658245/efficiently-find-all-connected-induced-subgraphs. May be we can adapt it...
At least, it is possible to avoid the recursion and so to have a connected subgraph iterator. Not clear how to get an efficient implementation. I'll think about it.
I have implemented an iterator over connected subgraphs is Cython. It is reasonably fast. I should open a ticket for it, but I don't know yet in which file to add it. Any suggestion ?
It can certainly help improving this ticket, although current implementations are quite good.
Replying to @dcoudert:
I have implemented an iterator over connected subgraphs in Cython. It is reasonably fast. I should open a ticket for it, but I don't know yet in which file to add it. Any suggestion ?
connected_subgraphs_iterator.pyx
? Please, cc me.
It can certainly help improving this ticket, although current implementations are quite good.
To improve even more the computation of Cheeger constant, there should be an iterator over connected subgraph so that the complement is also connected! I can comment on this when your ticket is open.
Are you ok setting this ticket in positive review as it is? It will serve as a base for further improvements.
Note that I can easily factor cheeger_constant
and edge_isoperimetric_number
(exact same computation except the small difference for vol
). It might be worth not duplicating code.
Replying to @videlec:
Note that I can easily factor
cheeger_constant
andedge_isoperimetric_number
(exact same computation except the small difference forvol
). It might be worth not duplicating code.
What let me hesitate is that we can cut branches in the backtracking and when to cut will differ. What do you think?
Do you plan to add cuts in a near future ? and is the current structure of the code appropriate to add cuts ? if not, you can factor the code.
Replying to @dcoudert:
Do you plan to add cuts in a near future ? and is the current structure of the code appropriate to add cuts ? if not, you can factor the code.
As my answers are "no" and "no" let me factor and add a note.
I spoke too fast: there are already two different cuts! By the definition of Cheeger constant, you cut when Vol(s) > m
while for the isoperimetric number you cut 2*|S| > n
(where m
and n
are as usual number of edges and number of vertices).
Right.
Don't forget to add sig_on
sig_off
so that we can kill computation if needed.
Branch pushed to git repo; I updated commit sha1. New commits:
50c1863 | 21942: code interruptible with sig_on/sig_off |
I have a much better implementation for vertex_isoperimetric_number
using the ideas from https://stackoverflow.com/questions/15658245/efficiently-find-all-connected-induced-subgraphs. You can use it on a multigraph with loops. No limit on the number of vertices, but of course the time complexity remains.
Roughly, it uses a non-recursive version of the algorithm to generate all connected subgraphs of order at most n/2
. I'm therefore using a binary matrix as a stack of bitsets.
from sage.rings.rational_field import QQ
from sage.rings.infinity import Infinity
from cysignals.signals cimport sig_on, sig_off
include "sage/data_structures/binary_matrix.pxi"
from sage.graphs.base.static_dense_graph cimport dense_graph_init
def vertex_isoperimetric_number_iter(G):
"""
"""
if G.is_directed():
raise ValueError("vertex-isoperimetric number is only defined on non-oriented graph")
if G.order() == 0:
raise ValueError("vertex-isoperimetric number not defined for the empty graph")
elif G.order() == 1:
return Infinity
elif not G.is_connected():
return QQ((0,1))
# We convert the graph to a static dense graph
cdef binary_matrix_t DG
dense_graph_init(DG, G)
cdef int n = G.order()
cdef int k = n / 2 # maximum size of a subset
sig_on()
# We use a stack of bitsets. For that, we need 3 bitsets per level with at
# most n/2 + 1 levels, so 3 * (n / 2) + 3 bitsets. We also need 1 bitset that
# we create at the same time, so so overall 3 * (n / 2) + 4 bitsets
cdef binary_matrix_t stack
binary_matrix_init(stack, 3 * (n / 2) + 4, n)
cdef bitset_t candidates = stack.rows[3 * (n / 2) + 3]
cdef bitset_t left # vertices not yet explored
cdef bitset_t current # vertices in the current subset
cdef bitset_t boundary # union of neighbors of vertices in current subset
cdef int l = 0
cdef int p = n
cdef int q = 0
cdef int c, b, v
# We initialize the first level
for v in range(3):
bitset_clear(stack.rows[v])
bitset_complement(stack.rows[0], stack.rows[0])
while l >= 0:
# We take the values at the top of the stack
left = stack.rows[l]
current = stack.rows[l + 1]
boundary = stack.rows[l + 2]
if bitset_isempty(current):
bitset_copy(candidates, left)
else:
bitset_and(candidates, left, boundary)
if bitset_isempty(candidates):
# We decrease l to pop the stack
l -= 3
# If the current set if non empty, we update the lower bound
c = bitset_len(current)
if c:
bitset_difference(boundary, boundary, current)
b = bitset_len(boundary)
if b * q < p * c:
p = b
q = c
else:
# Choose a candidate vertex v
v = bitset_first(candidates)
bitset_discard(left, v)
# Since we have not modified l, the bitsets for iterating without v
# in the subset current are now at the top of the stack, with
# correct values
if bitset_len(current) < k:
# We continue with v in the subset current
l += 3
bitset_copy(stack.rows[l], left)
bitset_copy(stack.rows[l + 1], current)
bitset_add(stack.rows[l + 1], v)
bitset_union(stack.rows[l + 2], boundary, DG.rows[v])
binary_matrix_free(stack)
binary_matrix_free(DG)
sig_off()
return QQ((p, q))
update milestone 8.3 -> 8.4
Does not apply anymore.
Work Issues: merge confict
Changed work issues from merge confict to merge conflict
Note that we now have a connected subgraph iterator #25553.
Branch pushed to git repo; I updated commit sha1. This was a forced push. New commits:
487bee9 | 21942: Cheeger constants |
(just a rebase, need to adapt the code to use the connected subgraph iterator)
@dcoudert: I would be happy to use the subgraph iterator from #25553 however
One possible way to proceed is to make a proper Cython class for the subgraph iterator so that we can grab information (at C level) at each step of the iteration.
What do you think?
Replying to @videlec:
@dcoudert: I would be happy to use the subgraph iterator from #25553 however
- it is only implemented for dense graphs
It's reasonably fast this way :P But of course we could implement a sparse graph counter part.
- the iterator carries many information that would be extremely useful for implementing isoperimetric constants (such as boundary)
- passing around lists of vertices (the output of the iterator) is a big waste One possible way to proceed is to make a proper Cython class for the subgraph iterator so that we can grab information (at C level) at each step of the iteration.
What do you think?
We can certainly add such a class if it's needed. However, you have to be very careful if you want to access extra data. May be it's easier to copy/paste the code of the iterator, so that you can use the information you want safely at the time you need it. The advantage of the class is to silently switch between sparse and dense representations.
Tickets still needing working or clarification should be moved to the next release milestone at the soonest (please feel free to revert if you think the ticket is close to being resolved).
Moving to 9.0. We can also work on a proper cython class for subgraph iterator if useful.
Ticket retargeted after milestone closed
Batch modifying tickets that will likely not be ready for 9.1, based on a review of the ticket title, branch/review status, and last modification date.
This ticket adds three methods to graphs to compute three variations of the Cheeger constant:
cheeger_constant
)edge_isoperimetric_number
)vertex_isoperimetric_number
)Note that the implemented Cheeger constant is also the conductance of the simple random walk on the graph.
CC: @videlec @JeremiasE @slel
Component: graph theory
Keywords: cheeger, MathExp2018
Author: Vincent Delecroix, David Coudert
Branch:
8389196
Reviewer: Peleg Michaeli, Frédéric Chapoton
Issue created by migration from https://trac.sagemath.org/ticket/21942