Closed tornaria closed 10 months ago
@antonio-rojas @kiwifb @orlitzky @dcoudert any ideas how to handle this?
I.e. how to support networkx 3.2 and 3.1 simultaneously.
A solution could be to avoid printing the cycles, but it is interesting for the users to see what a solution looks like. At least, we could add a method to check that a solution is a valid cycle basis of the input graph.
I do like the idea to have a method checking the solution. You can mark the printed solution random
but the important bit is the check you do next. I think it is important for all those tests where there is no real canonical solution (or you cannot easily get to it). If something can be done deterministically that's another story.
Just sort the entries :)
Having example output is nice but it's hard to do unless there's a canonical form for it. In this case we're just using lists and e.g. [1,2,3] != [3,1,2]
so we'd have to encapsulate the return value in a class that prints them the same way. It's not impossible but it's a lot of work to fix this small issue (and might break other things).
A variation on that theme is to write something like,
sage: expected = Foo([1,2,3])
sage: actual = some_function() # might return Foo([3,1,2])
sage: actual == expected
True
which still sort-of shows you the result, but does the comparison using Foo
's class equality (which can be overridden).
Finally, there's the very ugly,
sage: expected = [soln1, soln2] # the answer changed in networkx-3.2
sage: actual = some_function()
sage: actual in expected
True
whose saving grace is that it is not permanent: when no one is using networkx-3.1 anymore, it can be re-simplified.
I'm not sure how to efficiently check that a collection of cycles forms a valid cycle basis of a graph. Any suggestion is more than welcome ;)
Since it seems hard, I settled on marking the output random. If someone comes up with a good way to test this can be done later, and/or it can be reverted to check output once we can require networkx 3.2.
Distros will be updating networkx and it doesn't seem useful to delay this.
For the purpose of checking, why is it hard to build a proper $\mathbb{F}_2$-vectorspace of cycles? (Topologists do this every day :-)) It's even outlined in docstrings how to do it. But perhaps some code in sage/topology can be used, @jhpalmieri ?
I don't see anything really simple. First, graphs ought to have a _simplicial_
method, which would give an easy way to compute the corresponding object as a simplicial complex. Since they don't, you can do:
sage: g = graphs.PetersenGraph()
sage: K = SimplicialComplex(g.edges(labels=False))
sage: C1 = K.n_chains(1, GF(2))
sage: cycles = g.cycle_basis(output='edge')
sage: chains = [sum(C1(Simplex((s,t))) for (s,t,u) in cycle) for cycle in cycles]
sage: chains
[(0, 1) + (0, 5) + (1, 6) + (5, 8) + (6, 8),
(0, 4) + (0, 5) + (4, 9) + (5, 8) + (6, 8) + (6, 9),
(5, 7) + (5, 8) + (6, 8) + (6, 9) + (7, 9),
(0, 4) + (0, 5) + (3, 4) + (3, 8) + (5, 8),
(0, 1) + (0, 5) + (1, 2) + (2, 3) + (3, 8) + (5, 8),
(2, 3) + (2, 7) + (3, 8) + (5, 7) + (5, 8)]
sage: [c.is_cycle() for c in chains]
[True, True, True, True, True, True]
At least this tells you that they are all cycles. To check that they're independent: the cycles are the same as the homology (since there are no boundaries), so you have to check linear independence. Here is one way:
sage: vecs = [c.to_vector() for c in chains]
sage: W = V[0].parent()
sage: W.linear_dependence(vecs)
[
]
sage: len(W.linear_dependence(vecs))
0
Or you could do matrix(vecs).rank() == len(vecs)
.
In this case, cycles in the basis come from a spanning tree (9 edges: (0, 1) + (0,4) + (0, 5) + (5, 7) + (5, 8) + (6, 8) + (6, 9) + (3, 8) + (2, 3) ), so there must be 6 more (Petersen has 15=9+6 edges) edges occurring in a basis, and each of these 6 cycles has one unique edge, so you get linear independence without any linear algebra; one already 6 columns in the 6x15 matrix with exactly one non-0.
In general, one starts with a spanning tree, and for each edge (a,b) not in the tree one completes it to a cycle by adding the unique a-b path in the tree to it.
So, what would be the preferences:
the advantage of 2 is that it does not need any linear algebra.
For the record, when the input graph is simple, we rely on the networks implementation. When it allows multiple edges, we use our own implementation that finds a spanning tree and then produces a short cycle for each edge not in the tree. Here the difficulty to check the validity of the solution is to distinguish parallel edges.
I was a bit too optimistic about 2) - one needs to separately deal with biconnected components of the graph (one does not use the global spanning forest, one takes a spanning tree for each biconnected component).
I don't see however how multiple edges affect the story at all. Anyway, as the main goal of such a function would be to check the validity of networkx cycle bases, so no multiple edges come into the picture.
Steps To Reproduce
Expected Behavior
test passes
Actual Behavior
test fails
Additional Information
The cycle_basis() function was changed in networkx 3.2: https://github.com/networkx/networkx/pull/6654
Environment
Checklist