Doctest failures with networkx 3.2

tornaria commented 10 months ago

Steps To Reproduce

sage -t --random-seed=324850534885798131910668214652534087473 src/sage/graphs/generic_graph.py
**********************************************************************
File "src/sage/graphs/generic_graph.py", line 5105, in sage.graphs.generic_graph.GenericGraph.cycle_basis
Failed example:
    g.cycle_basis()                                                       # needs networkx
Expected:
    [[1, 6, 8, 5, 0], [4, 9, 6, 8, 5, 0], [7, 9, 6, 8, 5],
     [4, 3, 8, 5, 0], [1, 2, 3, 8, 5, 0], [7, 2, 3, 8, 5]]
Got:
    [[6, 8, 5, 7, 9],
     [2, 3, 8, 5, 7],
     [4, 3, 8, 5, 7, 9],
     [4, 0, 5, 7, 9],
     [2, 1, 0, 5, 7],
     [6, 1, 0, 5, 7, 9]]
**********************************************************************
File "src/sage/graphs/generic_graph.py", line 5111, in sage.graphs.generic_graph.GenericGraph.cycle_basis
Failed example:
    g.cycle_basis(output='edge')                                          # needs networkx
Expected:
    [[(1, 6, None), (6, 8, None), (8, 5, None), (5, 0, None),
     (0, 1, None)], [(4, 9, None), (9, 6, None), (6, 8, None),
     (8, 5, None), (5, 0, None), (0, 4, None)], [(7, 9, None),
     (9, 6, None), (6, 8, None), (8, 5, None), (5, 7, None)],
     [(4, 3, None), (3, 8, None), (8, 5, None), (5, 0, None),
     (0, 4, None)], [(1, 2, None), (2, 3, None), (3, 8, None),
     (8, 5, None), (5, 0, None), (0, 1, None)], [(7, 2, None),
     (2, 3, None), (3, 8, None), (8, 5, None), (5, 7, None)]]
Got:
    [[(6, 8, None), (8, 5, None), (5, 7, None), (7, 9, None), (9, 6, None)],
     [(2, 3, None), (3, 8, None), (8, 5, None), (5, 7, None), (7, 2, None)],
     [(4, 3, None),
      (3, 8, None),
      (8, 5, None),
      (5, 7, None),
      (7, 9, None),
      (9, 4, None)],
     [(4, 0, None), (0, 5, None), (5, 7, None), (7, 9, None), (9, 4, None)],
     [(2, 1, None), (1, 0, None), (0, 5, None), (5, 7, None), (7, 2, None)],
     [(6, 1, None),
      (1, 0, None),
      (0, 5, None),
      (5, 7, None),
      (7, 9, None),
      (9, 6, None)]]
**********************************************************************
File "src/sage/graphs/generic_graph.py", line 5276, in sage.graphs.generic_graph.GenericGraph.minimum_cycle_basis
Failed example:
    sorted(g.minimum_cycle_basis(by_weight=True, algorithm='NetworkX'))   # needs networkx
Expected:
    [[1, 2, 3], [1, 2, 3, 4], [5, 6, 7]]
Got:
    [[2, 3, 1], [2, 3, 4, 1], [6, 7, 5]]
**********************************************************************
File "src/sage/graphs/generic_graph.py", line 5278, in sage.graphs.generic_graph.GenericGraph.minimum_cycle_basis
Failed example:
    g.minimum_cycle_basis(by_weight=False, algorithm='NetworkX')          # needs networkx
Expected:
    [[1, 2, 3], [1, 3, 4], [5, 6, 7]]
Got:
    [[3, 4, 1], [2, 3, 1], [6, 7, 5]]
**********************************************************************
File "src/sage/graphs/generic_graph.py", line 5286, in sage.graphs.generic_graph.GenericGraph.minimum_cycle_basis
Failed example:
    sorted(g.minimum_cycle_basis(by_weight=False, algorithm='NetworkX'))  # needs networkx
Expected:
    [[1, 2, 3, 5], [3, 4, 5]]
Got:
    [[3, 4, 5], [5, 3, 2, 1]]
**********************************************************************
2 items had failures:
   2 of  30 in sage.graphs.generic_graph.GenericGraph.cycle_basis
   3 of  11 in sage.graphs.generic_graph.GenericGraph.minimum_cycle_basis
    [3829 tests, 5 failures, 16.94 s]
----------------------------------------------------------------------
sage -t --random-seed=324850534885798131910668214652534087473 src/sage/graphs/generic_graph.py  # 5 doctests failed
----------------------------------------------------------------------

Expected Behavior

test passes

Actual Behavior

test fails

Additional Information

The cycle_basis() function was changed in networkx 3.2: https://github.com/networkx/networkx/pull/6654

Environment

- **OS**: networkx 3.2
- **Sage Version**: 10.2.beta7 + python 3.12 patches

Checklist

[X] I have searched the existing issues for a bug report that matches the one I want to file, without success.
[X] I have read the documentation and troubleshoot guide

tornaria commented 10 months ago

@antonio-rojas @kiwifb @orlitzky @dcoudert any ideas how to handle this?

I.e. how to support networkx 3.2 and 3.1 simultaneously.

dcoudert commented 10 months ago

A solution could be to avoid printing the cycles, but it is interesting for the users to see what a solution looks like. At least, we could add a method to check that a solution is a valid cycle basis of the input graph.

kiwifb commented 10 months ago

I do like the idea to have a method checking the solution. You can mark the printed solution random but the important bit is the check you do next. I think it is important for all those tests where there is no real canonical solution (or you cannot easily get to it). If something can be done deterministically that's another story.

orlitzky commented 10 months ago

Just sort the entries :)

Having example output is nice but it's hard to do unless there's a canonical form for it. In this case we're just using lists and e.g. [1,2,3] != [3,1,2] so we'd have to encapsulate the return value in a class that prints them the same way. It's not impossible but it's a lot of work to fix this small issue (and might break other things).

A variation on that theme is to write something like,

sage: expected = Foo([1,2,3])
sage: actual = some_function()  # might return Foo([3,1,2])
sage: actual == expected
True

which still sort-of shows you the result, but does the comparison using Foo's class equality (which can be overridden).

Finally, there's the very ugly,

sage: expected = [soln1, soln2]  # the answer changed in networkx-3.2
sage: actual = some_function()
sage: actual in expected
True

whose saving grace is that it is not permanent: when no one is using networkx-3.1 anymore, it can be re-simplified.

dcoudert commented 10 months ago

I'm not sure how to efficiently check that a collection of cycles forms a valid cycle basis of a graph. Any suggestion is more than welcome ;)

tornaria commented 10 months ago

Since it seems hard, I settled on marking the output random. If someone comes up with a good way to test this can be done later, and/or it can be reverted to check output once we can require networkx 3.2.

Distros will be updating networkx and it doesn't seem useful to delay this.

dimpase commented 10 months ago

For the purpose of checking, why is it hard to build a proper $\mathbb{F}_2$-vectorspace of cycles? (Topologists do this every day :-)) It's even outlined in docstrings how to do it. But perhaps some code in sage/topology can be used, @jhpalmieri ?

jhpalmieri commented 10 months ago

I don't see anything really simple. First, graphs ought to have a _simplicial_ method, which would give an easy way to compute the corresponding object as a simplicial complex. Since they don't, you can do:

sage: g = graphs.PetersenGraph()
sage: K = SimplicialComplex(g.edges(labels=False))
sage: C1 = K.n_chains(1, GF(2))
sage: cycles = g.cycle_basis(output='edge')
sage: chains = [sum(C1(Simplex((s,t))) for (s,t,u) in cycle) for cycle in cycles]
sage: chains
[(0, 1) + (0, 5) + (1, 6) + (5, 8) + (6, 8),
 (0, 4) + (0, 5) + (4, 9) + (5, 8) + (6, 8) + (6, 9),
 (5, 7) + (5, 8) + (6, 8) + (6, 9) + (7, 9),
 (0, 4) + (0, 5) + (3, 4) + (3, 8) + (5, 8),
 (0, 1) + (0, 5) + (1, 2) + (2, 3) + (3, 8) + (5, 8),
 (2, 3) + (2, 7) + (3, 8) + (5, 7) + (5, 8)]
sage: [c.is_cycle() for c in chains]
[True, True, True, True, True, True]

At least this tells you that they are all cycles. To check that they're independent: the cycles are the same as the homology (since there are no boundaries), so you have to check linear independence. Here is one way:

sage: vecs = [c.to_vector() for c in chains]
sage: W = V[0].parent()
sage: W.linear_dependence(vecs)
[

]
sage: len(W.linear_dependence(vecs))
0

Or you could do matrix(vecs).rank() == len(vecs).

dimpase commented 10 months ago

In this case, cycles in the basis come from a spanning tree (9 edges: (0, 1) + (0,4) + (0, 5) + (5, 7) + (5, 8) + (6, 8) + (6, 9) + (3, 8) + (2, 3) ), so there must be 6 more (Petersen has 15=9+6 edges) edges occurring in a basis, and each of these 6 cycles has one unique edge, so you get linear independence without any linear algebra; one already 6 columns in the 6x15 matrix with exactly one non-0.

In general, one starts with a spanning tree, and for each edge (a,b) not in the tree one completes it to a cycle by adding the unique a-b path in the tree to it.

dimpase commented 10 months ago

So, what would be the preferences:

have a general function which checks that a bunch of cycles is a basis
have a special for this case function which finds the spanning tree (a forest, more precisely, for disconnected graphs) used to build the basis, and check that all the edges outside the tree are in exactly one cycle.

the advantage of 2 is that it does not need any linear algebra.

dcoudert commented 10 months ago

For the record, when the input graph is simple, we rely on the networks implementation. When it allows multiple edges, we use our own implementation that finds a spanning tree and then produces a short cycle for each edge not in the tree. Here the difficulty to check the validity of the solution is to distinguish parallel edges.

dimpase commented 10 months ago

I was a bit too optimistic about 2) - one needs to separately deal with biconnected components of the graph (one does not use the global spanning forest, one takes a spanning tree for each biconnected component).

I don't see however how multiple edges affect the story at all. Anyway, as the main goal of such a function would be to check the validity of networkx cycle bases, so no multiple edges come into the picture.

sagemath / sage