sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.35k stars 460 forks source link

Bugs in how ClusterQuiver handles coefficients #22381

Open Etn40ff opened 7 years ago

Etn40ff commented 7 years ago

The code performing mutations on ClusterQuiver does not handle properly coefficients. In particular this introduces arrows in between frozen vertices yielding issues when computing mutation classes. Here is the smallest example I could come up with.

sage: B = matrix(3,[0,1,-1]); B
[ 0]
[ 1]
[-1]
sage: Q = ClusterQuiver(B)
sage: list(Q.mutation_class())
Traceback (most recent call last):
...
ValueError: The input digraph contains edges within the frozen vertices

The problem is that, rather than performing matrix mutations the code calls an helper function performing digraph mutations. I do not see the rationale behind this. Not only it is slower than the simple matrix operations needed to perform mutations on matrices but it is also giving issues.

Here is another similar issue:

sage: B = matrix(3,[0,1,-1]); B
[ 0]
[ 1]
[-1]
sage: Q = ClusterQuiver(B)
sage: Q.mutate(0)
sage: Q.b_matrix()  # this gives the expected output
[ 0]
[-1]
[ 1]
sage: Q.show()      # this output is wrong!!!

Note that consistency tests (which in theory should not be needed and which contribute to the overall slowdown) are performed in __init__ only when creating a ClusterQuiver from a digraph.

Finally coefficients also force wrong answers from the algorithm computing mutations classes:

sage: B = Matrix([[0,1,0],[-1,0,1],[0,-1,0],[0,1,0],[0,1,-1],[0,-1,0]])
sage: Q = ClusterQuiver(B)
sage: Q.mutation_class()
Traceback (most recent call last):
...
ValueError: The mutation class can - for infinite mutation types - only be computed up to a given depth

Depends on #24924

CC: @sagetrac-gmoose05 @stumpc5 @sagetrac-drupel @sagetrac-vpilaud @slel

Component: combinatorics

Keywords: ClusterSeed, mutations, cluster

Stopgaps: #22464

Author: Frédéric Chapoton

Branch/Commit: public/22381 @ c719703

Issue created by migration from https://trac.sagemath.org/ticket/22381

Etn40ff commented 7 years ago
comment:1

Actually a better example for the third issue:

sage: B = Matrix([[0,1,0],[-1,0,1],[0,-1,0],[2,0,0]])
sage: Q = ClusterQuiver(B)
sage: Q.is_mutation_finite()
False
Etn40ff commented 7 years ago

Branch: public/ticket/22381

Etn40ff commented 7 years ago

Commit: 67e6b5b

Etn40ff commented 7 years ago

New commits:

67e6b5bAdd stopgap for trac:22381
Etn40ff commented 7 years ago

Changed commit from 67e6b5b to none

Etn40ff commented 7 years ago

Changed branch from public/ticket/22381 to none

Etn40ff commented 7 years ago

Stopgaps: 22464

slel commented 7 years ago

Changed stopgaps from 22464 to #22464

fchapoton commented 6 years ago

Branch: u/chapoton/22381

fchapoton commented 6 years ago

New commits:

bf0e950trac 22381 do not introduce arrows between frozen vertices
fchapoton commented 6 years ago

Commit: bf0e950

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from bf0e950 to 4cef3ac

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

4cef3acanother py3-related change
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

d8358bbother py3 details
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from 4cef3ac to d8358bb

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from d8358bb to 45607b1

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

45607b1another py3 change
fchapoton commented 6 years ago

Author: Frédéric Chapoton

Etn40ff commented 6 years ago
comment:12

Thank you for dealing with this Frédéric. I wanted to help with the review but I have a quite old version of sage right now and recompiling I bumped into #24575. I'll return to this once that is fixed. If anyone gets to it before I do please do not feel obliged to wait for me.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

ce75cfcmore py3-compatibility details
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from 45607b1 to ce75cfc

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from ce75cfc to 01cb2e9

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

01cb2e9yet another py3 enhancement
Etn40ff commented 6 years ago
comment:15

I finally managed to recompile sage and I gave a quick spin to this patch. Something seems to be still wrong as it still fails on type A_2 cluster algebras:

sage: B = Matrix([[0,1,0],[-1,0,1],[0,-1,0],[0,1,0],[0,1,-1],[0,-1,0]])
sage: Q = ClusterQuiver(B)
sage: Q.mutation_class()
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-64d1287a18f5> in <module>()
----> 1 Q.mutation_class()

/opt/sage/local/lib/python2.7/site-packages/sage/combinat/cluster_algebra_quiver/quiver.pyc in mutation_class(self, depth, show_depth, return_paths, data_type, up_to_equivalence, sink_source)
   1883         """
   1884         if depth is infinity and not self.is_mutation_finite():
-> 1885             raise ValueError('the mutation class can - for infinite mutation'
   1886                              ' types - only be computed up to a given depth')
   1887         return [Q for Q in self.mutation_class_iter(depth=depth, show_depth=show_depth,

ValueError: the mutation class can - for infinite mutation types - only be computed up to a given depth

While having a look I could not avoid asking myself what is the point of _digraph_mutate. It sounds like an extremely convoluted way of performing mutations. Dealing with matrices instead should be as simple as

def mutate(B, k):
    m = B.nrows()
    n = B.ncols()
    M = matrix(m,n, lambda i,j: -B[i,j] if i==k or j==k else B[i,j] + (abs(B[i,k])*B[k,j] + B[i,k]*abs(B[k,j]))/2 )
    return M

Is there any speed advantage in dealing with digraphs? Can some of the original contributors please comment on this point?

fchapoton commented 6 years ago
comment:17

It seems that _digraph_to_dig6 and _dig6_to_digraph in sage.combinat.cluster_algebra_quiver.mutation_class are only dealing seriously with the case of no-coefficients. Always creating a square matrix..

Etn40ff commented 6 years ago
comment:18

I just got an e-mail from Gregg. He basically says that the point of digraph is to be able to label the vertices. Other than that matrices should be sufficient and faster.

The only other things that comes to mind is that Christian might have done some optimization to identify mutation classes and this might be based on digraphs. It is probably the cheapest way to check for isomorphism of (valued) quivers. Should we ask him explicitly to comment on this?

My impression is that there are now two competing codebases for doing mutations, one with graphs and one with matrices, that somehow are simultaneously in ClusterQuiver there should be quite a lot of room for cleanup. This does not sound like a short term project though.

fchapoton commented 6 years ago
comment:19

I think most of the mutation_type and mutation_finite code is made for the coefficient-free case and will be hard work to update. Even in your example of type A3 with coeffs, after being convinced to start mutating, it seems to find an infinite number of quivers...

I would suggest:

(1) close this ticket with the improvements it brings, including progress towards py3

(2) open another one for the mutation type and finiteness issues.

fchapoton commented 6 years ago

Changed keywords from ClusterSeed, mutations to ClusterSeed, mutations, cluster

Etn40ff commented 6 years ago
comment:21

I agree this is not going to be a quick fix and there is no point in delaying py3. The only thing is that we should really make sure that the stopgap #22464 is also updated to the new ticket.

stumpc5 commented 6 years ago
comment:22

Hi,

this whole code clusterseed / quiver code seems to be a disaster, I wonder how much of it was actually broken from my first version of its code -- sorry at least for these originally broken parts!

Is there any speed advantage in dealing with digraphs? Can some of the original contributors please comment on this point?

I have certainly spent a lot of time optimizing mutations, and used matrix mutation written in cython whereever possible. I believe, I have even used this for quiver mutations, but I am not totally sure.

I will try to give a slightly more informative comment tomorrow, hopefully after I had a look at the original code...

stumpc5 commented 6 years ago
comment:23

Okay, it seems that the quiver code is indeed broken for frozen variables since a long time, maybe even from the beginning (we wrote the code initially without frozen variables), and we even used _digraph_mutate from the beginning. I don't see any reason for using _digraph_mutate but I don't recall why we used it in the first place either.

The b matrix mutation is cythonized from the beginning and was then as fast as possible, see the mutate method matrix/matrix0.pyx: def mutate(self, Py_ssize_t k ).

I think I agree that digraph_to_dig6 and dig6_to_digraph only deal with no coefficients. I actually don't think this is used at all in practice. Originally it is used to store mutation classes of quivers without coefficients for faster type recognition (and there, one never deals with coefficients).

The type recognition for seeds and quivers is done by taking the quiver without coefficients and determining its type by a nightmare case-analysis algorithm (I wrote it and knew from the beginning that it will be impossible to properly debug or understand, but it worked for all types...).

The mutation_class_iter method seems to be a horrible code duplication, done either for quivers (where _digraph_mutate is used in mutation_class line 253) or for seeds (where matrix mutation is used in cluster_seed line 3566).

I would suggest to fix the broken quiver mutation method, and then first run tests if that fixes the issues with quivers, or if there are any further issues. And only then start improving the code, if desired -- though I had in mind that Salvatore's new stuff is anyway much better from all perspectives...

fchapoton commented 6 years ago
comment:24

I would suggest moving my branch to another ticket "various enhancements to cluster quivers". Would somebody review that ticket then ?

stumpc5 commented 6 years ago
comment:25

"various enhancements to cluster quivers"

I will do a review of the code you provided. In principle, I find it much simpler to review changes that only have a single purpose, so having the formatting and py3 changes separated from the changes to the quiver mutation.

I would suggest, you move all formatting changes into another ticket which could go directly, and keep the actual behaviour modifications to _digraph_mutate so I can have a closer look at these and see how fixing the mutation issue there also fixes other issues we encountered.

fchapoton commented 6 years ago

Changed commit from 01cb2e9 to bf0e950

fchapoton commented 6 years ago

Changed branch from u/chapoton/22381 to public/22381

fchapoton commented 6 years ago
comment:26

Here is a branch only changing _digraph_mutate. It DOES NOT fix any issue concerning mutation type and mutation finiteness.

fchapoton commented 6 years ago
comment:27

There remains a few other minor corrections. It would be more complicated to separate them away, as they were in the same commit.

fchapoton commented 6 years ago
comment:28

extracted ticket at #24882

fchapoton commented 6 years ago
comment:29

EDIT: wrong ticket, sorry

stumpc5 commented 6 years ago
comment:31

I will post here whatever I encounter in the code, so that I keep track of what I understand, and for others to read and comment on.

The only other things that comes to mind is that Christian might have done some optimization to identify mutation classes and this might be based on digraphs.

The big issue is that we consider quivers often as unlabelled graphs, see the up_to_equivalence=True flag in mutation_class._mutation_class_iter.

There are thus multiple issues to keep track of:

  1. getting quivers into canonical form (by relabelling the vertices into a canonical ordering) and mutate at as least as possible vertices:

    • this is done in _dg_canonical_form, where the input quiver is put into a canonical form, while the relabelling of the vertices and the orbits of vertices under the isomorphism group is also collected.
    • this done, we only need to mutate at one vertex per orbit, and keep track that of the relabelling.
  2. checking if we already have encountered a given quiver.

    • this is done using its dig6 string because checking if a string is contained in a set of strings is infinitely much faster than checking if a digraph is contained in a set/list of digraphs.

None of this was ever properly done for using frozen vertices.

@Frederic: am i correct that you now properly do the mutation with frozen vertices in _digraph_mutate ?

I hope things will work out again once also the dig6 stuff is properly updated what I will do now.

fchapoton commented 6 years ago
comment:32

NOTE: I assumed in my changes that the frozen vertices are always the final one, labelled from n+1 to n+m. I am unsure if this is always the case.

stumpc5 commented 6 years ago
comment:33

Replying to @fchapoton:

NOTE: I assumed in my changes that the frozen vertices are always the final one, labelled from n+1 to n+m. I am unsure if this is always the case.

same here, I also just wondered if we can even assume the vertex labellings to be integers or numbers 0 through n+m-1. I really don't know how all this changed. The attributes Q._nlist and Q._mlist might suggest that other vertex labellings are possible.

stumpc5 commented 6 years ago
comment:34

The reason to use _digraph_mutate over matrix mutation is that the canonical form algorithm

    from sage.groups.perm_gps.partn_ref.refinement_graphs import search_tree, get_orbits

takes a digraph, so

  1. starting with a quiver Q
  2. taking its b matrix B
  3. mutating the b matrix to B'
  4. turning B' into a new quiver Q' (observe that Q and Q' largely coincide)
  5. computing the canonical form of Q'

is much slower (especially for large quivers where the coincidence in 4. is even bigger) than directly doing the digraph mutation.

Observe that the actual mutation is indeed much faster on matrices, so having an equally fast version of the canonical form directly on matrices (take a square matrix and bring it into a canonical form up to simultaneous row- and column permutations) would be also solving the problem.

stumpc5 commented 6 years ago
comment:35

Given

sage: D = ClusterQuiver(['A',2])._digraph
sage: D.relabel({0:"A",1:frozenset([1,2,"A"])})
sage: ClusterQuiver(D)._digraph.edges()
[('A', frozenset({1, 2, 'A'}), (1, -1))]

I must say that I am not willing to take this into account. I believe that _digraph should always be a digraph with vertices range(n+m) with mutable vertices range(n) and frozen vertices range(n+1,n+m) (*).

I believe this extra layer should be handled separately at a "high level" and not changing the backend algorithms of quivers. All mutation class and mutation type code is written with this assumption (*).

stumpc5 commented 6 years ago
comment:36

Okay, there are plenty of errors, I have fixed a few, but cannot push currently:

sage: Q = ClusterQuiver(['B',2]).principal_extension()
sage: Q.mutate(0)
sage: Q.b_matrix()
[ 0 -1]
[ 2  0]
[-1  1]
[ 0  1]
sage: Q._digraph.edges(labels=True)
[(0, 2, (1, -1)), (1, 0, (2, -1)), (2, 1, (1, -2)), (3, 1, (1, -1))]
The label for (2,1) should be (1,-1) if I see it correctly. (not fixed)
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

e0302f7cosmetical changes
2bc91b5fixed ClusterQuiver.canonical_label
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from bf0e950 to 2bc91b5

Etn40ff commented 6 years ago
comment:38

Replying to @stumpc5:

Observe that the actual mutation is indeed much faster on matrices, so having an equally fast version of the canonical form directly on matrices (take a square matrix and bring it into a canonical form up to simultaneous row- and column permutations) would be also solving the problem.

What about the following for a canonical form of a (coefficient-free) exchange matrix?

1) take the columns of B and sort each of them

2) sort the resulting lists lexicographically

3) 2 gives a permutation, use this permutation on rows and columns of B

To compare matrices in the canonical form you can then just use hashes.

stumpc5 commented 6 years ago
comment:39

I did a few tests if it is possible to do the quiver mutation using matrices, and I got

sage: %timeit M.mutate(1)
100000 loops, best of 3: 2.77 µs per loop
sage: %timeit _digraph_mutate(D,1,5,5)
10000 loops, best of 3: 37.8 µs per loop
sage: %timeit digraph_mutate_new(D,1,5,5)
The slowest run took 611.92 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 59.2 µs per loop

The later turns the digraph into a matrix and back in cython, and I don't see how to improve this my much since the time is almost completely spend in reading the edges of the input graph, creating the matrix, and writing the output graph.

stumpc5 commented 6 years ago
comment:40

@Salvatore: canonical form is up to simultaneous row and column permutation, not up to row and column permutation! This canonical form in particular solves graph isomorphism (two graphs are isomorphic if and only if their canonical form adjacency matrices coincide, observe that a simultaneous row and column permutation is a relabelling of the vertices of the graph) and the later is implemented and highly optimized in sage.

There is no chance to provide anything speedwise comparable directly on matrices without reimplementing the above algorithm, I did quite some tests when implementing it and the outcome was that graph canonical form is by far fastest available.