Etn40ff opened this issue 7 years ago
To compare matrices in the canonical form you can then just use hashes.
well, on the level of the graphs, the dig6 string is basically the (invertible) hash, though you might be right that an ordinary hash might be faster and also work. I will recheck...
Replying to @stumpc5:
@Salvatore: canonical form is up to simultaneous row and column permutation, not up to independent row and column permutations! This canonical form in particular solves graph isomorphism: two graphs are isomorphic if and only if their canonical-form adjacency matrices coincide (observe that a simultaneous row and column permutation is a relabelling of the vertices of the graph), and the latter is implemented and highly optimized in Sage. There is no chance to provide anything comparable speedwise directly on matrices without reimplementing the above algorithm. I did quite some tests when implementing it, and the outcome was that graph canonical form was by far the fastest approach available.
I think you misunderstood the algorithm I was suggesting: 3 would be applied to rows and columns of B simultaneously, not to the sorted vectors in 2.
In any case I think I was being stupid for other reasons: the algorithm proposed basically sorts vertices according to their in- and out-degrees (or actually some finer version of this, where we keep track of how many arrows from distinct vertices are adjacent to each node). This is not enough information to distinguish any two quivers. The first example that comes to mind is a "figure-8 quiver" with one vertex of degree 4 (2 in and 2 out) and the remaining vertices of degree 2 (1 in and 1 out). The algorithm I had in mind can't distinguish a 3-cycle attached to a 5-cycle from two 4-cycles glued together.
Sorry about the stupid remark.
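For concreteness, the two figure-8 quivers can be checked to share the same in/out-degree data while being non-isomorphic; a pure-Python sketch (vertex labels are my own choice, both quivers glued at vertex 0):

```python
from collections import Counter
from itertools import permutations

def cycle(vs):
    """Directed cycle through the given vertices."""
    return [(vs[i], vs[(i + 1) % len(vs)]) for i in range(len(vs))]

# Two "figure-8" quivers on 7 vertices, both glued at vertex 0:
A = cycle([0, 1, 2]) + cycle([0, 3, 4, 5, 6])   # 3-cycle attached to a 5-cycle
B = cycle([0, 1, 2, 3]) + cycle([0, 4, 5, 6])   # two 4-cycles glued together

def degree_data(edges):
    """Multiset of (in-degree, out-degree) pairs over the 7 vertices."""
    indeg, outdeg = Counter(), Counter()
    for u, v in edges:
        outdeg[u] += 1
        indeg[v] += 1
    return Counter((indeg[v], outdeg[v]) for v in range(7))

# Same degree data: one (2, 2) vertex, six (1, 1) vertices...
print(degree_data(A) == degree_data(B))  # → True

# ...but no relabelling of the vertices identifies the two quivers:
iso = any({(p[u], p[v]) for u, v in A} == set(B) for p in permutations(range(7)))
print(iso)  # → False
```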
Replying to @Etn40ff:
I think you misunderstood the algorithm I was suggesting: 3 would be applied to rows and columns of B simultaneously, not to the sorted vectors in 2.
Oh yes, I did, sorry. Anyway, highly optimized graph canonical form algorithms are hard to beat with an algorithm that in particular solves graph canonical form...
Stern but just :D
Since this is my day for random stupid ideas here is another. Can't we use CAIV to speed up the process? It is known that the exchange graph associated to any quiver with any choice of coefficients is a quotient of the exchange graph associated to the same quiver but with principal coefficients.
Can we just pretend we always have principal coefficients (by artificially adding them) and then perform mutations on the matrices only? Deciding whether two seeds coincide is just a matter of comparing the associated lists of c-vectors (and this is fast). If we care about the up-to-equivalence case we can always check for graph isomorphisms in a second pass.
I am basically proposing to replace code duplication and digraphs with possibly more mutations but performed much faster.
This is what Dylan and I do in ClusterAlgebra, but there we never care about the up-to-equivalence issue.
I am not sure I really follow how you want to do the check.
Please let me note that we want to list all quivers up to equivalence. The best approach I know of is the following, much more general, algorithm using a canonical form (where two objects are considered equivalent if they have the same canonical form):
Observe that you must in particular compute the canonical form for each element you encounter.
I don't see how you could bypass this canonical form computation. In particular, I don't see how your approach could deal with this other than also computing it.
You are more than welcome to try to beat the current algorithm!
I would personally expect that the optimal reachable solution is to generalize the graph canonical form to a matrix canonical form. Citation from https://en.wikipedia.org/wiki/Graph_canonization: A commonly known canonical form is the lexicographically smallest graph within the isomorphism class, which is the graph of the class with lexicographically smallest adjacency matrix considered as a linear string. However, the computation of the lexicographically smallest graph is NP-hard.
We currently use sage.groups.perm_gps.partn_ref.refinement_graphs.search_tree, which implements McKay's "Practical Graph Isomorphism"; see in particular https://arxiv.org/abs/1301.1493.
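For illustration, here is a brute-force version of the lexicographically smallest canonical form mentioned in the Wikipedia quote above (a toy sketch only: feasible for tiny n, NP-hard in general; the actual search_tree code is far more sophisticated):

```python
from itertools import permutations

def canonical_form(B):
    """Lexicographically smallest matrix obtainable from B by a *simultaneous*
    row/column permutation, read as a tuple of row tuples."""
    n = len(B)
    return min(tuple(tuple(B[p[i]][p[j]] for j in range(n)) for i in range(n))
               for p in permutations(range(n)))

# Two labellings of the same rank-2 quiver get the same canonical form:
B1 = [[0, 1], [-1, 0]]
B2 = [[0, -1], [1, 0]]   # vertices 0 and 1 swapped
print(canonical_form(B1) == canonical_form(B2))  # → True
```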
Ok, let me try to explain a little better if I can. Suppose that we have a square exchange matrix B and consider its principal framing Bp (i.e. B with the identity matrix attached at the bottom). Suppose for a moment that we want to list all the matrices in the mutation class of Bp up to simultaneous permutations of the columns and of the first n rows.
Call the columns of the bottom part of a matrix obtained by mutations from Bp its c-vectors.
Thm: B1 and B2 in the mutation class of Bp are equivalent if and only if the sets of their c-vectors coincide.
So here is the algorithm to list all the matrices mutation-equivalent to Bp: start from Bp and perform mutations; at each step compare the set of c-vectors with those already known. This completely solves the problem in the case of principal coefficients and is much faster than any other implementation using digraphs and canonical forms for comparisons.
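Assuming the theorem above, the algorithm can be sketched in plain Python (function names are mine; the framed matrix is stored as a 2n x n tuple of row tuples, and the frozenset of c-vectors is the dedup key):

```python
def mutate(M, k, n):
    """Fomin-Zelevinsky mutation in direction k < n of an m x n exchange
    matrix given as a tuple of row tuples (here m = 2n for the framed matrix)."""
    return tuple(tuple(
        -M[i][j] if k in (i, j)
        else M[i][j] + (abs(M[i][k]) * M[k][j] + M[i][k] * abs(M[k][j])) // 2
        for j in range(n)) for i in range(len(M)))

def principal_mutation_class(B):
    """Enumerate the mutation class of B with principal coefficients,
    deduplicating seeds by their set of c-vectors (columns of the bottom part)."""
    n = len(B)
    Bp = tuple(tuple(row) for row in B) + tuple(
        tuple(int(i == j) for j in range(n)) for i in range(n))
    def cvecs(M):
        return frozenset(tuple(M[n + i][j] for i in range(n)) for j in range(n))
    seen, todo = {cvecs(Bp)}, [Bp]
    while todo:
        M = todo.pop()
        for k in range(n):
            M2 = mutate(M, k, n)
            c = cvecs(M2)
            if c not in seen:   # a new seed: record it and mutate further
                seen.add(c)
                todo.append(M2)
    return seen

# Type A2: the exchange graph is the pentagon, so 5 seeds are found.
print(len(principal_mutation_class([[0, 1], [-1, 0]])))  # → 5
```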
Suppose now that Bf is any other rectangular matrix with the same B as the top part. Suppose that B1f is obtained from Bf by a certain sequence of mutations and that B1 is obtained from Bp by the same sequence of mutations. Then, obviously, the top parts of B1f and B1 coincide. Moreover, there is a transformation (depending only on Bf, so it applies to the entire mutation class) that expresses the bottom part of B1f in terms of the c-vectors of B1. In particular, if B1 and B2 are equivalent then so are B1f and B2f, for any choice of framing.
It remains to test the situation in which B1 and B2 are distinct but have equivalent top parts, to see if B1f and B2f are equivalent. For cluster algebras this is an irrelevant issue and we simply do not handle it in our implementation. Indeed, the only framing relevant to the combinatorial structure of a cluster algebra is the principal one.
If one is only interested in quiver mutations it could be relevant. In any case it should be faster to compare, using digraphs, only the matrices the previous algorithm produces rather than all of them.
In short: to compute mutation classes we do not need to solve the graph isomorphism problem in full generality.
This completely solves the problem in the case of principal coefficients and is much faster than any other implementation using digraphs and canonical forms for comparisons.
This might be correct, but is not totally obvious. The canonical form tells which vertices are equivalent, so we do not need to mutate all, but only one per equivalence class of vertices (orbits of the isomorphism group).
Moreover, there is a transformation (depending only on Bf, so it applies to the entire mutation class) that expresses the bottom part of B1f in terms of the c-vectors of B1. In particular, if B1 and B2 are equivalent then so are B1f and B2f, for any choice of framing.
I see this, but don't we rather also need the other direction, since we want to solve isomorphism for Bf? Imagine Bf is only the top part, so there are no c-vectors — how do we then check isomorphism for Bf? Could you apply your algorithm (for me to understand) to find the unique quiver of type A2, and show an example of how to find the four quivers of type A3? (In the first case you first find 5 quivers starting with principal coefficients, and in the second you find 14 — and I fail to see how these are reduced to 1 and 4, respectively.)
If one is only interested in quiver mutations it could be relevant.
Isn't this complete thread and the algorithm about quiver mutations and not about cluster seed mutations? If I see your algorithm correctly, it would find the five non-equivalent quivers with principal coefficients, but how does this help to see that their top parts are actually all equivalent?
You can see in ClusterSeed.mutation_class_iter that we do not do any graph isomorphism there and compare the cluster variables (instead of your c-vectors). We also perform matrix mutation in this case, and no digraph mutation. What I thought we were discussing in this thread is quiver mutation and quiver mutation class computations.
Replying to @stumpc5:
This completely solves the problem in the case of principal coefficients and is much faster than any other implementation using digraphs and canonical forms for comparisons.
This might be correct, but is not totally obvious. The canonical form tells which vertices are equivalent, so we do not need to mutate all, but only one per equivalence class of vertices (orbits of the isomorphism group).
Some tests of the current versions of ClusterQuiver in the coefficient-free case (because I could not make the general case run) and of ClusterAlgebraSeed with principal coefficients:
sage: A = ClusterAlgebra(['D',10], principal_coefficients=True)
sage: %time _ = list(A.seeds(mutating_F=False))
CPU times: user 1min 25s, sys: 525 ms, total: 1min 25s
Wall time: 1min 25s
sage: Q = ClusterQuiver(['D',10])
sage: %time _ = list(Q.mutation_class_iter())
CPU times: user 1min 18s, sys: 64.5 ms, total: 1min 18s
Wall time: 1min 18s
show that the two are comparable in finite types.
Actually, implementing what I suggest is worse, because then one has to further reduce the output according to coefficients. I do not know how well ClusterQuiver scales as the number of coefficients grows (assuming we can make it work).
Beyond finite types things are different because there are not as many symmetries. For example
sage: B = matrix(10, lambda i,j: 1 if random() <0.5 else 0)
sage: B -= B.transpose()
sage: Q = ClusterQuiver(B)
sage: mut_class = Q.mutation_class_iter()
sage: count = 0
sage: %time while count < 100000: _ = mut_class.next(); count +=1
CPU times: user 3min 32s, sys: 13.7 ms, total: 3min 32s
Wall time: 3min 33s
sage: A = ClusterAlgebra(B)
sage: mut_class = A.seeds(mutating_F=False)
sage: count = 0
sage: %time while count < 100000: _ = mut_class.next(); count +=1
CPU times: user 51.4 s, sys: 491 ms, total: 51.9 s
Wall time: 52.1 s
Moreover, there is a transformation (depending only on Bf, so it applies to the entire mutation class) that expresses the bottom part of B1f in terms of the c-vectors of B1. In particular, if B1 and B2 are equivalent then so are B1f and B2f, for any choice of framing.
I see this, but don't we rather also need the other direction, since we want to solve isomorphism for Bf? Imagine Bf is only the top part, so there are no c-vectors — how do we then check isomorphism for Bf? Could you apply your algorithm (for me to understand) to find the unique quiver of type A2, and show an example of how to find the four quivers of type A3? (In the first case you first find 5 quivers starting with principal coefficients, and in the second you find 14 — and I fail to see how these are reduced to 1 and 4, respectively.)
Just run the same algorithm you run now: digraph isomorphism. In the example you point out, my suggestion is way more expensive than the one you implemented: I have to find 5 quivers and then test all of them to see that they coincide. In a not-so-symmetric case, though, my approach could save quite some time.
If one is only interested in quiver mutations it could be relevant.
Isn't this complete thread and the algorithm about quiver mutations and not about cluster seed mutations? If I see your algorithm correctly, it would find the five non-equivalent quivers with principal coefficients, but how does this help to see that their top parts are actually all equivalent?
I am just saying we could consider the possibility to derive the algorithm for quivers out of the algorithm for cluster algebras with (possibly?) a speedup. Of course I am not even sure this is worth doing.
sage: A = ClusterAlgebra(['D',10], principal_coefficients=True)
sage: %time _ = list(A.seeds(mutating_F=False))
Does this use matrix mutation, or another mutation algorithm you implemented?
Replying to @stumpc5:
sage: A = ClusterAlgebra(['D',10], principal_coefficients=True)
sage: %time _ = list(A.seeds(mutating_F=False))
Does this use matrix mutation, or another mutation algorithm you implemented?
A bit of both: we mutate the top part with sage.matrix.matrix0.Matrix.mutate and the bottom part by hand. This is because we wanted to keep the c-vectors separate. We also perform several other computations that are needed for cluster algebras and that could easily be dropped if one only cares about quivers.
If one wanted to be smarter, this could be done entirely with your cythonized code.
If one wanted to be smarter, this could be done entirely with your cythonized code.
So let me rephrase to see if I understand correctly and whether we are on the same page:
The best version of your suggestion is that we could compute the complete mutation class of a symmetrizable matrix using the cythonized mutate method in matrix0.pyx, and then compute all canonical forms afterwards.
I will try to produce something similar to this and see how it compares...
Yes, this is more or less what I was thinking of.
If the canonical form is also able to distinguish quivers with coefficients, I would suggest this algorithm:
1. Let B be the input matrix, possibly with coefficients.
2. Append an identity matrix to B.
3. Perform mutations with the cythonized mutate on the resulting matrix, and use the last n entries in each column to distinguish matrices.
4. Return the canonical forms of the resulting matrices.
This last step is not necessary if, at some stage, we find an identity matrix among the (possibly permuted) coefficient rows of a matrix. I am not sure it is worth testing for this, though.
Attachment: 2018_03-CythonTestMatrixMutation.py.gz
I have provided a toy example of how to do the mutations on matrices instead at https://github.com/sagemath/sage/files/ticket22381/2018_03-CythonTestMatrixMutation.py..gz
It is what I think is an improved version of what Salvatore and I discussed yesterday:
I do not add any artificial principal coefficients but immediately do all possible matrix mutations and check for isomorphisms along the way. Reason: we have to canonicalize all matrices in the end anyway to check for duplicates. It is thus faster to canonicalize each matrix immediately after finding it and discard it if it already exists (so we save additional mutations).
This is now ~2 times faster than what I had before using digraph mutation, and not yet completely optimized.
On the downside, more than 50% of the time is already spent in the canonical form cython function, so there is not much improvement possible currently.
On the upside, this new code is much cleaner, since I never canonicalize the matrices themselves, but only save the hashes of their canonical forms.
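A pure-Python sketch of this scheme (function names are mine; a brute-force lexicographic canonicalization stands in for the cythonized search_tree / bliss routines, and only hashes of canonical forms are stored):

```python
from itertools import permutations

def mutate(B, k):
    """Matrix mutation of a square exchange matrix B in direction k."""
    n = len(B)
    return tuple(tuple(
        -B[i][j] if k in (i, j)
        else B[i][j] + (abs(B[i][k]) * B[k][j] + B[i][k] * abs(B[k][j])) // 2
        for j in range(n)) for i in range(n))

def canonical_hash(B):
    """Hash of the canonical form under simultaneous row/column permutations.
    Brute force over all permutations; only feasible for small n."""
    n = len(B)
    return hash(min(tuple(tuple(B[p[i]][p[j]] for j in range(n)) for i in range(n))
                    for p in permutations(range(n))))

def mutation_class_size(B):
    """Size of the mutation class of B up to simultaneous permutations,
    keeping only the hashes of canonical forms, never the forms themselves."""
    B = tuple(tuple(row) for row in B)
    seen, todo = {canonical_hash(B)}, [B]
    while todo:
        M = todo.pop()
        for k in range(len(B)):
            M2 = mutate(M, k)
            h = canonical_hash(M2)
            if h not in seen:   # a new isomorphism class: keep mutating from it
                seen.add(h)
                todo.append(M2)
    return len(seen)

# Linearly oriented A3: the mutation class has 4 quivers up to isomorphism.
print(mutation_class_size([[0, 1, 0], [-1, 0, 1], [0, -1, 0]]))  # → 4
print(mutation_class_size([[0, 1], [-1, 0]]))  # → 1 (the unique A2 quiver)
```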
Despite this code, I am not convinced it is worth changing all of mutation_class to use this new code, given that I haven't used it in years and it would certainly take me days to get this done, plus proper debugging afterwards.
How do you think we should proceed here?
Attachment: 2018_03-CythonTestMatrixMutation.2.py.gz
Okay, I changed to a different canonical labelling algorithm (namely bliss), which sped things up by another factor of ~2; see https://github.com/sagemath/sage-prod/files/10658986/2018_03-CythonTestMatrixMutation.2.py.gz . An additional advantage now is that only 10% of the time is spent on the canonicalization, so there are multiple places for improvements.
(The reason for this time improvement is that I don't care anymore about the automorphism group: since matrix mutation is fast, I can afford to do all mutations at all vertices rather than computing the automorphism group orbits.)
Attachment: 2018_03-CythonTestMatrixMutation.3.py.gz
The newest version https://github.com/sagemath/sage-prod/files/10658987/2018_03-CythonTestMatrixMutation.3.py.gz is now in total 6 times faster than the original implementation. Now 1/3 of the time is spent in the bliss method, and 1/2 of the time is spent preparing for bliss.
@Salvatore, I have a question concerning
Thm: B1 and B2 in the mutation class of Bp are equivalent if and only if the sets of their c-vectors coincide.
and in particular
This last step is not necessary if, at some stage, we find an identity matrix among the (possibly permuted) coefficient rows of a matrix. I am not sure it is worth testing for this, though.
This seems not correct, since I believe you need to consider c-vectors up to index permutations:
Consider bipartite A3 with principal coefficients and mutable vertices 0,1,2. Then vertices 0 and 2 form an orbit of its graph automorphism group and mutating at 0 and mutating at 2 yield the same quiver, while in the first case, you have c-vectors
[(1, 0, 0), (0, 1, 1), (0, 0, -1)]
and in the second case
[(-1, 0, 0), (1, 1, 0), (0, 0, 1)]
which do not coincide.
Am I correct that there is no situation where you can forget about the isomorphism test, or am I misunderstanding what you meant?
In type A3, we have 10 different unlabelled quivers, while your algorithm found 14 = Cat(4).
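The discrepancy above can be checked with a short pure-Python script (with one choice of orientation for bipartite A3; which mutation yields which of the two lists depends on the sign convention chosen for B):

```python
def mutate(M, k, n):
    """Mutation in direction k < n of a 2n x n framed exchange matrix."""
    return [[-M[i][j] if k in (i, j)
             else M[i][j] + (abs(M[i][k]) * M[k][j] + M[i][k] * abs(M[k][j])) // 2
             for j in range(n)] for i in range(len(M))]

n = 3
B = [[0, 1, 0], [-1, 0, -1], [0, 1, 0]]                        # bipartite A3
Bp = B + [[int(i == j) for j in range(n)] for i in range(n)]   # principal framing

def cvectors(M):
    """c-vectors: columns of the bottom n x n part, as a sorted list."""
    return sorted(tuple(M[n + i][j] for i in range(n)) for j in range(n))

M0, M2 = mutate(Bp, 0, n), mutate(Bp, 2, n)
print(cvectors(M0))                  # one of the two lists above
print(cvectors(M2))                  # the other one
print(cvectors(M0) != cvectors(M2))  # → True: the c-vector sets differ

# ...even though the mutated quivers agree after swapping vertices 0 and 2:
p = [2, 1, 0]
print(all(M0[p[i]][p[j]] == M2[i][j] for i in range(n) for j in range(n)))  # → True
```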
Replying to @stumpc5:
@Salvatore, I have a question concerning
Thm: B1 and B2 in the mutation class of Bp are equivalent if and only if the sets of their c-vectors coincide.
and in particular
This last step is not necessary if, at some stage, we find an identity matrix among the (possibly permuted) coefficient rows of a matrix. I am not sure it is worth testing for this, though.
This seems not correct, since I believe you need to consider c-vectors up to index permutations:
Consider bipartite A3 with principal coefficients and mutable vertices 0,1,2. Then vertices 0 and 2 form an orbit of its graph automorphism group and mutating at 0 and mutating at 2 yield the same quiver, while in the first case, you have c-vectors
[(1, 0, 0), (0, 1, 1), (0, 0, -1)]
and in the second case
[(-1, 0, 0), (1, 1, 0), (0, 0, 1)]
which do not coincide.
Am I correct that there is no situation where you can forget about the isomorphism test, or am I misunderstanding what you meant?
In type A3, we have 10 different unlabelled quivers, while your algorithm found 14 = Cat(4); how did you propose to get the 10 graph (or b-matrix) isomorphism types out of this?
Hi Christian, sorry for being AFK for some time: this is quite a busy period so I am not as responsive as I should.
Concerning your question, I guess you are right: from the perspective of (valued) quivers up to isomorphism you should also allow index permutations of the c-vectors. What I had in mind was quiver isomorphism relative to fixed principal coefficients. In short, my suggestion was again bullshit and you should just disregard it.
I am glad that my nonsense somehow led you to a faster/cleaner implementation anyway.
As for what you say concerning a clean implementation to be added to Sage, I am more or less in the same boat as you. I have not used this code in years; the only thing I currently use is its ability to transform various data into an exchange matrix, and afterwards I only use the code Dylan and I implemented. I might put some effort into reviewing a ticket should it appear, but I am not really motivated to write one myself.
Branch pushed to git repo; I updated commit sha1. New commits:
9df40f0 | working on a cleaner and faster mutation class |
I pushed what I currently have, but put this ticket on hold until #24917 is resolved.
Branch pushed to git repo; I updated commit sha1. New commits:
3a1560f | playing with a new bliss interface using matrix_to_edge_list |
Branch pushed to git repo; I updated commit sha1. New commits:
c07c97c | working on an generic implementation of canonical form for labelled graphs |
Dependencies: #24924
Branch pushed to git repo; I updated commit sha1. New commits:
ff80980 | first version of bliss canonical form from edge list |
7ea99bd | first version of bliss graph from labelled edges |
3f07264 | fixed typo, added labelled edge return and fixed bug in relabelling |
4661b86 | playing the labelled graphs for now |
b9883f9 | Merge branch 't/24924/making_the_bliss_canonical_form_available_for_edge_labelled_graphs' into t/22381/public/22381 |
Branch pushed to git repo; I updated commit sha1. New commits:
97bb60b | merged newest develop |
Branch pushed to git repo; I updated commit sha1. New commits:
c719703 | trivial merge |
The code performing mutations on ClusterQuiver does not properly handle coefficients. In particular, it introduces arrows between frozen vertices, yielding issues when computing mutation classes. Here is the smallest example I could come up with. The problem is that, rather than performing matrix mutations, the code calls a helper function performing digraph mutations. I do not see the rationale behind this. Not only is it slower than the simple matrix operations needed to perform mutations on matrices, it also causes problems.
Here is another similar issue:
Note that consistency tests (which in theory should not be needed and which contribute to the overall slowdown) are performed in __init__ only when creating a ClusterQuiver from a digraph. Finally, coefficients also force wrong answers from the algorithm computing mutation classes:
Depends on #24924
CC: @sagetrac-gmoose05 @stumpc5 @sagetrac-drupel @sagetrac-vpilaud @slel
Component: combinatorics
Keywords: ClusterSeed, mutations, cluster
Stopgaps: #22464
Author: Frédéric Chapoton
Branch/Commit: public/22381 @ c719703
Issue created by migration from https://trac.sagemath.org/ticket/22381