sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.37k stars 466 forks source link

Make user_label for ClusterQuiver an extra layer not touching any internal structure #24917

Open stumpc5 opened 6 years ago

stumpc5 commented 6 years ago

According to git blame, the user labels were introduced in 2015/16 by Ben Strasser and Gregg Musiker. The current implementation creates several issues with the internal data structure that are relevant for #22381.

I believe, the correct approach would be to store a user vertex labelling into an extra attribute (as done in _nlist and _mlist, possibly together with its inverse dictionary) and only call it from there. But do not store the internal digraph according to this labelling.

Main reason:

Alternative reason:

What are the advantages to actually relabel the digraph and walk the _nlist and _mlist around in all methods?

Beside, the following seems to be a bug

self._nlist = list(user_labels.keys())[0:n]
self._mlist = list(user_labels.keys())[n:n+m]

For example,

sage: ClusterQuiver(M,user_labels={"B":0,"A":2,"C":1})
sage: ClusterQuiver(M,user_labels={"B":0,"A":1,"C":2})

coincide and the values of the user_labels are never ever used anywhere, and the keys are used in the computer specific internal ordering.

I would appreciate if you could reconsider these user labellings -- thanks! I will put #22381 on hold until this is solved in one way or another.

CC: @sagetrac-gmoose05 @stumpc5 @sagetrac-benstrasser @Etn40ff @fchapoton @thecaligarmo

Component: combinatorics

Keywords: cluster

Issue created by migration from https://trac.sagemath.org/ticket/24917

5ca01b39-a018-4e16-837c-9d1ad4ceef28 commented 6 years ago
comment:1

You may have found a bug with using user_labels as dictionaries. This part of the original code was written by aram.dermenjian, who I am CC-ing here.

It is quite possible that the current implementation can be optimized, and I have no problems with storing the labels separately per-se. I do believe that it is important to allow users to call ClusterQuiver on a (labeled) DiGraph, and to declare arbitrary vertices of the input DiGraph as "frozen." This is relevant in the code for #19408 (which we should really finish at some point...). Some issues to consider are that, if a given ClusterQuiver is created using frozen vertices, the frozen vertices will not naturally be the final m vertices in the Digraph's list of vertices. Also, we want to ensure that the user can mutate at a vertex in the ClusterQuiver, not just at certain indices. I have no problem with using default labeling for the internal digraph as long as the above functionality remains.

I'll now address your "main reason." We did make an effort to ensure that fast, internal computations (such as mutation_class if I recall correctly) are done on copies of the cluster quiver with default labeling, so many of these operations shouldn't be significantly slower. If there is a particular internal method which concerns you, it might be easier to make your changes in that method. I believe the debate about matrix mutation vs digraph mutation debate predates this ticket. If I am not mistaken, matrix mutations are used in many of the internal methods, but the "mutate" command also does digraph mutations. It may be better to use matrix mutations exclusively in the "mutate" command, and your proposed change could make this easier to implement.

Unfortunately, I will not have time to address this in the foreseeable future. It's fine with me if you want to make these changes. Otherwise, I can keep my ear to the ground to see if anyone else is willing to help out.

stumpc5 commented 6 years ago
comment:2

@bhutz: I am totally fine with creating a cluster seed from a vertex and edge labelled digraph G = (V,E) with frozen being a subset of V declared to be frozen.

But I also believe that this input should be parsed internally to a quick to access and use structure: Provide a list Vs of all vertices such that the first n are mutable and the final m are not mutable and a dict Vs_inv telling which vertex goes to which index. Now, the only place where your input labelling is stored is this list and the dict, and nowhere else. In particular, the _digraph should now be a graph using vertices 0 ... n+m-1.

If you mutate at a vertex v, you first look up v in Vs_inv to see which internal number i it has, and then mutate internally at i.

Important here: the lookup creates an overhead that is currently carried around also for the standard labelling. I wonder how much this is, because it can easily bypassed if you have a flag that tells if the quiver has standard vertex labelling or not.

I was not involved in creating this functionality, nor in its reviewing process. But I would still hope that this overhead is currently negligible.