sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.44k stars 481 forks source link

Nauty interface for isomorphism checking and automorphism group computing #25506

Open 3b6bf08f-4145-4a24-9d2b-1bd9845ff758 opened 6 years ago

3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago

I used the pynauty library (https://web.cs.dal.ca/~peter/software/pynauty/html/intro.html#), which has GPLv3+ license by the way, to interface with our pynauty package and use it to check for isomorphism and compute automorphisms through the standard sage methods. In particular, I am actually copying the package from upstream, patching it and recompiling it, since pynauty wrappers require to be compiled with nauty's own source code. A lot of modifications where required to make the transition from sage's own graph representation to one that nauty was able to understand. At the moment this interface doesn't support arbitrary edge labels, but it supports multi-edge graphs through an helper function that converts them in equivalent single-edge graphs (http://pallini.di.uniroma1.it/Guide.html section 14) that I put in sage.graphs.multiedge for everyone to use. The efficiency is not exactly spectacular, since the graph needs to be relabeled with labels going from 0 to n-1 (where n is the number of vertices), and thus a copy needs to be made and the relabel function needs to be called. If, by chance, the graph is already labelled in that way, there is an option to tell the interface not to relabel the graph. If the relabeling is found to be too inefficient, it could be implemented in the C++ part of the interface, but then a lot of the conversions done to return sensible results (i.e. return automorphism group generators listing the actual nodes' labels, and not arbitrary labels in {0..n-1}) would have to be moved too. Modifications were made to also output results in the exact same format of the standard sage methods.

Last, I noted that a version of the same algorithm I used to convert multi-edged graphs to their single edge version was implemented in the Bliss module (but only for Bliss graphs), in #24924; it might be worth it to change it to use this more general version, if needs allow it.

CC: @dimpase @stumpc5 @dcoudert

Component: graph theory

Keywords: graphs nauty pynauty automorphism isomorphism, gsoc2018

Branch/Commit: u/Vaush/nauty_interface_automorphism_isomorphism @ a907f60

Issue created by migration from https://trac.sagemath.org/ticket/25506

3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago

Branch: u/Vaush/nauty_interface_automorphism_isomorphism

3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago

Commit: 80a9212

3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago
comment:2

Notice that it still lacks properly formatted comments, so it's still not ready for review


New commits:

4c1223eFixed python 3.6.1 crypt issue on Fedora 28
b20fd48Merge remote-tracking branch 'trac/u/cpernet/fflas_and_linbox_broken_with_gcc_8_1_0' into t/25391/sagemath_fails_to_build_on_on_fedora_28_with_gcc_8_1
473f214Merge remote-tracking branch 'trac/u/jdemeyer/upgrade_to_python_2_7_15' into t/25391/sagemath_fails_to_build_on_on_fedora_28_with_gcc_8_1
5be7adcMerge branch 'master' into t/25391/sagemath_fails_to_build_on_on_fedora_28_with_gcc_8_1
d1e2f76pyNauty interface with sagemath completed. TODO: actually use pynauty.isomorphic in is_isomorphic, and document everything
80a9212Interface integrated with the standard sage methods for automorphism groups and isomorphism checking
dimpase commented 6 years ago

Changed keywords from graphs nauty pynauty automorphism isomorphism to graphs nauty pynauty automorphism isomorphism, gsoc2018

dcoudert commented 6 years ago
comment:4

Thanks for the contribution.

I'm not expert in nauty or creating interfaces, but instead of creating a new file multiedge_singledge_conversion.pyx (which is a weird name), you should create a file nauty.pyx for the interface with nauty, or directly put this method into sage_graph.patch. Furthermore, why are you adding from .multiedge import multiedge_to_singledge to src/sage/graphs/all.py ? It's a very specific method, so I don't see why it should be added to global namespace.

Concerning the method itself, you don't need to convert the graph to a (slow and space consuming) dictionary. You can access the list of multiple edges of the graph with G.multiple_edges(). Furthermore, if the graph has multiple edges, you can get the list of distinct labels using G.edge_label(0, 1)

sage: G = Graph(multiedges=True)
sage: G.add_edges([(0, 1, 0), (0, 1, 2), (1, 2, 1), (2, 3, 1), (2, 3, 2)])
sage: G.multiple_edges()
[(0, 1, 0), (0, 1, 2), (2, 3, 1), (2, 3, 2)]
sage: G.edge_label(0, 1)
[2, 0]
sage: G.edge_label(1, 2)
[1]

Another issue is that you assume edge labels to be integers, but we can have floats, rationals, list, dict, etc.

It's clearly easier to use pynauty, but it's a petty we cannot convert the graph directly to the data structure used by nauty.

3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago
comment:5

Thanks for your input.

Regarding the file with the multiedge conversion, I admit the name is not the best, but I didn't put it inside the nauty interface because it's a general method, it converts a sage multigraph in a suitable sage graph with only single edges, following the method outlined in the provided link. I was thinking of moving it inside generic_graph.pyx, but didn't want to take the decision all on my own. If you think it's better, I'll gladly move it though.

Also, thanks for the suggestion on the method. I don't recall why I chose to use a dictionary, probably because I didn't think to use the multiple_edges -> edge_label combo, and I wanted to be able to quickly find the multiplicity of any edge in the graph. I'll rework that part to avoid transforming the graph into a dictionary.

About the import, it's mostly due to experiments I've been running to understand how the importing works in sagemath. Without that line (or any line about pynauty) in all.py, I had to manually import the module in sagemath, but even with that line I wasn't able to limit what sage imports, getting instead the entire list of all the objects inside the module. If the method is moved in generic_graph.pyx I'll have to remove the entry anyway.

I don't think I ever assume edge labels to be integers, could you show me where I did that? It's probably an error or bug if I did. Actually, that's the reason why I said I can't use edge_labels at the moment, because I'm trying to see if substituting edge labels with integers (to use the technique suggested in section 14 of the nauty guide) would cause any issues.

Lastly, while I started with converting the graph in pynauty format and then letting pynauty do its thing, in this version the sage graph is actually converted directly into nauty format (but this is done for every call to automorphism_group or is_isomorphic, so there's that). Nonetheless, the main skeleton of the module, all the code that manages calling nauty and setting up data structures, etc. is still pynauty's, I just switched out its graph format for sage's and set the output format accordingly, so that's why I still call it pynauty, and it's still based on its source code.

If you have any other suggestions or issues, please tell me, I'll be working on the problems you highlighted in the meantime.

dcoudert commented 6 years ago
comment:6

Replying to @sagetrac-Vaush:

Thanks for your input.

Regarding the file with the multiedge conversion, I admit the name is not the best, but I didn't put it inside the nauty interface because it's a general method, it converts a sage multigraph in a suitable sage graph with only single edges, following the method outlined in the provided link. I was thinking of moving it inside generic_graph.pyx, but didn't want to take the decision all on my own. If you think it's better, I'll gladly move it though.

Yes, do it.

About the import, it's mostly due to experiments I've been running to understand how the importing works in sagemath. Without that line (or any line about pynauty) in all.py, I had to manually import the module in sagemath, but even with that line I wasn't able to limit what sage imports, getting instead the entire list of all the objects inside the module. If the method is moved in generic_graph.pyx I'll have to remove the entry anyway.

I don't think I ever assume edge labels to be integers, could you show me where I did that? I

You call get_bits_list(label) whic itself calls bin(x). So if you give anything else than an integer, you get an error.

3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago
comment:7

Replying to @dcoudert:

Replying to @sagetrac-Vaush:

Thanks for your input.

Regarding the file with the multiedge conversion, I admit the name is not the best, but I didn't put it inside the nauty interface because it's a general method, it converts a sage multigraph in a suitable sage graph with only single edges, following the method outlined in the provided link. I was thinking of moving it inside generic_graph.pyx, but didn't want to take the decision all on my own. If you think it's better, I'll gladly move it though.

Yes, do it.

About the import, it's mostly due to experiments I've been running to understand how the importing works in sagemath. Without that line (or any line about pynauty) in all.py, I had to manually import the module in sagemath, but even with that line I wasn't able to limit what sage imports, getting instead the entire list of all the objects inside the module. If the method is moved in generic_graph.pyx I'll have to remove the entry anyway.

I don't think I ever assume edge labels to be integers, could you show me where I did that? I

You call get_bits_list(label) whic itself calls bin(x). So if you give anything else than an integer, you get an error.

I see how that might be misleading, but I think I only call that function on a dictionary that I created, that associates each edge in the graph with its multiplicity, so I am sure that "label" is an int because that's the multiplicity of the edge

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from 80a9212 to 5f6a20a

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

5f6a20aMoved function multiedge_to_singledge to the GenericGraph class, changing its name in to_singledge and making it so it doesn't convert the graph into a dictionary anymore
dcoudert commented 6 years ago
comment:9

Could you explain what's the goal of this method and what it does. You will have to add a description of the method anyway. Also, shouldn't original edge labels be added somewhere ?

3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago
comment:10

Replying to @dcoudert:

Could you explain what's the goal of this method and what it does. You will have to add a description of the method anyway. Also, shouldn't original edge labels be added somewhere ?

So the entire goal of this method is to account for the impossibility of managing multiple edges (and in the future, edge labels) in a graph with nauty. What it does is construct an auxiliary graph with n log k vertices, where k is the maximum multiplicity of the edges in the original graph, that has no multiple edges and with two peculiar characteristics: -There is a subset of its vertices (returned by the method) such that the auxiliary graph's automorphism group's action on them is the same as the one on the original graph (that is, if one keeps every vertex not in the subset in a separate partition, and then removes every mention of these additional vertices from the automorphism group of the auxiliary graph, the resulting automorphism group is equal to the original graph's automorphism group -Given two (multi)graphs, they are isomorphic if and only if their auxiliary graphs are isomorphic

So right now it works with multiplicities, I am trying to find a way to include arbitrary edge labels, but since it's an auxiliary graph I didn't see the point in adding the edge labels. I can do it, but they would just become a nice ornament, not really used for anything (at least not in nauty).

dimpase commented 6 years ago
comment:11

As far as I understand, it's more or less the code written by stumpc5 - although for bliss rather than for nauty, and the idea is to make it more general, so that it can be used for nauty as well as for bliss as "backends" to compute automorphisms etc.

dcoudert commented 6 years ago
comment:12

Right, but with bliss we are at Cython/C level so it might be faster. We'll see if the performances are OK.

dimpase commented 6 years ago
comment:13

I don't really know how pynauty, the interface used, works (Dario, the ticket's author, knows better). My limited understanding that it's also an interface on the level of C, thus it should be fast.

3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago
comment:14

Actually, while the conversion to a nauty graph and the computation is in c, the relabeling and eventual conversion of the multigraph are done in python. I could easily move the relabeling to C++, the multigraph part would be a little more difficult, but still doable. I don't know if this would be that much of an improvement in performance though, that's why I didn't do it in the first place.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from 5f6a20a to a05864f

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

58da594Added documentation and doctests
a05864fFixing exception test bug
3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago
comment:16

So, I commented and added doctests. I run into a quite nasty bug though, which I couldn't really pinpoint or fix, I could just find a workaround. I submitted two commits, the first adds the docs, the second "fixes" the bug, so you can check for yourself. What happens is that, when running tests only, after any test checking for exception raising (such as the test in automorphism_group() that checks a KeyError is returned if a wrong partition is passed as a parameter), any call to to_singledge() ends in a segfault, caused by a call to PyObject_CallMethodObjArgs(...). Again, the segfault is raised only in this specific circumstances. I couldn't find a way to explain the error, but I could fix it by substituting that call with a call to PyObject_CallMethod(...). It's not necessary to make the code work, but if anyone has any insight on the cause (or on how to run gdb interactively on the c code of a third party package during a sage testing session), I'd be happy to hear it. Let me know if there's anything wrong with the docs and doctests, since this is the first time I write them.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from a05864f to 6021723

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

f5567acRemoved the need to modify the upstream tarball
beb64e9PyNauty tests should be optional
a1c6572The nauty interface automorphism (and isomorphism, but it's not tested yet) now allows arbitrary edge labels AND multiple edges, even at the same time, though differing edge labels on one multi edge gives flaky results
6648f64Partial changes
6021723Fixed several bugs, added doctests and modified documentation
3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago
comment:18

So, dimpase made me notice that the code I wrote had a few issues for people who don't actually have pynauty installed or even downloaded. I modified the code so that there's no need to modify the upstream tarball anymore, thus it can simply be downloaded from https://web.cs.dal.ca/~peter/software/pynauty/pynauty-0.6.0.tar.gz and put in the upstream folder; still, could this tarball be added in the repositories, so that it can be downloaded automatically by sage? After this, I also made the tests about pynauty optional, since I forgot the first time around. Finally, now this interface supports both multiple edges and labelled edges, even at the same time (but having a multiple edge with different labels can produce weird results). Right now there should not be any more issues with this branch, so I'll move on to another contribution; I'll leave it like this for a couple of days, so if anyone spots any errors I can fix them, then I'll ask for review, I hope.

dimpase commented 6 years ago
comment:19

the tarball will be mirrored once the ticket id merged.

3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago

Dependencies: #25391

dimpase commented 6 years ago
comment:21

Replying to @sagetrac-Vaush:

Why do you need this dependence? This is potentially misleading, and won't get this ticket in before #25391. This is certainly not what we want.

3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago
comment:22

I see, I added it because on my system I needed #25391 to compile sage, so the branch is initially merged with the one from that ticket.

Since the branches are merged, and since I can't have them not merged because otherwise sage won't compile at all for me, I thought I had to list it as a dependency. Is there another way to go about it?

dimpase commented 6 years ago
comment:23

Replying to @sagetrac-Vaush:

I see, I added it because on my system I needed #25391 to compile sage, so the branch is initially merged with the one from that ticket.

Since the branches are merged, and since I can't have them not merged because otherwise sage won't compile at all for me, I thought I had to list it as a dependency. Is there another way to go about it?

Sure, just don't merge these branches on trac, only merge them on your machine.

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

79488e5pyNauty interface with sagemath completed. TODO: actually use pynauty.isomorphic in is_isomorphic, and document everything
dcb29f0Interface integrated with the standard sage methods for automorphism groups and isomorphism checking
f2cd759Moved function multiedge_to_singledge to the GenericGraph class, changing its name in to_singledge and making it so it doesn't convert the graph into a dictionary anymore
6d18761Added documentation and doctests
052f668Fixing exception test bug
f825522Removed the need to modify the upstream tarball
7c40f4aPyNauty tests should be optional
56defebThe nauty interface automorphism (and isomorphism, but it's not tested yet) now allows arbitrary edge labels AND multiple edges, even at the same time, though differing edge labels on one multi edge gives flaky results
76fb654Partial changes
bf5148bFixed several bugs, added doctests and modified documentation
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from 6021723 to bf5148b

3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago
comment:25

Ok, I just deleted the old branch, and created a new one that doesn't merge with ticket #25391.

If anyone has checked out the branch previously, could you please delete your copy and check it out again.

dimpase commented 6 years ago
comment:26

What exactly the need for that mega-patch of build/pkgs/pynauty/patches/sage_graph.patch ? What does "recompile with nauty source code" mean?

3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago

Changed dependencies from #25391 to none

3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago
comment:28

Replying to @dimpase:

What exactly the need for that mega-patch of build/pkgs/pynauty/patches/sage_graph.patch ? What does "recompile with nauty source code" mean?

The mega patch is my work, basically all the changes I made to pynauty so that it could interact with sage graphs directly, use multigraphs and labelled graphs, output automorphism groups in a decent format, etc. Without it the package would be exactly like the one downloaded from upstream, which is good, but not really useful for Sage. About the nauty source code part, basically the way pynauty was developed by its original creator is that it compiles its own version of nauty (and probably uses it internally, i'm not really sure), so it needs nauty's source code instead of already compiled executables; to do this, I take the source code out of Sage's nauty and use that to compile pynauty.

dcoudert commented 6 years ago
comment:29

I have the following comments on method remove_labels:

for el in edge_labels:
    if not el in edge_labels_dict:
        edge_labels_dict[el] = bin(counter)[2:][::-1]
        counter += 1
for u,v,label in G.edge_iterator():
    for idx, bit in enumerate(edge_labels_dict[label]):
        if bit == '1':
            newG.add_edge((u, idx), (v, idx))

In fact we use:

sage: list(enumerate("string"))
[(0, 's'), (1, 't'), (2, 'r'), (3, 'i'), (4, 'n'), (5, 'g')]
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

472d319Small code refactoring
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from bf5148b to 472d319

3b6bf08f-4145-4a24-9d2b-1bd9845ff758 commented 6 years ago
comment:31

Replying to @dcoudert:

I have the following comments on method remove_labels:

  • remove import of floor. Not used.

  • remove import of version_info and method getiterator that you don't use.

  • in the for loop to fill edge_labels_dict, I suggest to change it with

for el in edge_labels:
    if not el in edge_labels_dict:
        edge_labels_dict[el] = bin(counter)[2:][::-1]
        counter += 1
  • then remove method get_bits_list
  • rewrite
for u,v,label in G.edge_iterator():
    for idx, bit in enumerate(edge_labels_dict[label]):
        if bit == '1':
            newG.add_edge((u, idx), (v, idx))

In fact we use:

sage: list(enumerate("string"))
[(0, 's'), (1, 't'), (2, 'r'), (3, 'i'), (4, 'n'), (5, 'g')]
  • Do you really need to set loops=True ? or can't you just do loops=self.allows_loops() to allow loops only if self allows loops ?

Sure, I don't see any particular upside to making this changes, but if you think they're for the better, I don't see any downsides either, so I made them. The only one I kept out for the moment is the loops part, since I don't remember why precisely I set loops always to true, there might be some other reason not immediately obvious that I'm forgetting. I'll come back to that too in a few days.

dimpase commented 6 years ago
comment:34

log() is not in the global namespace. One needs to add

--- a/src/sage/graphs/generic_graph.py
+++ b/src/sage/graphs/generic_graph.py
@@ -1828,6 +1828,7 @@ class GenericGraph(GenericGraph_pyx):
         Algorithm as described in section 14 of Nauty's guide (version 2.6),
         http://pallini.di.uniroma1.it/Guide.html
         """
+        from sage.functions.log import log
         self_edge_labels = self.edge_labels()
         if not [el for el in self_edge_labels if el != None]:
             return self, []

(without it, tests fail in generic_graph.py)

7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Branch pushed to git repo; I updated commit sha1. New commits:

a907f60Added log import
7ed8c4ca-6d56-4ae9-953a-41e42b4ed313 commented 6 years ago

Changed commit from 472d319 to a907f60

fchapoton commented 4 years ago
comment:37

red branch => needs work