Digraph6 file format incompatibility

jukkakohonen commented 5 years ago

Writing a Digraph6 (.d6) file from nauty, and then trying to read it with IteratorFromDigraphFile, fails with the following error.

gap> iter := IteratorFromDigraphFile("test.d6");
Error, Digraphs: DigraphFromDigraph6String: usage,
the input string <s> is not in valid digraph6 format, at /home/local/kohonen/gap-4.9.2/pkg/digraphs-0.12.1/gap/io.gi:827 called from
decoder( line ) at /home/local/kohonen/gap-4.9.2/pkg/digraphs-0.12.1/gap/io.gi:253 called from
file!.coder( file ) at /home/local/kohonen/gap-4.9.2/pkg/digraphs-0.12.1/gap/io.gi:198 called from
<function "IteratorFromDigraphFile">( <arguments> )
 called from read-eval loop at *stdin*:5
type 'quit;' to quit to outer loop

To reproduce, let the file "test.d6" contain (for example) this single line: &O?????C??O?@??C??O?@??C??O?@??C??O?@??C??o??

The problem seems to be this. Nauty (for example, function writed6) writes digraphs so that each line begins with an ampersand (&). Indeed that is how nauty documentation [http://pallini.di.uniroma1.it/nug27.pdf] (page 81) says. However, IteratorFromDigraphFile requires the lines to begin with a plus sign (+).

If I change the & into a + in the file, then the digraph is read in just fine. How did this incompatibility come to be? Did the digraph6 format change at some time?

Versions: GAP 4.9.2 Digraphs 0.12.1

james-d-mitchell commented 5 years ago

Thanks for the report @jukkakohonen, we will investigate and let you know.

mtorpey commented 5 years ago

I think I can explain this. Unfortunately, the "digraph6" format used in this package is different from the "digraph6" format used in Nauty. When Julius and I wrote this code (probably late 2014) we weren't aware of an established format for digraphs, so we invented a new one based on graph6 and just called it digraph6. I don't know whether Nauty's "digraph6" format existed back then, or whether it was documented, but we certainly didn't know about it.

Given this, I'm rather surprised that just changing the + symbol to a & is sufficient to make everything work. When we designed the file format, we wanted it to be fairly simple to understand, but it would still be an astonishing (and delightful) coincidence if we just happened to invent the exact same format save for a different indicator symbol. If these formats really are so close to each other, then I suggest we should allow either initial symbol when reading, and change to writing & to maximise compatibility. But if there are other subtle differences, then we may have to do something more complicated.

I'll investigate further this afternoon.

wilfwilson commented 5 years ago

Thanks for this @mtorpey. Unfortunately there are at least some non-trivial differences between the formats; they are not equal. From McKay's description of the digraph6 format:

   Suppose n=5 and G has edges 0->2, 0->4, 3->1 and 3->4.

   x = 00101 00000 00000 01001 00000

   Then N(n) = 68 and
   R(x) = R(00101 00000 00000 01001 00000) = 73  63  65  79  63.
   So, the graph is  38 68 73  63  65  79  63.

38 68 73 63 65 79 63 in decimal is &DI?AO?, which replacing the & by + is +DI?AO?

gap> OutNeighbours(DigraphFromDigraph6String("+DI?AO?"));
[ [  ], [ 4 ], [ 1 ], [  ], [ 1, 4 ] ]

This is not the digraph described in the example. However, it seems to be the reverse:

gap> InNeighbours(DigraphFromDigraph6String("+DI?AO?"));
[ [ 3, 5 ], [  ], [  ], [ 2, 5 ], [  ] ]

(Not that Digraphs uses 1-indexing, not 0-indexing). Perhaps your format works the same as Nauty's, except that the edges are reversed? This would also be a bit of a coincidence, but it would be worth establishing whether that is the case.

james-d-mitchell commented 5 years ago

I'm not sure if this is related but recently I tried inputting the graph at:

https://math.stackexchange.com/questions/1561029/a-triangle-free-6-chromatic-graph-with-44-vertices

which claims to be in graph6 format but gives the error:

Error, Digraphs: DigraphFromGraph6String: usage,
the input string <s> is not in valid graph6 format, at /Users/jdm/gap/pkg/digraphs/gap/io.gi:780 called from
<function "unknown">( <arguments> )
 called from read-eval loop at *stdin*:48
type 'quit;' to quit to outer loop

when I try to read it into Digraphs.

mtorpey commented 5 years ago

I've now investigated, and it's as Wilf suspected. Brendan McKay's specification for digraph6 is just the reverse of ours: he reads across the rows of the adjacency matrix, whereas we read down the columns of it. His way is more natural, whereas ours is closer to what graph6 does. Apart from this one change, it's exactly the same algorithm – and it's a case of modifying one line of code to make ours behave like theirs.

I'll make a pull request shortly to resolve this. My plan is:

Modify the writing function (Digraph6String) to write digraphs in the Nauty format, by reading across rows instead of down columns, and using & instead of +.
Modify the reading function (DigraphFromDigraph6String) to accept digraphs in the Nauty format, while also accepting our present format for the sake of backwards compatibility. This can be recognised easily due to the strings being labelled with + and &, and shouldn't slow things down at all or require any duplication of code.
Check anywhere strings are automatically recognised (e.g. ReadDigraphs), and make sure they recognise & symbols properly.
Add an info warning alerting users when they read a + string, so that they know the format is obselete and incompatible with Nauty.
Contact Brendan McKay to make him aware of the situation, and draw his attention to the existence of our disparse6 format, which so far doesn't seem to have an equivalent in Nauty.

If anyone's interested, Brendan McKay's specification is actually the document we used when we were implementing graph6 and sparse6, and there was nothing there about digraph6 when we were doing so. So his documentation for digraph6 is actually newer than our implementation in Digraphs, though probably only by a few months (it was last updated in June 2015).

mtorpey commented 5 years ago

(I'm pretty sure the issue @james-d-mitchell raises is unrelated, but I'll check that too)

james-d-mitchell commented 5 years ago

Take a look at:

https://doc.sagemath.org/html/en/reference/graphs/sage/graphs/digraph.html

They are using digraph6 format too, might want to check if they have disparse6 or whatever too.

wilfwilson commented 5 years ago

Thanks for investigating @mtorpey! Sounds good.

jukkakohonen commented 5 years ago

Thanks everybody! This was an interesting chain of coincidences, starting from both parties choosing exactly the same name "digraph6" independently and almost simultaneously, and then choosing almost the same format. The plan seems good.

wilfwilson commented 5 years ago

This is resolved by #162, thanks @mtorpey.

mtorpey commented 5 years ago

I think the main bug is resolved, but I still need to contact Brendan McKay, which I might do on Wednesday afternoon after documenting disparse6 a little better.

wilfwilson commented 5 years ago

Remember to contact Brendan please @mtorpey, if you haven't already. This is now resolved in the newly released v0.15.0

mtorpey commented 5 years ago

Thanks for the reminder – my plan is to do this next Wednesday.

digraphs / Digraphs

Digraph6 file format incompatibility #158