ComplexNetworks-DCC-FCUP / fase

FaSE - Fast Subgraph Enumeration
7 stars 1 forks source link

The reported sub graphs have their edges inverted. #1

Closed theagilehacker closed 8 years ago

theagilehacker commented 8 years ago

When I compare the output from FaSE to that of gtrieScanner I noticed the reported sub graphs did not match. Looking closer at the output for small test graph, and drawing on a whiteboard that it appears that the edges are being inverted.

So for example the simple graph: 1 2 1 3 1 4

reports: Exact Enumeration, no Sampling done

    Detailed Output:

0000100010001000: 1 occurrences

which is the equivalent of the network: 2 1 3 1 4 1

gangsterveggies commented 8 years ago

The output of FaSE and gtrieScanner isn't necessarily exactly equal, but should represent the same subgraphs. If I understood you correctly (and looking at your example) this is your issue.

What happens is that both FaSE and gtrieScanner use nauty (http://pallini.di.uniroma1.it/) to handle the isomorphism calculations, but they use it differently. What FaSE reports are canonical representations that come directly from nauty and are simply adjacency matrices (maybe I should have made the output more clear, I'll probably add something soon). The gtrieScanner also reports canonical representations that are simply adjacency matrices, but their adjacency matrices can be different (even though they represent the same graph). They are different because: I think the source code of the gtrieScanner uses the old nauty version (but I'm not 100% sure); the gtrieScanner may change the preferred representation in order to compress the underlying gtrie the most.

It is tough to ensure both output the exact same representations, because of the algorithms' requirements, but they should be equivalent isomorphism-wise. Does this match your question or did I totally misunderstood you?

theagilehacker commented 8 years ago

I understand that the adjacency matrix can be different (based on node ordering) however with the simple example I gave the adjacency matrix should have one node with three outgoing edges so the the 1s should be next or very close to each other. I was expecting an output that looked something like:

0111000000000000 or 000010110000 or 0000000011010000 or 0000000000001110

what I see is the transpose of the first one :) which is 0000100010001000.

Then again I am not the expert :)

gangsterveggies commented 8 years ago

OK, that is a different issue, I did misunderstand you then :P

I reran FaSE with the sample graph you provided and I got the following result (for a size 4 enumeration):

0001000100011110: 1 occurrences

Which is the actual graph (0001 / 0001 / 0001 / 1110). How did you ran the code for that example (what was the command line), for both FaSE and gtrieScanner?

theagilehacker commented 8 years ago

Right sorry I forgot to mention I ran with the directed graph option :)

gangsterveggies commented 8 years ago

You are absolutely right! It turns out there was a bug in the code that converted the FaSE internal representation to the nauty one that "swapped" the direction of the edges. I fixed it on commit 7807b70b8d3a867794c579449bc62ee2dfa98043.

Thank you very much for the feedback! That is the kind of thing that after a million times running the code and checking occurrence numbers and whatnot one loses track of, so I would probably only notice that in very specific situations. Nice catch :D

theagilehacker commented 8 years ago

Awesome! :) thank you too for looking into it :). Great work by the way.