fujaba / org.fujaba.graphengine

This project's aim is to build a graph engine, that is able to build and compare graphs - and to match patterns and apply actions on the graph, to effectively use it as a graph-transformation framework.
BSD 3-Clause "New" or "Revised" License
1 stars 0 forks source link

normalization doing things it shouldn't do #31

Closed hoechp closed 7 years ago

hoechp commented 7 years ago

found some bugs related to creating the NF graph:

example graph:

        Graph b = new Graph();
        Node b1 = new Node();
        Node b2 = new Node();
        Node b3 = new Node();
        Node b4 = new Node();
        b1.addEdge("a", b2);
        b1.addEdge("b", b2);
        b3.addEdge("a", b4);
        b3.addEdge("b", b4);
        b.addNode(b1, b2, b3, b4);

before:

"edges":[{"a":[3]},{"b":[3]}]

after:

"edges":[{"a":[3,3,3,3,3,3,3,3]},{"b":[3,3,3,3,3,3,3,3]}]
hoechp commented 7 years ago

normalization seems to have serious bugs and it's almost a wonder it was able to create a correct reachability graph.

edit: the bug was a mapping from serializations of nodesorttrees to the nodesorttrees. having two identical subgraphs within a graph, their nodesorttree-serializations are identical, resulting in all sorts of bugs. this can, as far as I understand, only happen with multiple occurancies of identical subgraphs within the graph (that are not connected to each other).

hoechp commented 7 years ago

Okay, I found a solution, I think. It's pretty brainf*, to think through this - but I identified two problems that are already shown by the two examples we used at the bachelor colloquium. Either having multiple identical sub-graphs that are not connected to each other, or otherwise within a connected (sub-)graph having multiple** sets of nodes that are interchangable with each other - then (in both those cases) the node-sort-trees for multiple nodes are exactly the same, resulting in a basically random sort within those identically represented nodes. though this can lead into different serializations of the complete graph. (additionally this caused further bugs along the way)

The solution:

This will result in a completely different order, but its consistent even for the worst kind of graph. I'm pretty sure at least and I'll implement and test it shortly.

hoechp commented 7 years ago

Okay, this worked. Now the examples from friday's colloquium are running as expected - no known bugs are left. And the performance didn't noticably decrease, either. I will add more hellish tests for this feature shortly - hopefully covering most worst-case situations for this algorithm.

edit: While solving this, I had an idea on how to utilize those nodes, that have other nodes within the graph, that have the exact same 'node-sort-tree' (they are practically 'isomorph' to each other - well, it's a bit more compilcated, though). For example, this can be used in pattern matching - see #32.

hoechp commented 7 years ago

This is a pretty wicked example for this sort of algorithm: 20170305_173632 To make it even harder, this can be included multiple times as sub-graphs that ain't connected to one another.

The implementation now works in principle, but currently it's possible in some situations, that the sorting can get stuck in an infinite loop. Which it does for this example (but not the smaller one from colloquium). I'm pretty sure, this can be fixed, though.

hoechp commented 7 years ago

One solution would be to detect loop cycles and pick a certain sort within the loop cycle, for example the one where the whole sort itself is lexicographically, in respect of the list of serialized node-sort-trees, the lowest. The reason why this can't be done earlier, is that the IDs of the nodes within the node-sort-trees are changing while sorting - if a cycle is reached, though, this can be done. The biggest drawback here is, that even know these cycles are occuring only in very very rare situations, still for cycle detection the whole history of orders must be maintained, which is likely to noticably slow the whole algorithm down.

hoechp commented 7 years ago

Okay, it's now all fixed - and not noticably slower than before.