pchampin / sophia_rs

Sophia: a Rust toolkit for RDF and Linked Data
Other
228 stars 24 forks source link

[merged] Check Isomorphism #73

Closed MattesWhite closed 4 years ago

MattesWhite commented 4 years ago

This PR adds a function to check whether two Graphs are isomorphic equivalent regarding blank nodes.

Supports: #72

This more or less resembles the algorithm used in oxigraph by @Tpt. However, I had to adapt it to sophia's API.

The algorithm

To check if two RDF graphs are equal their triples must be compared. However, blank nodes are local to the scope of their graphs. Accordingly, the labels of blank nodes does not matter when we compare two graphs only that they refer to the same thing.

The idea behind the implementation is that for each BN a distinct hash is calculated. if we do this for each BN in the two graphs a mapping from BNs of the first graph to the BNs of the second graph can be derived. If the mapping is applied to the BNs of the first graph the result should be absolute equal to the second graph.

To calculate the distinct hashes for a BN we calculate the hash of the hashes of each triple that contains the BN at any position. To eliminate the dependency from BN labels, BNs are excluded from the hashing.

Should the hashes of BNs collide within a graph the hashes for those BNs are improved. This is done by hashing adjacent triples in the graph, i.e. we traverse further from subjects and objects and include the traversed triples in the hash. Iteratively the hashes are improved until no collisions are left. (In my implementation the traversal is actually limited to 8 steps).

ToDo

Future work

The code should become easier when we introduce a term-trait.


@pchampin sadly I had to add some utility function are those okay?

@Tpt could you verify that my version of the algorithm is correct?

MattesWhite commented 4 years ago

@Tpt I detected an issue in the algorithm when a graph includes redundant blank nodes. The problem is closer described in the function's documentation.

MattesWhite commented 4 years ago

With the latest commit this PR is complete and ready to be merged.

MattesWhite commented 4 years ago

Does anyone know/understand why Travis CI stuck?

pchampin commented 4 years ago

Does anyone know/understand why Travis CI stuck?

Can you elaborate?

Note that I don't have a strong feeling between Travis and Github CI. It's just that I'm no expert in either one.

MattesWhite commented 4 years ago

Well if I look into Travis it states that the check started about 4 days ago and is still queued. But no particular test is being executed so far. However, I ran the tests, clippy and fmt locally so non of them should fail.

Submitting another commit just to restart Travis wrong seems to me.

pchampin commented 4 years ago

Well if I look into Travis (...)

this is actually Github's rendering of Travis's report. If you click on the link at the bottom, you are lead there which shows that all tests have passed 4 days ago.

So it seems to be a glitch on Github's side. Could this be intentional to lead us to migrate to GH CI? :smiling_imp:

MattesWhite commented 4 years ago

As the PR is already merged and only Travis is still doing silly things I clean up by closing this PR.