ontodev / robot

ROBOT is an OBO Tool
http://robot.obolibrary.org
BSD 3-Clause "New" or "Revised" License
258 stars 70 forks source link

Implement a cycle check over relation graph #1204

Open cmungall opened 2 weeks ago

cmungall commented 2 weeks ago

Background:

Basic modules must be DAGs; see:

https://oboacademy.github.io/obook/explanation/owl-format-variants/

Basic: A variant of Simple, in that it is reduced to only a specific set of relations, such as subClassOf and partOf. Some users that require the ontology to correspond to acyclic graphs, or deliberately want to focus only on a set of core relations, will want to use this variant, see docs). The formal definition of the basic variant can be found here.

Currently we have no check for this. Note this check should be over the whole graph over the restricted set of relations. This is incredibly important for ontologies like GO, Uberon, CL, ENVO. Users lose confidence when we release versions that are non-DAGs (most downstream users use simple non-OWL packages like networkx for this, or they have custom simple graph traversal)

(Of course, if there are no existentials then we get this check for free in that ROBOT already has a check to ensure no two classes are mutually equivalent, any cycles of SubClassOf between named classes entails an equivalence clique)

@balhoff proposed doing this with Souffle

This would take the pressure off doing this in robot and if it Souffle is in ODK maybe this is fine, but it may be overweight for what is a simple graph check

If we were to implement in ROBOT, how would we do it. We already have relation-graph, so I believe it should be a simple ?x ?p ?x check where ?p is either owl:subClassOfor a basic relation.

Another option would be to do in OAK, which is optimized for graph operations.

For reference, the corresponding owltools invocation is:

owltools my.obo --list-cycles --fail-on-cycle

balhoff commented 2 weeks ago

This is better than the one in the Uberon PR: https://github.com/geneontology/go-ontology/pull/28186/files#diff-2df631a546eac0a6b1aed7315dae51f10781001eebf4d0a7815a0595414c1e73

The first version has a flaw that gets into infinite paths when terms have mutual cycles.