Integrate HermiT reasoner

gaurav commented 7 years ago

Added a new Java-based command-line reasoner that writes out a set of N-Triples (n3) linking individuals within a phylogeny (expressed in OWL as RDF/XML) to the phyloreferences they match. The phyloreferences are written in OWL Manchester syntax as a proof of concept, but I can change that to Turtle pretty easily. I also added a Python script to test this by comparing the matched nodes with those in another OWL class. If this works, it should be possible to add new phyloreferences against this particular phylogeny just by adding new classes to pg_2357.phylorefs.omn.

gaurav commented 7 years ago

@hlapp We could, but reasoner.jar does a bunch of weird things:

1. It combines two input files: the one containing the phylogeny and the one containing the phyloreferences. This makes sense to me, since in my head they'll eventually come from different sources (phylogenies from Open Tree of Life or user input, and phyloreferences from PhyloRegnum.org). Of course, there are other ways of combining them, such as inserting the phyloreferences into the RDF/XML file, but I think the way reasoner.jar does it (loading them from different files) is pretty elegant, though very specific to what we're doing.
2. The information it currently produces (class membership) is only a small part of all the inferences it generates. This was what we were interested in for phyloreferences, but we could include other information as well, which might be worth testing with SHACL (e.g. do any inferred has_Descendant relationships end up forming a loop, say).
3. It returns (some) inferred triples, but there are other ways to return that information -- as a list of individuals belonging to each class, say. If we were to rewrite our test suite in Java, then we wouldn't have to move the inferred triples around, either from here or from testShacl.jar -- both could be returned as OWLOntology objects or Graphs that the rest of the test suite could test directly.
4. It produces a list of triples containing information on class membership for each individual in the ontology, but it is libreasoner.py that actually validates them, by comparing the list of individuals in class X with those in class X_expected; any difference is counted as a failure of this test. If we split reasoner.jar out, would that include this functionality, or should that be part of our test suite?
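The loop check mentioned in the list above as a candidate for SHACL could also be prototyped in plain Python. This is only a rough sketch under some assumptions: the inferred triples are available as simple N-Triples text with IRI subjects and objects, and the predicate IRI `http://example.org/has_Descendant` is a made-up placeholder, not the real one from our ontology. The function names are hypothetical too.

```python
import re
from collections import defaultdict

# Hypothetical predicate IRI; the real one depends on the ontology in use.
HAS_DESCENDANT = "http://example.org/has_Descendant"

# Matches simple N-Triples lines whose subject and object are both IRIs.
TRIPLE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def descendant_edges(ntriples_text, predicate=HAS_DESCENDANT):
    """Collect subject -> {objects} edges for the given predicate."""
    edges = defaultdict(set)
    for line in ntriples_text.splitlines():
        match = TRIPLE.match(line.strip())
        if match and match.group(2) == predicate:
            edges[match.group(1)].add(match.group(3))
    return edges


def find_loop(edges):
    """Return one loop as a list of nodes, or None if there is no loop."""
    unvisited, in_progress, done = 0, 1, 2
    state = defaultdict(int)

    def visit(node, path):
        state[node] = in_progress
        path.append(node)
        for child in sorted(edges.get(node, ())):
            if state[child] == in_progress:
                # Reached a node that is already on the current path.
                return path[path.index(child):] + [child]
            if state[child] == unvisited:
                found = visit(child, path)
                if found:
                    return found
        path.pop()
        state[node] = done
        return None

    for node in sorted(edges):
        if state[node] == unvisited:
            found = visit(node, [])
            if found:
                return found
    return None
```

The search is recursive, so a very deep phylogeny could hit Python's recursion limit; an explicit stack would avoid that.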

So, I think there are three possible things we could do to spin off reasoner.jar into its own thing:

1. As a software tool to read one or more ontologies, reason over them, and produce class membership as a set of triples. This is pretty much exactly what it does now, but with the bit about combining phyloreferences separated out into its own thing.
2. As a testing tool to read one or more ontologies, reason over them, and then validate class membership by comparing individuals belonging to class X with class X_expected. The output will be empty if the test passes; otherwise it will report the differences between the two classes. We can then use this testing tool in our test suite!
3. As a phyloreferencing tool that reads in a phylogeny as RDF/XML (or maybe even as a Phylip/Nexus/NeXML file?) and phyloreferences as an ontology, reasons over them, and then converts the sets of nodes matched by each phyloreference back into individual phylogenies for output. This would package all the phyloreferencing reasoner components into a "black box" that takes in phylogenies and OWL at one end and outputs phylogenies at the other. This would duplicate a lot of our eventual website's functionality, however, so it is probably overkill for the test suite.
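To make the second option (validating class X against X_expected) a bit more concrete, here is a minimal sketch of the comparison in Python. It is not the actual libreasoner.py code. It assumes the reasoner output is N-Triples text with one `rdf:type` triple per class membership, and that expected classes are recognised by an `_expected` suffix on the class IRI; the function names are made up for illustration.

```python
import re
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
# Assumed naming convention: class X is checked against class X_expected.
EXPECTED_SUFFIX = "_expected"

# Matches simple N-Triples lines whose subject and object are both IRIs.
TRIPLE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def class_members(ntriples_text):
    """Map each class IRI to the set of individuals typed with it."""
    members = defaultdict(set)
    for line in ntriples_text.splitlines():
        match = TRIPLE.match(line.strip())
        if match and match.group(2) == RDF_TYPE:
            members[match.group(3)].add(match.group(1))
    return members


def membership_differences(members):
    """Return report lines; an empty list means every X matches X_expected.

    Only classes that have at least one X_expected member are checked,
    because an empty X_expected never appears in the triples at all.
    """
    report = []
    for cls in sorted(members):
        if not cls.endswith(EXPECTED_SUFFIX):
            continue
        actual_cls = cls[: -len(EXPECTED_SUFFIX)]
        expected = members[cls]
        actual = members.get(actual_cls, set())
        for individual in sorted(expected - actual):
            report.append(f"{actual_cls}: missing {individual}")
        for individual in sorted(actual - expected):
            report.append(f"{actual_cls}: unexpected {individual}")
    return report
```

Printing the returned lines, one per line, would give the behaviour described above: no output when the test passes, and a list of differences otherwise.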

I'm probably overthinking this, so option 1 is probably the best option, unless you think the OWL class membership validation stuff is worthy of becoming its own release, in which case option 2 is probably best. If you'd like me to write this up as a blog post, let me know!

hlapp commented 7 years ago

I am going to merge this, but I think that between my points and yours there is both evidence and merit for something that can be a more generally applicable tool. Enough so that I'm not even sure that such a tool (for example, one for combining a number of ontologies and emitting a pre-reasoned transitive closure) doesn't exist already in some form; for example, I believe @balhoff has written a similar tool for Phenoscape or FEED.

balhoff commented 7 years ago

@hlapp @gaurav you may be interested in ROBOT if you haven't tried it already. It is superseding use of owltools for various release processing jobs for OBO ontologies.

gaurav commented 7 years ago

@balhoff Oh wow, thanks SO much -- I think https://github.com/ontodev/robot/blob/master/examples/README.md#reasoning may be exactly what we're looking for!

phyloref / phylo2owl

Integrate HermiT reasoner #17