owlcs / owlapi

OWL API main repository

OWLReasoner interface design #323

Closed mwm314 closed 9 years ago

mwm314 commented 9 years ago

This may not actually be an issue, but more of a question. Is there a reason why the DAGs that represent the respective hierarchies have the following types associated with them: OWLClass, OWLDataProperty, OWLObjectPropertyExpression?

I could understand if it was: OWLClass, OWLDataProperty, OWLObjectProperty

or even: OWLClassExpression, OWLDataProperty, OWLObjectPropertyExpression

Why shouldn't OWLObjectPropertyExpression be OWLObjectProperty, or if I have it backwards, why shouldn't OWLClass be OWLClassExpression? I'm not sure why one would be an expression and the other one not an expression. Is this a small semantic error, or am I missing the bigger picture?

On a similar note, why would the getDisjointClasses() methods return NodeSets, and not just Nodes?

Thanks in advance for the clarification! Also, I'm new to GitHub, so if clarification questions like these are not kosher, please let me know. However, I felt it was worth asking in case there was some sort of issue/error that could be fixed in the future.

ignazio1977 commented 9 years ago

I believe the use of OWLObjectPropertyExpression is historical, and the actual values returned are always OWLObjectProperty instances. We might tidy it up but changing the OWLReasoner interface takes a lot of thought, given the number of independent third parties which implement it.

About getDisjointClasses, it returns just one NodeSet - I think that's a typo in your post. Back when it was designed, it was felt that the convenience methods on that class were more useful than the simplicity of returning a collection of Nodes (that's my guess, I wasn't around at the time). It's a good question whether this design should be updated. Some minor changes are going in for version 5, to allow for streams to be used in places. With Java 8 default methods, we are in a better position to improve the design without breaking compatibility - if you wish, have a look at version 5 and feel free to point out which bits are annoying. It's great to have opinions from outside the development team, as we are biased towards our own use of the API.

About questions, this tracker is perfectly adequate for them, I'll just add the corresponding label.
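On the NodeSet question above: in the reasoner interface a Node groups entities that are equivalent to each other, so the classes disjoint with a given class generally cannot fit into a single Node. The following is a minimal sketch of that idea using simplified stand-in types (`SketchNode`, `SketchNodeSet` and the class names are all made up for illustration; they are not the OWL API's own `Node` and `NodeSet`), including the kind of flattening convenience method mentioned above.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified stand-ins for the reasoner's Node and NodeSet; NOT the real OWL API types.
final class SketchNode {
    // All names held in one node are equivalent to each other.
    final Set<String> entities;

    SketchNode(String... names) {
        this.entities = new LinkedHashSet<>(List.of(names));
    }
}

final class SketchNodeSet {
    final List<SketchNode> nodes;

    SketchNodeSet(SketchNode... nodes) {
        this.nodes = List.of(nodes);
    }

    // Convenience: every entity in every node, ignoring the grouping.
    Set<String> flattened() {
        Set<String> all = new LinkedHashSet<>();
        for (SketchNode node : nodes) {
            all.addAll(node.entities);
        }
        return all;
    }

    // Convenience: does the named entity appear anywhere in the set?
    boolean containsEntity(String name) {
        return flattened().contains(name);
    }
}

final class NodeSetSketch {
    // Hypothetical answer to "which classes are disjoint with Plant?"
    static SketchNodeSet disjointWithPlant() {
        return new SketchNodeSet(
                new SketchNode("Animal", "Creature"), // assumed equivalent to each other
                new SketchNode("Mineral"));           // not equivalent to either of them
    }

    public static void main(String[] args) {
        SketchNodeSet result = disjointWithPlant();
        System.out.println(result.nodes.size() + " nodes, entities: " + result.flattened());
    }
}
```

A caller that only wants the flat list of classes uses `flattened()`; a caller that cares which of the answers are equivalent to each other keeps the node grouping.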

sesuncedu commented 9 years ago

Answered questions could also be copied to the wiki (or to answers.semanticweb.com?).

OWLObjectPropertyExpression can be property or ObjectInverseOf(property). Some reasoner functions simplify inverses.

Now, DataInverseOf would add some symmetry :)

Related: (and to be moved to a separate issue)

I'm trying to 'wireframe' a bunch of different possible interface designs, then try and kick off a debate on the public lists. I'm tempted to set up a jekyll devblog.

The main APIs are the consumer side, the implementation internals, and the plug-ins (reasoners and parsers are somewhat special, since the rest of the impl side isn't commonly redone; Protege is the main one I can think of).

I am tempted to run a survey since I'm my own IRB (though I'd ask some IRB panelists to review critically).

It would be great to get feedback from people who have just finished teaching or being taught classes that used the API to find out what was annoying / difficult to understand. Of course, it's the people who don't use OWL or the OWLAPI who may be the most important to reach.

APIs in other languages may be suggestive (comments @phillord @cmungall )?

cmungall commented 9 years ago

I don't have any major issues with the reasoner API

For pedagogic purposes, there is the larger issue of what can and can't be done with OWL reasoning, and with the reasoner API. For example, those from a bioinformatics background typically want to ask questions like 'what are the part-of ancestors of hippocampus', which translates to something like SPARQL-DL:

 SELECT ?y WHERE hippocampus SubClassOf part_of some ?y

Which is not directly answerable, e.g. you have to materialize all potential class expressions first.

@hdietze is about to push an extended reasoner interface to GitHub that allows for answering some questions of this form. Jim @balhoff has nice Scala code for the same: https://github.com/phenoscape/scowl

sesuncedu commented 9 years ago

The larger issues are the bigger ones: the whole "what do users want to be able to say, what do they want to be able to ask, and what is their background mental model" thing (DLs are quite unnatural, even to people familiar with (non-DL) logics).

Going the other way, a reasoner might want to be able to indicate how expensive it expects an operation to be (or how big an impact an axiom might have, e.g. disjunction dysfunction, or an axiom that blocks the use of a particular optimization).

Similarly, some reasoners allow different forms of *NA, and different kinds of complete knowledge assertions or assumptions. It would be nice to have standard ways of handling this (some reasoners use annotations to pass in this kind of control information, which seems a bit dodgy).

[One of the things Cyc did well was having a bunch of different reasoners that had to bid for the right to try solving a query.]

matthewhorridge commented 9 years ago

> I believe the use of OWLObjectPropertyExpression is historical, and the actual values returned are always OWLObjectProperty instances.

This is definitely by design and definitely intended. They used to be just OWLObjectProperty objects but were actually changed to OWLObjectPropertyExpression objects after several conversations with some of the main reasoner developers. I will dig out these conversations to remember the fine details, but there are good reasons for this. Also, the reasoners do return inverses (well they ought to).
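One way to read the point about inverses: an object property always has an inverse, whether or not the ontology gives that inverse a name, so a hierarchy restricted to named properties could not mention it. The sketch below illustrates this with simplified stand-in types (`PropExpr`, `NamedProp`, `InverseProp` and the property names are invented for illustration and are not the OWL API classes). It also shows the simplification mentioned earlier in the thread, where the inverse of an inverse collapses back to the named property.

```java
import java.util.List;

// Simplified stand-ins for object property expressions; NOT the real OWL API types.
interface PropExpr {
    PropExpr inverse();
}

final class NamedProp implements PropExpr {
    final String name;

    NamedProp(String name) {
        this.name = name;
    }

    @Override
    public PropExpr inverse() {
        return new InverseProp(this);
    }

    @Override
    public String toString() {
        return name;
    }
}

final class InverseProp implements PropExpr {
    final NamedProp of;

    InverseProp(NamedProp of) {
        this.of = of;
    }

    // Inverting an inverse simplifies back to the named property.
    @Override
    public PropExpr inverse() {
        return of;
    }

    @Override
    public String toString() {
        return "ObjectInverseOf(" + of + ")";
    }
}

final class PropertyExpressionSketch {
    // Assumed toy ontology: partOf and hasPart are declared and stated to be inverses.
    // The equivalence node for partOf then holds a named property AND an inverse expression.
    static List<PropExpr> equivalentsOfPartOf() {
        NamedProp partOf = new NamedProp("partOf");
        NamedProp hasPart = new NamedProp("hasPart");
        return List.of(partOf, hasPart.inverse());
    }

    public static void main(String[] args) {
        System.out.println(equivalentsOfPartOf());
    }
}
```

If the hierarchy were typed with named properties only, the second member of that node would have nowhere to go.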

matthewhorridge commented 9 years ago

> Which is not directly answerable, e.g. you have to materialize all potential class expressions first.

@cmungall If I understand what you're after, this requires some extra code to perform some entailment checks, but you can optimise this and it shouldn't require you to materialise things.

Take a look at this:

https://github.com/protegeproject/existentialquery/tree/master/src/main/java/uk/ac/manchester/cs/owl/existentialquery
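As a rough illustration of how entailment checks can replace materialisation (a sketch under stated assumptions, not a description of what the linked code does): walk the named class hierarchy top-down and ask, for each candidate filler, whether `subject SubClassOf property some filler` is entailed. If the check fails for a class it must also fail for all of its subclasses, so that whole subtree can be skipped. The `EntailmentCheck` interface, the toy hierarchy and the class names are all hypothetical stand-ins for a real reasoner and ontology.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ExistentialQuerySketch {

    // Hypothetical stand-in for a reasoner call answering:
    // "is `subject SubClassOf property some filler` entailed?"
    interface EntailmentCheck {
        boolean entails(String subject, String property, String filler);
    }

    // Collects every named class reachable from `root` for which the check succeeds.
    static List<String> fillers(String subject, String property, String root,
                                Map<String, List<String>> directSubclasses,
                                EntailmentCheck check) {
        List<String> found = new ArrayList<>();
        collect(subject, property, root, directSubclasses, check, new HashSet<>(), found);
        return found;
    }

    private static void collect(String subject, String property, String candidate,
                                Map<String, List<String>> directSubclasses,
                                EntailmentCheck check, Set<String> visited,
                                List<String> found) {
        if (!visited.add(candidate)) {
            return; // already handled via another parent
        }
        if (!check.entails(subject, property, candidate)) {
            return; // prune: no subclass of this candidate can succeed either
        }
        found.add(candidate);
        for (String child : directSubclasses.getOrDefault(candidate, List.of())) {
            collect(subject, property, child, directSubclasses, check, visited, found);
        }
    }

    // Made-up toy hierarchy: class name -> direct named subclasses.
    static Map<String, List<String>> toyHierarchy() {
        return Map.of(
                "Thing", List.of("anatomical_entity", "process"),
                "anatomical_entity", List.of("brain", "heart"),
                "process", List.of("learning"));
    }

    public static void main(String[] args) {
        // Toy oracle: pretend these are the fillers a reasoner would confirm.
        Set<String> entailedFillers = Set.of("Thing", "anatomical_entity", "brain");
        List<String> result = fillers("hippocampus", "part_of", "Thing", toyHierarchy(),
                (subject, property, filler) -> entailedFillers.contains(filler));
        System.out.println(result);
    }
}
```

In this toy run the class `learning` is never checked, because its parent `process` already failed. Nothing is added to the ontology along the way.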

ignazio1977 commented 9 years ago

> > I believe the use of OWLObjectPropertyExpression is historical, and the actual values returned are always OWLObjectProperty instances.
>
> This is definitely by design and definitely intended. They used to be just OWLObjectProperty objects but were actually changed to OWLObjectPropertyExpression objects after several conversations with some of the main reasoner developers. I will dig out these conversations to remember the fine details, but there are good reasons for this. Also, the reasoners do return inverses (well they ought to).

Cool, cheers for clarifying.

ignazio1977 commented 9 years ago

> Going the other way, a reasoner might want to be able to indicate how expensive it expects an operation to be (or how big an impact an axiom might have, e.g. disjunction dysfunction, or an axiom that blocks the use of a particular optimization).
>
> Similarly, some reasoners allow different forms of *NA, and different kinds of complete knowledge assertions or assumptions. It would be nice to have standard ways of handling this (some reasoners use annotations to pass in this kind of control information, which seems a bit dodgy).

Good ideas here. Sounds like you're describing some sort of metadata for which different reasoners would have different values for different ontologies (or a N/A sort of value, if the reasoner can't tell what the cost might be or does not know about a certain dimension).
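To make the cost metadata idea concrete, here is a minimal sketch combining it with the bidding scheme attributed to Cyc earlier in the thread. Everything in it is hypothetical: `CostAwareReasoner`, `estimateCost` and the cost units are invented for illustration, and nothing like this is part of OWLReasoner. An empty `OptionalLong` plays the role of the "N/A" value.

```java
import java.util.List;
import java.util.Optional;
import java.util.OptionalLong;

final class CostBiddingSketch {

    // Hypothetical extension point; NOT part of the OWLReasoner interface.
    interface CostAwareReasoner {
        String name();

        // Estimated cost in arbitrary units, or empty if the reasoner cannot tell.
        OptionalLong estimateCost(String query);
    }

    // Picks the reasoner with the lowest known estimate. Reasoners that cannot
    // estimate are only used as a fallback when nobody offers a bid.
    static Optional<CostAwareReasoner> cheapest(List<CostAwareReasoner> reasoners, String query) {
        CostAwareReasoner best = null;
        long bestCost = Long.MAX_VALUE;
        for (CostAwareReasoner reasoner : reasoners) {
            OptionalLong bid = reasoner.estimateCost(query);
            if (bid.isPresent() && (best == null || bid.getAsLong() < bestCost)) {
                best = reasoner;
                bestCost = bid.getAsLong();
            }
        }
        if (best != null) {
            return Optional.of(best);
        }
        // Nobody could estimate: fall back to the first reasoner, if there is one.
        return reasoners.stream().findFirst();
    }

    // Test double that always answers with the same estimate.
    static CostAwareReasoner fixed(String reasonerName, OptionalLong fixedCost) {
        return new CostAwareReasoner() {
            @Override
            public String name() {
                return reasonerName;
            }

            @Override
            public OptionalLong estimateCost(String query) {
                return fixedCost;
            }
        };
    }

    public static void main(String[] args) {
        List<CostAwareReasoner> reasoners = List.of(
                fixed("no-idea", OptionalLong.empty()),
                fixed("slow", OptionalLong.of(50)),
                fixed("fast", OptionalLong.of(10)));
        cheapest(reasoners, "some query").ifPresent(r -> System.out.println(r.name()));
    }
}
```

A real design would need the estimate to depend on the ontology as well as the query, which this sketch leaves out.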

Regarding @cmungall's 'what are the part-of ancestors of hippocampus', there is an extension of OWLReasoner (OWLKnowledgeExplorerReasoner) which might prove useful, although I've not tried it on these use cases. It's currently implemented by FaCT++ and JFact only. I'm sure we can work in some improvements.

There's another issue open about an extension to OWLReasoner to be able to start streaming results out of the reasoner, instead of the current approach, which is binary: "get me all the answers within this timeout, or don't bother". In this case, rather than an explicit cost model offered by the reasoner, the client application might simply choose to start streaming results out for a certain time, then gently tap the reasoner on the shoulder and say "Thank you that's enough". The current interrupt mechanism is not gentle, as it does not leave the reasoner in a viable state.
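A minimal sketch of the "tap on the shoulder" idea, assuming a pull-based design: the client asks for one answer at a time and simply stops asking when it has had enough, so the producer is never interrupted mid-computation and can be resumed later. `LazyAnswers` and `drainWhile` are hypothetical names for illustration, not a proposal for the actual OWLReasoner signature.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.BooleanSupplier;

final class StreamingResultsSketch {

    // Hypothetical stand-in for a reasoner that produces answers one at a time.
    static final class LazyAnswers implements Iterator<String> {
        private final List<String> all;
        private int position = 0; // survives between pulls, so work can resume

        LazyAnswers(List<String> all) {
            this.all = all;
        }

        @Override
        public boolean hasNext() {
            return position < all.size();
        }

        @Override
        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return all.get(position++);
        }
    }

    // Pulls answers while the caller is still interested. Stopping is just
    // "not asking for more", so the producer stays in a usable state.
    static List<String> drainWhile(Iterator<String> answers, BooleanSupplier stillInterested) {
        List<String> got = new ArrayList<>();
        while (stillInterested.getAsBoolean() && answers.hasNext()) {
            got.add(answers.next());
        }
        return got;
    }

    public static void main(String[] args) {
        LazyAnswers answers = new LazyAnswers(List.of("A", "B", "C", "D"));
        long deadline = System.nanoTime() + 1_000_000L; // stream for about one millisecond
        List<String> partial = drainWhile(answers, () -> System.nanoTime() < deadline);
        List<String> rest = drainWhile(answers, () -> true); // resume later if desired
        System.out.println(partial.size() + rest.size());
    }
}
```

The stop condition is supplied by the client, so it could equally be a time budget, a result count, or a user pressing cancel.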

cmungall commented 9 years ago

@matthewhorridge cool, this looks like it may do the business. We're just going to do a few quick performance checks (we typically wrap ELK, which is not always performant for anonymous expression queries?). It's definitely an advantage not to pollute the ontology or import chain with materialized expressions.

phillord commented 9 years ago

Is the question about the OWL API in general or the reasoner? I don't believe that I had any major problems with the reasoner API (except that .dispose() method -- I found that HermiT keeps state in strange ways if you don't call this).

The OWL API itself is complex, partly because of its design and partly because OWL is just very big. In terms of the OWL API, the change system is added complexity. And the docs are very problematic because they often just say things like "getExistentialRestriction" -- "returns an existential restriction". Tawny-OWL has the same problem of course. One of my things to do is to come up with bite-sized explanations of the semantics of all the parts of OWL, stick them onto ontogenesis, then we can hyperlink to them.

With Tawny-OWL I have tried to shield users (including myself!) from as much of this as possible. How much I have succeeded I do not know.