Validating mappings/usage of ontologies

dannylamb commented 7 years ago

Transferred from https://github.com/Islandora-CLAW/CLAW/issues/488. Please see that issue for the original conversation.

Title (Goal)	RDF UI: Validating mappings/usage of ontologies
Primary Actor	Repository admin
Scope	access
Level	Medium?
Story	As a repository admin, when I am mapping fields in RDF, I want the system to impose restrictions on field assignment if it violates the specification/model of the ontology. This could be reflected in the UI by greying out inappropriate classes/predicates or removing them as possible select options. Restrictions include, but are not limited to, domains and ranges of properties, subclasses, class disjointness, and equivalence. Ideally, validation should happen across bundles.

ajs6f commented 7 years ago

Just to be clear, I hope we are not talking about OWL/RDFS here, because those languages do not impose restrictions of any kind on RDF. So hopefully, we are talking about something like SHACL or ShEx?

DiegoPino commented 7 years ago

Why not? Since when are OWL constructs, e.g. property restrictions, cardinality, etc., not valid? SHACL and ShEx are opinionated and of course stronger, but if Ontology A says I should not connect class X with class Y via property lambda, should I ignore that? Or that rdf:type Z can't or should not have property omega? So you are saying that things like https://www.w3.org/TR/owl2-primer/#Property_Cardinality_Restrictions are just hints we can/should ignore?

ajs6f commented 7 years ago

Neither OWL nor RDFS has any notion of "valid" or "invalid", and it is not appropriate to introduce them in that context. OWL (and RDFS) can create new triples. That's it. Doing anything else with them is taking their syntax and making a new language with your own semantics. That is a really dangerous idea. We've had this conversation before, and this probably isn't the right place to repeat it.

The language of restriction in OWL is for restricting possible inferences, not restricting assertions in graphs generally.
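A minimal sketch of the distinction being drawn here, using plain Python tuples rather than any RDF library. The `ex:` names, the tiny ontology, and both helper functions are hypothetical and only for illustration. The first function applies the standard RDFS reading of `rdfs:domain`, which only adds `rdf:type` triples; the second applies a closed-world constraint reading, which is not part of standard RDFS/OWL semantics.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"

# Hypothetical ontology and data, written as (subject, predicate, object) tuples.
ontology = {("ex:hasWing", RDFS_DOMAIN, "ex:Bird")}
data = {
    ("ex:rex", RDF_TYPE, "ex:Mammal"),
    ("ex:rex", "ex:hasWing", "ex:wing1"),
}


def infer_domain_types(ontology, data):
    """Standard RDFS reading: a domain axiom only ADDS rdf:type triples."""
    domains = {(s, o) for (s, p, o) in ontology if p == RDFS_DOMAIN}
    inferred = {
        (s, RDF_TYPE, cls)
        for (s, p, _o) in data
        for (prop, cls) in domains
        if p == prop
    }
    return data | inferred


def domain_violations(ontology, data):
    """Closed-world reading (NOT standard RDFS/OWL): report subjects that
    use a property without an asserted type matching its domain."""
    domains = {(s, o) for (s, p, o) in ontology if p == RDFS_DOMAIN}
    return sorted(
        (s, p, cls)
        for (s, p, _o) in data
        for (prop, cls) in domains
        if p == prop and (s, RDF_TYPE, cls) not in data
    )


entailed = infer_domain_types(ontology, data)
violations = domain_violations(ontology, data)
```

With the inference reading, `ex:rex` simply becomes a bird as well as a mammal and nothing is rejected. Only the second function, which assumes the asserted data is complete, reports a problem.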

DiegoPino commented 7 years ago

@ajs6f sorry, I can't agree with you on this. 😢

The language of restriction in OWL is exactly what you say PLUS a way to help/assist and impose a way of building RDF graphs that are compliant and aligned with THAT ontology.

If you choose to ignore it, then yeah, anything with properties will be valid RDF, mix and match. But it will not be aligned to that particular ontology and can even conflict with it or lead to wrong assumptions:

Imagine this: let's use the SI bird ontology (which is written in OWL 2 DL... semantic futurism, I know such an ontology does not exist). Don't enforce anything. So let's give a bird 5 wings and let it be a mammal also, but still say it's an rdf:type bird under an OWL ontology that says, via cardinality constraints, that birds should have two wings. You still get valid RDF, but it is not usable. Is that what you want your metadata pros to do, know ontologies from memory? Be made aware of their non-compliance only after submitting, by letting reasoning (a failing reasoning) on the built RDF say "hey, your statement of what a bird is is wrong"? Of course, if rdf:typing is not explicit, then yeah, reasoning will probably lead to a BIG NO and a BIG NOTHING in that ontology, which is good. But if you are trying to build correct graphs that lead to a correct notion of a bird (for that ontology), assisting your metadata professionals by validating their options and choices, to help them describe their things in a way machines can parse/traverse and use, seems to be a positive thing. But that's probably just my opinion.

Would love to read something not coming from our own local knowledge (yours or mine) that justifies that. Maybe Cambridge people?

Anyway, I'm pretty sure I have had this talk with you many times, and also with @dannylamb: reasoning and validation will be possible via a hook/service, but will not be part of RDF UX because of speed concerns, among others.

ajs6f commented 7 years ago

@DiegoPino, with all respect, you are simply wrong about what OWL is. It does not have the semantics you would like it to have. No one is arguing that validation or constraints are not useful or even necessary. But OWL is not the tool for the job.

I really don't think we should continue this conversation here, but since you asked for an argument from authority, you can read a good (if concise) explanation of the validation problem from Peter Patel-Schneider, who knows as much about this stuff as anyone. If that doesn't satisfy you, you can read a more extensive paper by Ian Horrocks et al. that explains the difference between OWL semantics and constraint semantics. If you want to argue with Ian Horrocks about the meaning of his own work, good luck.

DiegoPino commented 7 years ago

Hi, thanks. I'm aware I'm always 50% wrong and 50% without the energy to defend my ideas, so all good. I already read those papers some time ago, and now, after reviewing them again to be sure I don't say more things that are wrong (which would lead me to over 100% of wrongness!), they do not seem to conflict with the idea that an ontology's OWL restrictions can be checked and "enforced/if not/checked/suggested/??" or whatever. SHACL is for sure more expressive but also not a standard, and I'm not sure our lovely Islandora users want to write SHACL to give their birds two wings. It's still a candidate in my mind, of course, and very useful. Neither of the papers says OWL restrictions are not valid; they say they are not enough. Which is good, at least from my perspective.

Trivia: did you know that some triple stores, if inferencing is enabled, can even potentially reject RDF triples that don't comply with the ontologies loaded? That type of stuff is really my only concern: helping metadata people with their types and properties, and letting them build beautiful graphs that make sense, not starting a semantic war. Thanks!

ajs6f commented 7 years ago

The question is not whether OWL restrictions are "valid" for some sense of that word. The question is what they mean.

Stardog is the only widely-deployed triplestore that offers constraint based directly on OWL syntax of which I am aware, and it is based directly on Clark and Parsia's earlier work on OWL ICV. In that case, the explicit choice was made to re-equip OWL syntax with well-defined closed-world semantics. Kendall Clark has done a good job explaining that choice:

Many people already think RDFS and OWL can be used for validation
But semantics not suitable for validation
So we defined constraint semantics for OWL axioms

The crucial point is that the semantics were in fact redefined, and carefully. The standard OWL semantics do not produce constraints.
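To make the closed-world reading concrete with the bird example from earlier in this thread, here is a small sketch in plain Python. The `ex:` names and the `max_cardinality_violations` helper are hypothetical, and this is not how Stardog or ICV is implemented. It simply counts asserted values and flags any instance that exceeds a limit. Under standard OWL semantics, which make no unique name assumption, a max-cardinality restriction would instead let a reasoner conclude that some of the wing identifiers denote the same individual, unless they are explicitly declared different.

```python
from collections import defaultdict

# Hypothetical data: one bird asserted with five wings.
data = {("ex:tweety", "rdf:type", "ex:Bird")} | {
    ("ex:tweety", "ex:hasWing", f"ex:wing{i}") for i in range(1, 6)
}


def max_cardinality_violations(data, cls, prop, limit):
    """Closed-world check: count the asserted values of `prop` for each
    instance of `cls` and report the instances that exceed `limit`."""
    instances = {s for (s, p, o) in data if p == "rdf:type" and o == cls}
    values = defaultdict(set)
    for s, p, o in data:
        if p == prop and s in instances:
            values[s].add(o)
    return {s: len(v) for s, v in values.items() if len(v) > limit}


violations = max_cardinality_violations(data, "ex:Bird", "ex:hasWing", 2)
```

A check of this shape treats the graph as complete and treats distinct identifiers as distinct things, which is exactly the assumption that has to be added on top of OWL to get constraint behaviour.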

ajs6f commented 7 years ago

Good paper on early ICV/Stardog work.

ajs6f commented 7 years ago

@DiegoPino This is already a long conversation, but I don't want to leave it without making sure you understand that I completely support the functionality required by this issue and the use cases behind it. I am a huge believer in assuring data quality up front and not after the fact and that means being able to constrain and provide feedback to users as promptly as possible. (Although, as @dannylamb knows, I like to try to push people away from the binary valid/invalid opposition towards more flexible ideas of data quality and process-ability).

My only concern here is purely technical-- how do we best do what is asked for in this issue? I'm fully in support of the ask. (And I believe that it is totally possible to give this kind of feedback in "realtime" in the UI, and that so doing is a valuable project.)

Also, I don't know what you mean by saying you are "always 50% wrong". My statistics show that you are rarely more than 10% wrong. I would be happy to do that well ;)

Islandora-Labs / rdfux

Validating mappings/usage of ontologies #4