NLP2RDF / ontologies

All ontologies used in NIF 2.0 (NIF-Core + vocabulary modules + helper modules)
http://nlp2rdf.org

Are lots of sub-classes and sub-properties needed? #17

Open VladimirAlexiev opened 8 years ago

VladimirAlexiev commented 8 years ago

Consider the consequences of making sub-classes and sub-properties. Economy of representation (number of triples) is an important consideration for keeping NLP as RDF a feasible idea (because NLP generates a lot of data), and NIF 2 thought carefully about that (counting triples for the Simple, Stanbol and OpenAnnotation profiles). Injudicious use of sub-classes and sub-properties might induce NIF users to abandon RDFS... or NIF itself.

VladimirAlexiev commented 8 years ago

Eg currently there is:

This means that

The question is whether people will want to query by these extra types/properties or not. If not, they are only a burden.
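To make the cost concrete, here is a minimal sketch in plain Python (no RDF library) of the two RDFS rules involved: rdfs7 (sub-property) and rdfs9 (sub-class). The `ex:` class and property names are hypothetical stand-ins, not the actual NIF terms, and each term is assumed to have a single parent to keep the sketch short.

```python
# Hypothetical schema: the ex: names are illustrative, not real NIF terms.
SUBCLASS = {"ex:Word": "ex:Structure", "ex:Structure": "ex:String"}
SUBPROP = {"ex:posTag": "ex:annotation"}


def ancestors(term, parents):
    """Yield every transitive super-term of `term` (single-parent chains only)."""
    while term in parents:
        term = parents[term]
        yield term


def rdfs_closure(triples):
    """Apply rdfs9 (subClassOf) and rdfs7 (subPropertyOf) to instance data."""
    inferred = set(triples)
    for s, p, o in triples:
        if p == "rdf:type":
            # rdfs9: an instance of a class is an instance of each super-class.
            inferred.update((s, p, sup) for sup in ancestors(o, SUBCLASS))
        else:
            # rdfs7: a statement with a property also holds for each super-property.
            inferred.update((s, sup, o) for sup in ancestors(p, SUBPROP))
    return inferred


data = {
    ("ex:w1", "rdf:type", "ex:Word"),
    ("ex:w1", "ex:posTag", "ex:Noun"),
}
closure = rdfs_closure(data)
print(len(data), "asserted ->", len(closure), "after RDFS inference")
# prints: 2 asserted -> 5 after RDFS inference
```

In this toy case every annotated word costs three extra triples once inference is switched on, which only pays off if someone actually queries by the super-types or super-properties.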

BTW, about complex class constructs such as Restrictions and unions, eg:

nif:oliaProv rdfs:range [ a owl:Class; owl:unionOf (prov:Activity prov:Agent ) ] ;

IMHO these are not very useful: formally speaking, RDFS should infer this union as one of the types of every oliaProv object. I guess you use them to be able to use RDFUnit. But maybe it's better to use Shapes, or in the above case schema:domainIncludes / schema:rangeIncludes.
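A minimal sketch of what the range axiom above entails under RDFS rule rdfs3, written in plain Python with triples as tuples. The blank node label `_:union` stands in for the anonymous `owl:unionOf` class, and the `ex:` instance data is made up for illustration.

```python
# rdfs3: (p rdfs:range C) and (s p o)  =>  (o rdf:type C)
# "_:union" stands for the anonymous owl:unionOf class from the range axiom.
schema = {("nif:oliaProv", "rdfs:range", "_:union")}

# Hypothetical instance data; the ex: names are assumptions for this sketch.
data = {("ex:annotation1", "nif:oliaProv", "ex:tagger")}


def apply_rdfs3(schema, data):
    """Return the rdf:type triples that rdfs3 infers for objects of ranged properties."""
    ranges = {p: c for p, pred, c in schema if pred == "rdfs:range"}
    return {(o, "rdf:type", ranges[p]) for s, p, o in data if p in ranges}


inferred = apply_rdfs3(schema, data)
print(inferred)
# prints: {('ex:tagger', 'rdf:type', '_:union')}
```

The only thing an RDFS reasoner learns is that `ex:tagger` is an instance of an unnamed class; it does not learn whether it is a `prov:Activity` or a `prov:Agent`. `schema:rangeIncludes` carries no RDFS entailment at all, so with it no such triple would be generated.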

kurzum commented 8 years ago

Hm, I see your point. However, we would need a formal way of marking or describing extensions of NIF somehow, especially when it comes to external vocabs like itsrdf.

So let's do it like this:

This will:

kurzum commented 8 years ago

we can move the domain/range to the extra files

VladimirAlexiev commented 8 years ago

I like the approach: "if you want inference, load this extra file".

You had something like this in the old NIF: nif-core-inf.ttl and nif-core-val.ttl were separate (though nif-core-inf.ttl was used for more complex inferences, like Transitive and Restrictions; domain/range were in nif-core).

But more thinking is needed:

BTW, we now see that "modules" can be created for different purposes: by feature (eg Annotation vs Translation), by function (eg definitions/comments vs domain/range).

So "module" and "namespace" are orthogonal concepts.
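As a rough sketch of a split "by function", the following plain-Python snippet partitions one schema into a definitions part and an inference part. Which predicates belong in the inference file is an assumption here (the thread only mentions domain/range explicitly), and the `ex:` terms are hypothetical.

```python
# Assumption: these predicates are the ones that drive RDFS inference.
INFERENCE_PREDICATES = {
    "rdfs:domain",
    "rdfs:range",
    "rdfs:subClassOf",
    "rdfs:subPropertyOf",
}


def split_by_function(schema):
    """Split schema triples into (definitions, inference) by predicate."""
    core = {t for t in schema if t[1] not in INFERENCE_PREDICATES}
    return core, schema - core


# Hypothetical schema fragment; the ex: names are not real NIF terms.
schema = {
    ("ex:posTag", "rdf:type", "rdf:Property"),
    ("ex:posTag", "rdfs:comment", "Part-of-speech tag"),
    ("ex:posTag", "rdfs:domain", "ex:Word"),
    ("ex:posTag", "rdfs:subPropertyOf", "ex:annotation"),
}
core, inference = split_by_function(schema)
print(len(core), "definition triples,", len(inference), "inference triples")
# prints: 2 definition triples, 2 inference triples
```

Both output sets describe the same `ex:posTag` term in the same namespace, which is the sense in which modules and namespaces are independent of each other.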

neradis commented 8 years ago

Are lots of sub-classes and sub-properties needed?

The answer will depend on the use case and the user of NIF. Some of these abstract super-properties and super-classes were introduced to express conceptual commonalities (conceptual interoperability), to allow for OWL reasoning/constraints, or just to formalise expectations about the format of NIF documents. Abstract properties and types are helpful for exploratory queries on NIF data where one does not know beforehand which concrete annotation statements occur.

On the other hand, there is indeed also a need for triple economy when working with larger volumes of NIF data. Some potential users (esp. those with no prior Semantic Web background) who are not interested in the benefits of OWL would certainly also welcome a pure RDF(S) version of NIF without the conceptual overhead of OWL.

I think it's feasible (probably not even much effort) to write some code that could down-grade NIF OWL schema documents to two RDF(S) documents:

All three versions could (and probably should) use the same namespace, and the down-graded versions would just be a subset of the RDFS inference closure over the OWL version. Offering consistent versions per module directly side by side might be easier for users than import declarations (which are only part of OWL anyway).

This would not only make it possible to avoid unwanted reasoning bloat (although one usually has control over the entailment regime applied by stores/tools anyway), but also offer NIF newcomers a stepping stone, allowing them to adopt NIF with knowledge of RDF(S) only.
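One possible shape for such a down-grade step, as a minimal sketch in plain Python: drop every statement that mentions an OWL term or a blank node. This is only one conceivable filter, not the specific profiles proposed in this thread, and the `ex:` schema is hypothetical.

```python
def downgrade(schema):
    """Keep only triples that mention no owl: term and no blank node."""
    return {
        t
        for t in schema
        if not any(x.startswith("owl:") or x.startswith("_:") for x in t)
    }


# Hypothetical OWL schema fragment; the ex: names are not real NIF terms.
owl_schema = {
    ("ex:Word", "rdf:type", "owl:Class"),
    ("ex:Word", "rdfs:subClassOf", "ex:Structure"),
    ("ex:next", "rdf:type", "owl:TransitiveProperty"),
    ("ex:next", "rdfs:range", "_:union"),
    ("_:union", "owl:unionOf", "_:list"),
    ("ex:next", "rdfs:comment", "Following string"),
}
rdfs_profile = downgrade(owl_schema)
print(sorted(rdfs_profile))
```

Because the filter only removes triples, the result is a subset of the OWL version and therefore also of its RDFS closure. A real tool would probably do more than filter, for example rewriting `owl:Class` declarations to `rdfs:Class` instead of dropping them.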

VladimirAlexiev commented 8 years ago

I like this idea of "profiles".

I've done something similar for https://github.com/erlangen-crm/ecrm using this script https://github.com/erlangen-crm/ecrm/blob/master/ecrm-simplify.xq