SEMICeu / Core-Business-Vocabulary

This is the issue tracker for the maintenance of Core Business Vocabulary

Some properties defined insufficiently in RDFS version of the vocabulary #33

Closed jakubklimek closed 1 year ago

jakubklimek commented 1 year ago

In the RDFS version of the vocabulary some properties are defined insufficiently. Namely:

dc:alternative a rdf:Property;
  rdfs:label "alternative name"@en .

<http://www.w3.org/ns/legal#companyActivity> a rdf:Property;
  rdfs:label "legal entity activity"@en .

<http://www.w3.org/ns/legal#companyStatus> a rdf:Property;
  rdfs:label "legal entity status"@en .

<http://www.w3.org/ns/legal#companyType> a rdf:Property;
  rdfs:label "legal form type"@en .

<http://www.w3.org/ns/legal#legalName> a rdf:Property;
  rdfs:label "legal name"@en .

They are all missing domains and ranges, meaning they can be used with any resource. This is, however, inconsistent with the UML diagram of the vocabulary, which shows them as properties of LegalEntity (image of the UML diagram).
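For illustration, a rough way to list such properties is to scan the vocabulary's triples for anything typed rdf:Property that has no rdfs:domain or no rdfs:range. The sketch below uses plain Python tuples rather than an RDF library; the triple list is a hand-copied subset of the definitions above (labels omitted), and `missing_domain_or_range` is a hypothetical helper name.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDF_PROPERTY = "http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
RDFS_RANGE = "http://www.w3.org/2000/01/rdf-schema#range"
LEGAL = "http://www.w3.org/ns/legal#"

# Subset of the RDFS definitions quoted above, as (subject, predicate, object).
vocabulary = [
    (LEGAL + "companyActivity", RDF_TYPE, RDF_PROPERTY),
    (LEGAL + "companyStatus", RDF_TYPE, RDF_PROPERTY),
    (LEGAL + "companyType", RDF_TYPE, RDF_PROPERTY),
    (LEGAL + "legalName", RDF_TYPE, RDF_PROPERTY),
]


def missing_domain_or_range(triples):
    """Return the properties that lack an rdfs:domain or an rdfs:range."""
    properties = {s for s, p, o in triples if p == RDF_TYPE and o == RDF_PROPERTY}
    with_domain = {s for s, p, _ in triples if p == RDFS_DOMAIN}
    with_range = {s for s, p, _ in triples if p == RDFS_RANGE}
    return sorted(
        prop for prop in properties
        if prop not in with_domain or prop not in with_range
    )


for prop in missing_domain_or_range(vocabulary):
    print(prop)
```

With the subset above, all four properties are reported, since none of them has a domain or a range.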

EmidioStani commented 1 year ago

Hello @jakubklimek ,

Domain and range are constraints that should not be part of the RDF; instead, they should be in the SHACL shapes. There I would investigate the "identifies" property, which has a range.

martinnec commented 1 year ago

Hello @EmidioStani, the domain and range should be part of the RDFS. SHACL is just a syntax, not the semantics. RDFS should express the semantics. Domain and range are semantic constraints.

jakubklimek commented 1 year ago

Hello @EmidioStani , could I ask, just to clarify, why we should validate domains and ranges via SHACL, but not specify them in RDFS?

EmidioStani commented 1 year ago

By not specifying domain and range, it is possible to reuse such properties in other data models, which is the point of a Core Vocabulary. If you look at Linked Open Vocabularies (LOV), you cannot reuse many properties because of their domain or range.

As you might know, domain and range were removed from Dublin Core some time ago because they are too rigid, and they have been replaced in all the properties by domainIncludes and rangeIncludes to facilitate reuse.

By separating the model from its constraints, you maximise reuse over conformance.

jakubklimek commented 1 year ago

@EmidioStani, rdfs:domain and rdfs:range are not constraints. They do not prevent you from using the predicates with instances of other classes. They just say that the subjects and objects become instances of domain and range classes respectively. Since the Core Vocabs classes are very generic, I do not see the harm in that. E.g. what is wrong with inferring that resources using companyType are instances of LegalEntity?

On the contrary, without a domain and a range specification, it is impossible to reason about the instances using these predicates and to use the RDFS vocabulary to help model actual data.

Moreover, if you are concerned about constraints preventing reuse, why define them in SHACL, where they are actually used for validation and therefore act as real constraints?
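To make the difference concrete, here is a minimal sketch in plain Python (no RDF library) that contrasts RDFS domain inference (entailment rule rdfs2) with a closed-world validation check loosely in the spirit of a SHACL shape. Several things are assumptions for illustration only: the rdfs:domain statement (it is what this issue proposes, not what the published RDFS contains), the `legal:LegalEntity` IRI, the `example.org` instance data, and the helper names `infer_domain_types` and `validate_subject_class`. The validation function is not real SHACL.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_DOMAIN = "http://www.w3.org/2000/01/rdf-schema#domain"
LEGAL = "http://www.w3.org/ns/legal#"

# Assumption: the domain statement proposed in this issue.
schema = [(LEGAL + "companyType", RDFS_DOMAIN, LEGAL + "LegalEntity")]

# Hypothetical instance data: the subject uses companyType but has no explicit type.
data = [
    ("http://example.org/acme", LEGAL + "companyType", "http://example.org/type/ltd"),
]


def infer_domain_types(schema, data):
    """Rule rdfs2: (p rdfs:domain C) and (s p o) entail (s rdf:type C)."""
    domains = {}
    for prop, pred, cls in schema:
        if pred == RDFS_DOMAIN:
            domains.setdefault(prop, set()).add(cls)
    return {(s, RDF_TYPE, cls) for s, p, _ in data for cls in domains.get(p, ())}


def validate_subject_class(data, prop, cls):
    """Closed-world check: report subjects of `prop` not explicitly typed as `cls`."""
    typed = {s for s, p, o in data if p == RDF_TYPE and o == cls}
    return sorted({s for s, p, _ in data if p == prop and s not in typed})


# Inference adds a type triple; nothing is rejected.
inferred = infer_domain_types(schema, data)

# Validation flags the same data as non-conforming.
violations = validate_subject_class(data, LEGAL + "companyType", LEGAL + "LegalEntity")

print(inferred)
print(violations)
```

On this data, the inference step only adds the statement that `http://example.org/acme` is a LegalEntity, while the validation step reports the same resource as a violation because it carries no explicit type.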

bertvannuffelen commented 1 year ago

This is a good discussion, but I would like to take this (and related issues in other data specifications) to a broader context.

To streamline the SEMIC community's expectations, we are working on a style guide describing the (key) artefacts, their dependencies, and their expected content. We are doing this work because we realize that we, as a community, need to be clearer about what we expect from data specifications, so that others can reuse our specifications in a trusted and coherent way and so that we can also reuse their data specifications. Over the past decades, several variations have emerged, hindering natural reuse.

We will soon send out an invitation to a webinar (planned for mid-January) on this topic.

EmidioStani commented 1 year ago

This issue can be closed. The discussion about domain and range should take place at the style guide level first; the related issue is: https://github.com/SEMICeu/style-guide/issues/60