linked-statistics / xkos

A SKOS extension for statistical classifications

Add property to specify a notation pattern #131

Open nichtich opened 5 years ago

nichtich commented 5 years ago

As discussed in #130, we use a property called notationPattern to express a regular expression that all notations (skos:notation) of a Concept Scheme must conform to. See this list of real-world examples derived from Wikidata.

So far, the notation pattern is not expressed in RDF. XKOS seems like a good place to add this property.

tfrancart commented 5 years ago

To me this rather looks like the definition of a custom RDF datatype, and typically in such cases the value of the skos:notation uses the custom datatype, e.g.

  ex:concept1 skos:notation "123"^^ex:myNotationDatatype .

nichtich commented 5 years ago

I've not seen any sensible use of RDF datatypes for controlled vocabulary notations in real life. Apart from that, the datatype approach would require at least two additional RDF properties and an additional datatype URI (which in most cases does not exist, because vocabulary publishers use plain strings for notations):

ex:concept1 skos:notation "123"^^ex:myNotationDatatype .
ex:myNotationDatatype xxx:pattern "^[0-9]+$" .
ex:concept1 skos:inScheme ex:scheme1 .
ex:scheme1 xxx:notationsUseDatatype ex:myNotationDatatype .

The use case is checking whether a given string conforms to the notation pattern of a given concept scheme.
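A minimal Python sketch of this use case, assuming the notation pattern of each scheme has already been looked up (for example from the scheme's notationPattern value). The mapping, the scheme URI, and the function name are hypothetical and only serve as illustration; the pattern is the one from the example above.

```python
import re

# Hypothetical lookup table: concept scheme URI -> notation pattern.
# In practice the pattern would be read from the scheme's description.
NOTATION_PATTERNS = {
    "http://example.org/scheme1": "^[0-9]+$",
}


def conforms_to_scheme(scheme_uri, notation):
    """Return True/False if the scheme has a pattern, None if it has none."""
    pattern = NOTATION_PATTERNS.get(scheme_uri)
    if pattern is None:
        return None  # nothing to check against
    # fullmatch requires the whole string to match the pattern
    return re.fullmatch(pattern, notation) is not None


print(conforms_to_scheme("http://example.org/scheme1", "123"))
print(conforms_to_scheme("http://example.org/scheme1", "12a"))
```

Returning None for schemes without a known pattern keeps "no pattern available" distinct from "notation does not conform".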

tfrancart commented 5 years ago

> I've not seen any sensible use of RDF datatypes for controlled vocabulary notations in real life.

Although I cannot provide pointers right now, I think this is quite common for vocabularies defined by the European Commission and the Office of Publications. I will try to find some pointers.

The datatype approach is the one advocated by SKOS. See the SKOS Primer and section 6.5.1 of the SKOS Reference:

> By convention, the property skos:notation is only used with a typed literal in the object position of the triple, where the datatype URI denotes a user-defined datatype corresponding to a particular system of notations or classification codes.
>
> For many situations it may be sufficient to simply coin a datatype URI for a particular notation system, and define the datatype informally via a document that describes how the notations are constructed and/or which lexical forms are allowed. Note, however, that it is also possible to define at least the lexical space of a datatype more formally via the XML Schema language, see [SWBP-DATATYPES] section 2.

However, no property exists to associate a ConceptScheme with the expected datatype of its Concept notations.

delcada commented 5 years ago

We also use xkos:notationPattern in a similar way to @nichtich, for instance to describe the structure of the various levels of a statistical classification (e.g. \b(V|X|XV|X{1,2}I?|X?IX|X?IV|X?V?I{1,3})\b for section names in the Combined Nomenclature).
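As a quick sanity check of the pattern quoted above, the following Python sketch tests it against Roman numerals. It assumes that section names run from I to XXI; the list of numerals and the helper name are illustrative only.

```python
import re

# Pattern quoted in the comment above (section names as Roman numerals)
SECTION_PATTERN = re.compile(r"\b(V|X|XV|X{1,2}I?|X?IX|X?IV|X?V?I{1,3})\b")

# Assumption: sections are numbered I through XXI
SECTIONS = [
    "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X",
    "XI", "XII", "XIII", "XIV", "XV", "XVI", "XVII", "XVIII", "XIX",
    "XX", "XXI",
]


def is_section_name(notation):
    """True if the whole string matches the section name pattern."""
    return SECTION_PATTERN.fullmatch(notation) is not None


# Collect any numerals in the assumed range that the pattern rejects
rejected = [s for s in SECTIONS if not is_section_name(s)]
print(rejected)
```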

tfrancart commented 5 years ago

xkos:notationPattern already exists in the specification (see section 6).

nichtich commented 5 years ago

Well, if it's ok to also apply xkos:notationPattern to a skos:ConceptScheme without levels, this issue can be closed.

nichtich commented 2 months ago

In its current version, xkos:notationPattern has domain xkos:ClassificationLevel. We use the property with domain skos:ConceptScheme as well, so the domain should be extended so that the property can be used for normal concept schemes without classification levels.

tfrancart commented 2 months ago

> I've not seen any sensible use of RDF datatypes for controlled vocabulary notations in real life

Now I can provide a pointer: the European Parliament Open Data Portal makes use of the "custom datatype pattern" for skos:notation on Works: https://europarl.github.io/eli-ep/#work (look for "skos:notation" in the table). These are not controlled vocabularies in the strict sense, though; they are document identifiers.

It should also be noted that SHACL has the sh:pattern property to achieve the same goal as xkos:notationPattern. It could be a good idea to leave the domain of xkos:notationPattern completely open so that it could apply to anything.