w3c / dxwg

Data Catalog Vocabulary (DCAT)
https://w3c.github.io/dxwg/dcat/

Profiles may provide rules on cardinality of terms (including “recommended”) [ID41] (5.41) #276

Open nicholascar opened 6 years ago

nicholascar commented 6 years ago

Entered from Google Doc

kcoyle commented 5 years ago

Some possible text:

Profiles MAY include cardinality constraints for entities and elements. (Using these terms for now; we'll need to be consistent across the document.) The primary constraints that are used have the following meanings:

In data modeling and programming languages, constraints are frequently defined numerically with a two-digit expression. The first digit represents the minimum cardinality and the second the maximum. This expression can be read as covering "mandatory" (minimum cardinality is greater than zero) and "repeatable" (maximum cardinality is greater than one). There is no such formal definition for "recommended" or "mandatory if applicable".
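
As a rough illustration of how a min/max pair maps onto the human-facing terms, here is a minimal Python sketch. The function name `describe_cardinality` and the label strings are assumptions made for illustration, not terms defined by the working group; `None` stands for an unbounded maximum.

```python
from typing import Optional


def describe_cardinality(min_count: int, max_count: Optional[int]) -> str:
    """Translate a (min, max) pair into human-facing terms.

    max_count=None means "unbounded". The labels are illustrative only.
    """
    if min_count < 0:
        raise ValueError("minimum cardinality cannot be negative")
    if max_count is not None and max_count < min_count:
        raise ValueError("maximum cardinality cannot be below the minimum")

    # Mandatory: minimum cardinality is greater than zero.
    obligation = "mandatory" if min_count > 0 else "optional"
    # Repeatable: maximum cardinality is greater than one (or unbounded).
    repeatable = max_count is None or max_count > 1
    repetition = "repeatable" if repeatable else "not repeatable"
    return f"{obligation}, {repetition}"
```

For example, `describe_cardinality(1, 1)` gives "mandatory, not repeatable" and `describe_cardinality(0, None)` gives "optional, repeatable". No numeric pair yields "recommended", which is the gap described above.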

Other formal methods for defining constraints include the symbols used in regular expressions (need link), where the symbol "+" means "one or more" and "?" means "zero or one". Specific numbers can also be given within brackets, such as "{1,3}", meaning "at least one but no more than three".
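
The correspondence between these symbols and a min/max pair can be sketched in a few lines of Python. This is only an illustration: the function name `parse_cardinality` is hypothetical, `None` stands for an unbounded maximum, "*" (zero or more) is included because it is the usual companion of "+" and "?", and treating an absent symbol as "exactly one" is an assumption that follows the ShEx default.

```python
import re
from typing import Optional, Tuple

# Fixed symbols and their (min, max) meaning; None means "unbounded".
_FIXED = {
    "": (1, 1),      # no symbol: exactly one (assumed default)
    "?": (0, 1),     # zero or one
    "*": (0, None),  # zero or more
    "+": (1, None),  # one or more
}

# Matches "{n}", "{n,}" and "{n,m}".
_RANGE = re.compile(r"^\{(\d+)(,(\d*))?\}$")


def parse_cardinality(symbol: str) -> Tuple[int, Optional[int]]:
    """Turn a regex-style cardinality marker into a (min, max) pair."""
    symbol = symbol.strip()
    if symbol in _FIXED:
        return _FIXED[symbol]

    match = _RANGE.match(symbol)
    if match is None:
        raise ValueError(f"unrecognised cardinality: {symbol!r}")

    minimum = int(match.group(1))
    if match.group(2) is None:
        # "{n}" means exactly n.
        return (minimum, minimum)

    upper = match.group(3)
    maximum = int(upper) if upper else None  # "{n,}" is unbounded
    if maximum is not None and maximum < minimum:
        raise ValueError(f"maximum below minimum in {symbol!r}")
    return (minimum, maximum)
```

With this sketch, `parse_cardinality("{1,3}")` returns `(1, 3)` and `parse_cardinality("+")` returns `(1, None)`.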

The methodology to use for cardinality is often determined by the data modeling or programming language that is used to encode the profile vocabulary. In profiles that are expressed as textual documents, the terms used, such as "mandatory", must be carefully defined in the document itself.

(Can we make this SHOULD and give a reason?) (There is the question of allowed elements with no specific constraints. These can be included in SHACL and ShEx as valid terms with no cardinality.)

kcoyle commented 5 years ago

Examples:

SHACL

ex:MinCountExampleShape
    a sh:PropertyShape ;
    sh:targetNode ex:Alice, ex:Bob ;
    sh:path ex:name ;
    sh:minCount 1 .

ShEx

my:UserShape {
   foaf:name xsd:string ;
   foaf:mbox IRI+
}

larsgsvensson commented 5 years ago

In SHACL you can also use sh:severity sh:Warning and sh:severity sh:Info (cf. https://www.w3.org/TR/shacl/#severity) to tell the validator that it's not an error condition if a constraint validation fails. This could be used to implement RECOMMENDED and OPTIONAL constraints.
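
To make the idea concrete, here is a minimal Python sketch of a validator-like check in which each constraint carries a severity and only violations count as errors. It is not SHACL and uses no SHACL library; the `Severity` enum, the `Constraint` class, the `check` function, the sample property names, and the mapping of RECOMMENDED to a warning are all assumptions that simply mirror the suggestion above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Tuple


class Severity(Enum):
    INFO = "Info"
    WARNING = "Warning"
    VIOLATION = "Violation"


@dataclass(frozen=True)
class Constraint:
    prop: str           # property the constraint applies to
    min_count: int      # minimum number of values expected
    severity: Severity  # how a failure should be reported


def check(record: Dict[str, List[str]],
          constraints: List[Constraint]) -> Tuple[bool, List[str]]:
    """Return (ok, messages); only VIOLATION failures make ok False."""
    ok = True
    messages = []
    for c in constraints:
        found = len(record.get(c.prop, []))
        if found < c.min_count:
            messages.append(
                f"{c.severity.value}: {c.prop} has {found} value(s), "
                f"expected at least {c.min_count}"
            )
            if c.severity is Severity.VIOLATION:
                ok = False
    return ok, messages


# Hypothetical profile: a mandatory title and a recommended description.
PROFILE = [
    Constraint("dct:title", 1, Severity.VIOLATION),      # mandatory
    Constraint("dct:description", 1, Severity.WARNING),  # recommended (assumed mapping)
]
```

A record with a title but no description passes with one warning message, while an empty record fails. Whether a warning should count as a failure is a decision made by the consuming application in this sketch; as far as I read the SHACL specification, severity itself does not change the validation outcome and is intended for categorizing results.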

kcoyle commented 5 years ago

I can understand OPTIONAL using SHACL, but I'm not so sure about RECOMMENDED, although presumably one could return the message "recommended" when the constraint fails, if, as I recall, one can craft specific output messages for individual validation steps. On its own, something being "missing" but not an error doesn't express RECOMMENDED to me, but it does express OPTIONAL. Then again, you could have a community agreement that "sh:Warning" means "recommended", but that may not translate outside of the immediate community.

Has anyone seen examples of SHACL used for metadata input, and not just validation? I'm curious as to how well it would express a user-facing profile.

(added) Here's a SHACL version of the schema.org vocabulary: http://datashapes.org/schema. This describes the vocabulary for validation, and for each schema.org class lists the related properties. (It describes the vocabulary without constraints; constraints are in a separate example file.)

nicholascar commented 5 years ago

@kcoyle said: "Has anyone seen examples of SHACL used for metadata input, and not just validation? I'm curious as to how well it would express a user-facing profile."

TopBraid, the tool, uses SHACL to generate forms that are then used to collect human input according to the SHACL shape. Such use (form generation) was always one of TopQuadrant's SHACL use cases. We've also recently had a student work on making HTML forms, usable for human input, from SHACL shapes graphs.
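
As a very rough picture of what form generation from cardinality might involve, the Python sketch below turns a property's min/max counts into an HTML input field. It is not how TopBraid or the student project works; the function name `render_field` and the `data-repeatable` attribute are hypothetical, and the only assumption about HTML is the standard `required` attribute.

```python
from html import escape
from typing import Optional


def render_field(label: str, name: str,
                 min_count: int, max_count: Optional[int]) -> str:
    """Render one form field from a property's cardinality.

    max_count=None means "unbounded".
    """
    # A minimum of one or more makes the field required.
    required = " required" if min_count > 0 else ""
    # Mark repeatable fields so a UI script could offer an "add another" control.
    repeatable = max_count is None or max_count > 1
    repeat_attr = ' data-repeatable="true"' if repeatable else ""
    return (
        f"<label>{escape(label)} "
        f'<input name="{escape(name, quote=True)}"{required}{repeat_attr}>'
        "</label>"
    )
```

Calling `render_field("Name", "name", 1, 1)` produces a required, single-valued field, which corresponds to the `sh:minCount 1` constraint in the SHACL example earlier in the thread.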

rob-metalinkage commented 5 years ago

Having recommendations and warnings that do not constitute validation errors suggests that we may need to provide an example of a profile where the implementing resource has multiple roles: validation and guidance. Is input form specification a separate role too?

My feeling is that the profiles ontology should provide pre-defined definitions for such obvious roles, especially where they directly support implementation of profile guidance recommendations. (But also allow these to be refined and extended for specific applications.)

rob-metalinkage commented 5 years ago

Mandatory, repeatable, and not repeatable are redundant if you have min and max cardinality. Given that OWL and SHACL support the latter, is it just adding potential for confusion to include additional equivalent options?

kcoyle commented 5 years ago

@rob-metalinkage While implementable profiles will undoubtedly use min and max, if you look at human-facing instructions you will find the use of "mandatory, repeatable, etc." along with "recommended" and at times "mandatory if applicable" and "recommended if applicable". These latter cannot be expressed mathematically. That means that we have two kinds of cardinality: one that can be expressed with min/max, and one that comes in the form of input instructions. For human-facing documentation, the min/max rules can be expressed as "mandatory...etc.". It depends on who you are "talking" to at that moment. And I think we do need to make that equivalence in the document if we intend to cover the human-facing aspect of profiles.

rob-metalinkage commented 5 years ago

I don't see it as too hard to use such terms based on cardinality constraints, but what you are doing has two aspects: (1) formalising constraints, and (2) recommending specific language to explain constraints.

Probably best not to confuse these.

The actual guidance could be something along the lines of: 1) Where applicable, fully or partially describe profiles with a formal constraint language appropriate to the specification being profiled, for example use SHACL or ShEx to constrain RDF vocabularies. 2) Offer some formalism for constraints not expressible via standard languages (unless DXWG wants to spin up a micro-ontology to support those we identify, such as "mandatory when"). 3) Provide human-readable documentation of the constraints defined by the profile using the following language....

kcoyle commented 5 years ago

@rob-metalinkage said: "is input form specification a separate role too?"

My feeling about the roles is that they need to be a separate vocabulary namespace and it should be easy to add roles. I can imagine someone having a document that is written as input to a user interface in a particular software who would want to express: "input to Libris UI".

One place where I think we'll have difficulty is in the roles surrounding validation - describing exactly what is being validated may be hard to define in a role. I see mentions of constraints and part constraints in the profileDesc vocabulary, but what they mean is not going to be easy to express in a standard way. I was already thinking of opening an issue on validation - I'll do that now. (Done #449)

nicholascar commented 5 years ago

@kcoyle said: "My feeling about the roles is that they need to be a separate vocabulary namespace"

Yes, agreed, we want to follow the pattern that ISO uses: a data model (ontology) that's fairly fixed, but vocabs (instances) that can be published somewhat separately and added to.

For this reason we kept the ResourceRole instances separate from the ontology. The listed roles are clearly far from complete but indicate where roles should be placed.

kcoyle commented 5 years ago

@rob-metalinkage said: "example of a profile where the implementing resource has multiple roles" -

I do know of use cases (using Europeana vocabulary, as I recall) so if needed we can have a real example of a profile that returns information about guidance from a validation step. It's easy to find profiles that include UI information (display forms, short definitions) and even complex input rules. In fact, the DCAT-AP is a good example of this.