dcmi / dctap

DC Tabular Application Profile
https://dcmi.github.io/dctap/

Default cardinality #9

Closed tombaker closed 3 years ago

tombaker commented 3 years ago

In Issue #7 we are considering whether we need to support a default way to express boolean values for mandatory and repeatable. Do we also need to express default values for those elements? Put another way, do we need to express default assumptions about the cardinality of statement constraints?

I propose the following:

As people will not always follow our guidelines, however, we could frame this as a suggestion, point out that it fits common expectations of profile users, and note that it is defined as the default for ShEx (and perhaps other schema languages as well?). At the same time, we could acknowledge that profile creators may have very different default assumptions and encourage profile consumers to be cautious in the absence of explicitly defined values for mandatory and repeatable.

As an aside: The original Dublin Core (1995) defined all elements as optional and repeatable (i.e., mandatory = False, repeatable = True). Over the years, it came to be understood that these were, in effect, the defaults for a metadata vocabulary and that, if expressed as a profile, these constraints define a profile that is extremely tolerant, ready to match any elements found in the data, or even none. Indeed, this radical permissiveness was seen by some as evidence that Dublin Core was not really a standard at all.

kcoyle commented 3 years ago

Without defaults there is simply no constraint on cardinality, and that is what makes the most sense to me. For folks who leave off the mandatory/repeatable columns from their table it may not be obvious that they are actually setting cardinality rules. With no cardinality defaults there is simply no validation that would be done on cardinality. In effect, for data creation it is left up to the humans creating the data; for data validation, you take what you get.

The "exactly one arc" makes sense to me but I'm not sure that it will resonate with anyone not steeped in RDF. I could also see the reverse: mandatory = false; repeatable = true - which would be the "loosest" option. In any case, I'm wary of setting defaults.
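To make the options in this thread concrete, here is a minimal sketch of how a consumer might check the number of values found for a property against mandatory and repeatable. The function name, the setting names, and the use of `None` for "no value given in the profile" are assumptions made for illustration; none of this is part of DCTAP itself.

```python
from typing import List, Optional


def cardinality_violations(
    count: int,
    mandatory: Optional[bool],
    repeatable: Optional[bool],
) -> List[str]:
    """Check how many values a property has against one statement constraint.

    None means the profile gave no value, so that check is skipped
    (an assumption of this sketch, not a DCTAP rule).
    """
    problems = []
    if mandatory is True and count < 1:
        problems.append("mandatory value is missing")
    if repeatable is False and count > 1:
        problems.append("value is repeated but not repeatable")
    return problems


# The three readings discussed in this thread (names are illustrative).
EXACTLY_ONE = {"mandatory": True, "repeatable": False}   # ShEx-style default
LOOSEST = {"mandatory": False, "repeatable": True}       # original Dublin Core
NO_DEFAULTS = {"mandatory": None, "repeatable": None}    # nothing assumed

for count in (0, 1, 2):
    for name, setting in (
        ("exactly one", EXACTLY_ONE),
        ("loosest", LOOSEST),
        ("no defaults", NO_DEFAULTS),
    ):
        print(count, name, cardinality_violations(count, **setting))
```

In this sketch the "loosest" and "no defaults" settings accept exactly the same data; what differs is whether the profile states the permissiveness explicitly or leaves it to the reader.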

kcoyle commented 3 years ago

The Feb 3 meeting decided to defer a decision on defaults, but was comfortable with the following wording in the primer:

In the absence of these cardinality constraints, applications using this profile will need to assume default values of their own choosing. It is recommended to indicate these requirements in the profile to avoid misunderstandings about the nature of the metadata.
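As a rough illustration of what "default values of their own choosing" could look like in a consuming application, the sketch below reads a profile from CSV and fills in mandatory and repeatable only where the profile leaves them blank or omits the columns. The function names, the accepted boolean spellings, the `defaulted` flag, and the sample rows are all assumptions made for this example.

```python
import csv
import io
from typing import Dict, List, Optional


def parse_bool(cell: Optional[str]) -> Optional[bool]:
    """Return True/False for a recognized cell, None for blank or unknown.

    The accepted spellings are an assumption of this sketch.
    """
    text = (cell or "").strip().lower()
    if text in {"true", "1", "yes"}:
        return True
    if text in {"false", "0", "no"}:
        return False
    return None


def load_statements(
    csv_text: str,
    default_mandatory: bool,
    default_repeatable: bool,
) -> List[Dict[str, object]]:
    """Read statement constraints, applying the caller's own defaults."""
    statements = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # row.get() returns None when the column is absent from the table.
        mandatory = parse_bool(row.get("mandatory"))
        repeatable = parse_bool(row.get("repeatable"))
        statements.append({
            "propertyID": row["propertyID"],
            "mandatory": default_mandatory if mandatory is None else mandatory,
            "repeatable": default_repeatable if repeatable is None else repeatable,
            # Record that a value was assumed rather than stated in the profile.
            "defaulted": mandatory is None or repeatable is None,
        })
    return statements


# Invented sample profile: the second row leaves both cells blank.
SAMPLE = (
    "propertyID,mandatory,repeatable\n"
    "dcterms:title,true,false\n"
    "dcterms:subject,,\n"
)

for statement in load_statements(SAMPLE, default_mandatory=False, default_repeatable=True):
    print(statement)
```

Keeping a `defaulted` flag lets the application warn its users when a cardinality was assumed rather than declared, which is the kind of caution the primer wording recommends.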