dcmi / dctap

DC Tabular Application Profile
https://dcmi.github.io/dctap/

Requiring one of two properties #98

Open sfolsom opened 1 year ago

sfolsom commented 1 year ago

I've tried to find in the DCTap specification if there's a way to capture that one of two properties/values is required, but I haven't found an obvious path forward.

Our use case is that in BIBFRAME you can signal that something is a monograph/text using one of two properties (rdf:type or bf:content), and when making a shape for monographs, we want to say one or the other needs to be present. (Similar to sh:or, https://www.w3.org/TR/shacl/#OrConstraintComponent)

Is this possible currently in DCTap? My hunch is that it might be too much to ask of a tab format, or not provisioned for yet.

philbarker commented 1 year ago

@sfolsom The general rule of thumb / heuristic is that "multiple values in any table cells are to be treated as alternatives to each other" (from the DCTAP Primer), so in principle putting two entries in the propertyID column for a row (i.e. a StatementTemplate with two propertyIDs) would indicate that they are alternatives. [We certainly stopped short of defining formal semantics.]

As far as I know we haven't tested this with regard to how it transforms to SHACL (though I too have a need to do this), but this is our (or at least my) thinking at the moment:

| PropertyID | PropertyLabel | mandatory | valueShape |
|---|---|---|---|
| ex:address ex:phone | Contact | true | |
| ex:address | | | AddressShape |
| ex:phone | | | PhoneShape |

I would read this as: for the Contact property, either ex:address or ex:phone is mandatory, with ex:address and ex:phone each specified by its own StatementTemplate (in ways not shown). You could add a Note element to the "contact" PropertyTemplate to clarify what is meant.

As I say, I too have a need to express this information, and to turn it into SHACL, so please let me know if you see anything you think won't work. There could be hidden demons when building out the SHACL Shapes.
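A minimal sketch of how that reading might map to SHACL, offered as an illustration rather than as the behavior of any existing DCTAP-to-SHACL tool. It assumes the alternatives in a propertyID cell are separated by whitespace (as in the table above) and that a mandatory row with several alternatives becomes an `sh:or` over property shapes with `sh:minCount 1`, the pattern used in the SHACL specification's `sh:or` example. The helper names `alternatives` and `row_to_shacl` and the CSV serialization of the table are hypothetical.

```python
import csv
import io

# The table from this thread, written out as CSV (an assumed serialization).
TAP_CSV = """propertyID,propertyLabel,mandatory,valueShape
ex:address ex:phone,Contact,true,
ex:address,,,AddressShape
ex:phone,,,PhoneShape
"""


def alternatives(cell):
    """Split a propertyID cell into alternatives (whitespace separator is an assumption)."""
    return cell.split()


def row_to_shacl(row):
    """Return a Turtle fragment for a mandatory row, or None if the row is not mandatory."""
    if row["mandatory"].strip().lower() != "true":
        return None
    props = alternatives(row["propertyID"])
    if len(props) == 1:
        # A single mandatory property needs no sh:or.
        return f"sh:property [ sh:path {props[0]} ; sh:minCount 1 ]"
    # Several alternatives: at least one of them must be present.
    members = " ".join(f"[ sh:path {p} ; sh:minCount 1 ]" for p in props)
    return f"sh:or ( {members} )"


rows = list(csv.DictReader(io.StringIO(TAP_CSV)))
fragments = [f for f in map(row_to_shacl, rows) if f is not None]

for fragment in fragments:
    print(fragment)
```

Under these assumptions only the first row yields a constraint; the two rows with a blank mandatory cell contribute nothing here, and their valueShape handling is left out of the sketch.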

sfolsom commented 5 months ago

@philbarker I finally had reason to think about this more closely, and I think what you've written above (multiple properties within the same cell) works for our purposes. As you said though, we'll have to test this in the DCTap to SHACL conversion tools we're using.

Note: previously I had misread the DCTAP documentation and thought it meant there could only be one propertyID, not that there had to be one propertyID column.

kcoyle commented 5 months ago

Looking at this:

| PropertyID | PropertyLabel | mandatory | valueShape |
|---|---|---|---|
| ex:address ex:phone | Contact | true | |
| ex:address | | | AddressShape |
| ex:phone | | | PhoneShape |

I'm a bit nervous about leaving the mandatory column blank, but I don't know what the conclusion is when the two properties are coded as mandatory=true. Is "X or Y" mandatory, or are X and Y both mandatory? If the set "X or Y" is mandatory, can the statements for X and Y be optional? That's what we have in the Cookbook.

Vaguely, I wonder if we don't need a way to encode a mandatory blank node that would then have optional or mixed mandatory/optional properties. We can't do this now because the propertyID is required. (Is this going too far? Should I forget it?)

sfolsom commented 5 months ago

I read it as an OR when multiple properties are listed in a cell: one is optional if the other is present. It seems to be the only way to bind two properties together to say one or the other. If both are mandatory regardless of the other, that can be captured in two separate rows. This works for our use case.
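To make the difference between the two encodings concrete, here is a small sketch under the same informal reading: a mandatory cell is satisfied when at least one of its whitespace-separated alternatives is present on the described resource. The function name `unsatisfied_cells` is hypothetical, and this is not how any DCTAP tool is known to behave.

```python
def unsatisfied_cells(mandatory_cells, present_properties):
    """Return the mandatory propertyID cells for which no listed alternative is present."""
    present = set(present_properties)
    return [cell for cell in mandatory_cells if not set(cell.split()) & present]


# A monograph description that uses only bf:content.
present = ["bf:content", "bf:title"]

# One row with both properties in the cell: read as "one or the other".
one_row = ["rdf:type bf:content"]

# Two separate rows: each property is mandatory regardless of the other.
two_rows = ["rdf:type", "bf:content"]

print(unsatisfied_cells(one_row, present))
print(unsatisfied_cells(two_rows, present))
```

With the single combined row the description passes; with two separate rows it fails on `rdf:type`.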

The blank node for properties is an interesting question, though I can't think of a need for it in our project's work. Our use cases are predicated (pun intended) on specific properties for specific types of things. In theory, the only reason I can think of to allow a blank node for a property is to say, "We don't care what property you use, but this thing/literal needs to be linked to/asserted." Again, we don't have that use case.