Open siuc-nate opened 3 months ago
For what it is worth I tend to use the latter approach, gYear, gYearMonth and the like.
That's what I'm currently leaning towards as well.
@siuc-nate one fly in the ointment with allowing more than one datatype for the date properties is that currently the context file has things like
"ceterms:dateEffective": {
"@type": "xsd:date"
},
That would have to be removed and cannot be replaced if the datatype for values could be one of several options. The only way to specify the datatype would be in the instance data, e.g.
"@id": "http://example.org/something",
"ceterms:dateEffective": {
"@type": "http://www.w3.org/2001/XMLSchema#gYear",
"@value": "2011"
}
Alternatively, use "duck typing" on ingest (from the options in the property's range) as we discussed on the last call.
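To make the duck-typing idea concrete, here is a minimal Python sketch of what ingest-side inference might look like. The names `CANDIDATES`, `infer_datatype` and `to_value_object` are hypothetical, and the patterns are a simplification I am assuming for illustration: four-digit years only, no timezone suffixes, no negative years. It is not how any existing ingest code works.

```python
import re
from datetime import date

# Simplified lexical patterns (assumption: 4-digit years, no timezone suffix).
_YEAR = r"[0-9]{4}"
_MONTH = r"(?:0[1-9]|1[0-2])"
_DAY = r"(?:0[1-9]|[12][0-9]|3[01])"

# Candidate datatypes from the property's range, most specific first.
CANDIDATES = [
    ("xsd:date", re.compile(f"{_YEAR}-{_MONTH}-{_DAY}")),
    ("xsd:gYearMonth", re.compile(f"{_YEAR}-{_MONTH}")),
    ("xsd:gYear", re.compile(_YEAR)),
]


def infer_datatype(value):
    """Return the first candidate datatype whose lexical form matches, else None."""
    for datatype, pattern in CANDIDATES:
        if pattern.fullmatch(value) is None:
            continue
        if datatype == "xsd:date":
            # Reject impossible calendar dates such as 2024-02-30.
            try:
                date.fromisoformat(value)
            except ValueError:
                return None
        return datatype
    return None


def to_value_object(value):
    """Wrap a plain string in a typed JSON-LD value object, or raise ValueError."""
    datatype = infer_datatype(value)
    if datatype is None:
        raise ValueError(f"not a recognised date form: {value!r}")
    return {"@type": datatype, "@value": value}
```

With this, a plain `"2011"` in the incoming data would be stored as a value object typed `xsd:gYear`, and `"2024-10"` as one typed `xsd:gYearMonth`, without the publisher having to spell out the datatype.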
I wasn't aware of that limitation of JSON-LD until just now. That's really unfortunate and doesn't make much sense to me, but appears to be true. I went to see how schema.org handles multi-typing in the schema.org context file and found the following, some of which are odd to me:
This makes me question the usefulness of providing anything in the context file after the first block that provides what the URL prefixes map to. A consumer of the data is going to need to get the actual schema one way or another in order to find out what the ranges of properties are (especially for properties that are explicit URLs vs general URIs, as discussed elsewhere (#819)).
Also, the JSON-LD playground complains if you try this:
Perhaps we should remove all of the @type references from our context as well, so as not to mislead consumers of the data?
I wasn't aware of that limitation of JSON-LD until just now. That's really unfortunate and doesn't make much sense to me,
I'm afraid it's necessary. The context block maps values in the JSON objects to RDF, and it can only do a one-to-one mapping that works for all values, so yeah if you give it an array and say the value may be any one of these three datatypes you're not giving it information that will work.
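A rough way to picture the one-to-one mapping is the Python sketch below. It is a deliberately simplified stand-in for JSON-LD type coercion, not the real expansion algorithm, and `coerce_value` is a hypothetical helper. The point is only that a term definition carries a single `@type`, so every plain string for that term gets the same datatype regardless of what the string looks like.

```python
# Simplified stand-in for JSON-LD type coercion; NOT the full expansion algorithm.
CONTEXT = {
    "ceterms:dateEffective": {"@type": "xsd:date"},
}


def coerce_value(term, value, context):
    """Apply the term's single coerced datatype to a plain string value."""
    if not isinstance(value, str):
        # Non-string values are left untouched in this sketch.
        return value
    term_definition = context.get(term)
    coerced_type = None
    if isinstance(term_definition, dict):
        coerced_type = term_definition.get("@type")
    if coerced_type is None:
        # No coercion defined: the value stays an untyped string.
        return {"@value": value}
    return {"@type": coerced_type, "@value": value}


full = coerce_value("ceterms:dateEffective", "2011-05-01", CONTEXT)
partial = coerce_value("ceterms:dateEffective", "2011", CONTEXT)
```

Here `partial` ends up typed as `xsd:date` even though `"2011"` is not a valid `xsd:date` lexical form, which is the mismatch being discussed.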
Perhaps we should remove all of the @type references from our context as well, so as not to mislead consumers of the data?
Then everything would be xsd:strings, which probably isn't an improvement.
Setting the range to xsd:date, xsd:gYearMonth, and xsd:gYear, and the @type in the context to schema:Date, would not be wrong, as all of those formats are compatible with a date value in ISO 8601 format.
The alternative is Rohit's suggestion of a different property for each datatype.
Then everything would be xsd:strings, which probably isn't an improvement.
Wouldn't that also be a problem for schema.org's context? They don't specify a type value (in the context) for the vast majority of their properties, not even ones that would make sense as @type: @id. I think we create more confusion than we solve by specifying some types, but not others, or specifying them with a lesser degree of precision than that provided by the schema:rangeIncludes values for those properties. If people can figure out how to use schema.org's context with that as a limitation, it should work for us too, I'd think.
Schema.org has a convention that you can use a string value for most things instead of an @id and puts the onus on data consumers to figure out what is meant:
"We also expect that often, where we expect a property value of type Person, Place, Organization or some other subClassOf Thing, we will get a text string, even if our schemas don't formally document that expectation. "
https://schema.org/docs/datamodel.html
That makes being a data consumer of schema.org harder. Not an example I think we should follow.
I don't think data providers do try to figure out how to use terms based on the @context; they mostly work from examples or the term definitions. The @context is just one way of getting from JSON to RDF, and we should make sure the context file we provide does as good a job of that as possible.
Current proposals, per today's meeting:
Option one: Change the range of the properties listed above to use schema:Date instead, but limit our accepted variations to those described by xsd:date, xsd:gYearMonth, and xsd:gYear.
Option two: Change the range of the properties listed above to use xsd:date, xsd:gYearMonth, and xsd:gYear.
@philbarker Can you check these with your JSON-LD validation? Ideally, the second approach would work, since I think that satisfies the issues that were brought up in today's meeting:
{
"@type": "ceterms:Certificate",
"ceterms:dateEffective": {
"@type": "schema:Date",
"@value": "2024-10"
}
}
{
"@type": "ceterms:Certificate",
"ceterms:dateEffective": {
"@type": "xsd:gYearMonth",
"@value": "2024-10"
}
}
I was talking with @rohit-joy today and he raised some good ideas for how to tackle this:
The intent is to make the data easier to consume and use for a variety of potential/likely use cases.
@siuc-nate OK, that makes sense.
btw,
{
"@type": "xsd:gYear",
"@value": "2024"
}
is indeed valid & correct.
We have current use cases that require support for partial dates, e.g. 2024-08 and/or 2024, which I thought were supported by xsd:date, but actually are not (ISO 8601, which xsd:date is based on, supports them, but not xsd:date itself).
https://www.w3.org/TR/xmlschema-2/#date
https://www.w3.org/TR/xmlschema-2/#isoformats
https://www.w3.org/TR/xmlschema-2/#truncatedformats
I can see two potential solutions:
The second option is probably the safer of the two, as it avoids opening the door to other variations/truncations ISO8601 allows. That would also give us the flexibility to allow/deny, for example, YYYY-MM-DD and YYYY-MM formats but not YYYY formats for some properties (if we ever wanted to do that).
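As a sketch of the per-property flexibility mentioned above, the allowed datatypes could be kept in a simple lookup keyed by property. Everything here is an assumption for illustration: `ALLOWED_DATATYPES` and `is_allowed` are hypothetical names, and `ex:hypotheticalStrictDate` is a made-up property standing in for one where YYYY alone would be rejected.

```python
# Hypothetical per-property ranges; only ceterms:dateEffective comes from this thread.
ALLOWED_DATATYPES = {
    "ceterms:dateEffective": {"xsd:date", "xsd:gYearMonth", "xsd:gYear"},
    # Made-up property that accepts YYYY-MM-DD and YYYY-MM but not YYYY.
    "ex:hypotheticalStrictDate": {"xsd:date", "xsd:gYearMonth"},
}

# Assumed fallback for properties not listed above.
DEFAULT_DATATYPES = {"xsd:date"}


def is_allowed(prop, value_object):
    """Check a typed value object's @type against the property's allowed set."""
    allowed = ALLOWED_DATATYPES.get(prop, DEFAULT_DATATYPES)
    return value_object.get("@type") in allowed
```

This only checks the declared `@type`; checking that the `@value` actually matches that datatype's lexical form would be a separate step.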
We only need to update the properties where we specifically want to allow partial dates, but if we were to update all properties with a range of xsd:date, the list would be:
Note: I can't think of a case where we need to worry about partial xsd:dateTimes, so we're probably okay with those properties as-is, but they're worth considering, too.