Open Julian opened 1 year ago
blows up if one follows the SHOULD
I don't follow your logic here. Following the SHOULD means that contentSchema
is ignored in your example. I see nothing wrong with that.
It doesn't mean that identifiers aren't still registered, just that the keyword isn't evaluated.
Here's the language from draft 7's $ref
definition:
All other properties in a "$ref" object MUST be ignored.
where it was essentially universally understood that this means identifiers are not registered. As far as a quick skim, no other language is present there, that's how we chose to communicate that behavior.
So I think you're asserting something (that there's only one interpretation for the current language) which isn't consistent with historical wording.
But at worst even if you're right there I'd still think it's unclear rather than really undefined.
I think the intention was that that quote is in the context of using the contentSchema
keyword as an assertion (which is optional, off-by-default behavior). It doesn't mean that the contentSchema
sub-schema should be ignored, it means that it shouldn't apply contentSchema
as an assertion on the parsed string data. I think it's a "SHOULD" in order to allow for but discourage heuristic-based parsing of the string value.
So, I think it would be wrong for your example to produce an error. The text is certainly ambiguous and should be clarified in the next release, but I weight intention pretty high when there is ambiguity in the text and I'm pretty confident that was the intention.
Fair enough maybe it's not needed for us to argue or agree on how to interpret the current draft (I suspect we probably all agree it's quite unlikely anyone has implemented anything related to this at all, let alone anything depending on the small nonsense point I'm raising) so perhaps consider this just a "it seems like this line needs tweaking for the next version" ticket.
@Julian @jdesrosiers is there an action here? Do we need to clarify anything in the current text?
If it were me my recommendation was to remove that second sentence given that we no longer allow contentSchema to be used as an assertion, and/or to make it invalid to have contentSchema
without contentMediaType
in the text and metaschema, but it's fairly minor given I think nearly no one does anything with these keywords so if others don't agree I'm fine if you close.
We did receive a question about this in Slack in the past week, so I'll look at how it can be made more explicit, but I don't think any requirements need to change.
I think whether or not this needs action depends on the resolution of #1288.
According to Henry, the example given wouldn't be valid. The schema in contentSchema
is not a subschema. It's an annotation that happens to be in the shape of a schema. So, the $ref
in that example is invalid because it's referencing a location that is known to not be a schema. That's true regardless of whether contentMediaType
is present.
However, if it's decided that contentSchema
is a subschema of the schema it appears in, then this becomes an issue we should probably address in the next release.
I agree that it's worth revisiting that conversation. Primarily, I'd like to identify (good) use cases for considering it a subschema.
I can't think of any good reasons someone would want to $ref
into that subschema (i.e. creating an $id
in that subschema that is then referenced from somewhere else), however it might be handy to be able to $ref
out of it, e.g. into $defs
.
If it's not a subschema, then even something as simple as
{
// ...
"contentSchema": { "$ref": "#/$defs/my-content" },
"$defs": {
"my-content": { /* ... */ }
}
}
wouldn't be possible.
There are other considerations as well. For instance, contentSchema
not being a subschema violates the principle of least surprise. People will assume this is a normal subschema and be confused when they find it's not, which will lead to a lot of questions we have to answer. Another consideration is ease of implementation. It will vary given different architectures, but I know one option would be problematic for me to implement while the other would require nothing at all.
The text of 2020-12's contentSchema says:
In #1288 which I've found, there was some back and forth about what the first sentence there means (which I agree with -- i.e. that the current spec seems to treat this schema as any other subschema, by virtue of omission). But the second sentence still has undefined behavior, because "SHOULD" there isn't clear.
Specifically, this schema (or others like it) when validating the instance 12:
either blows up if one follows the SHOULD or else returns invalid if one ignores the SHOULD about
contentMediaType
being missing. (FWIW of the implementations Bowtie supports, there appear to be at least 1 implementation with both behaviors, though unclear whether they explicitly intend to do so).It seems unnecessary to have undefined behavior in this case (one reminiscent of old
$ref
), so my personal opinion would be to remove that second sentence (and not intimate that it's ok to ignore it), or to just make it invalid to havecontentSchema
present withoutcontentMediaType
(and add adependentRequired
to the metaschema).(n.b. #1352 already tweaked the language here coming out of #1288, but the second sentence is still present.)