Annotation specification of `properties` could be improved

gregsdennis commented 2 years ago

I had a realization today about what the text in section 10.3.2.1 says about the annotation result of properties

The annotation result of this keyword is the set of instance property names matched by this keyword.

How I interpreted this

I originally took this to mean that instance properties that were listed in properties but were not matched merely weren't included in the annotation output. So in this schema and instance

{
  "properties": {
    "foo": true,
    "bar": false
  },
  "additionalProperties": true
}

{ "foo": 1, "bar": 1 }

properties would produce [ "foo" ] because foo would always pass, and bar would never pass. The impact of this would be that bar would then also be evaluated by additionalProperties, which would pass.

This never really made sense to me because /properties/bar would fail validation overall anyway, which means it doesn't matter that bar passes additionalProperties.

My realization

I now see that this requirement was stated this way to address the annotation result of the case that not all properties were present in the instance rather than the case where one of the properties failed validation.

This means that in the schema above and the instance { "foo": 1, "baz": 1 }, properties would produce [ "foo" ] because bar isn't present in the instance and baz isn't listed in properties.

So that means that properties produces an annotation of the intersection of the properties it declares and the properties the instance contains, regardless of whether the property's value passes the associated subschema.

In the second case, it generally doesn't matter whether additionalProperties is depends on the annotation because the schema's going to fail anyway.

Why does this matter?

It matters in error output. To illustration this, let's change additionalProperties to false:

{
  "properties": {
    "foo": true,
    "bar": false
  },
  "additionalProperties": false
}

{ "foo": 1, "bar": 1 }

In my original interpretation, additionalProperties evaluates bar and produces an error.

In my new interpretation, it doesn't evaluate bar and so no error is produced.

What am I asking?

Can we improve this text to more clearly specify which of these cases (or a third case, even) is intended? Personally, I like the "intersection" language I used above.

I think it was the "matched by this keyword" that threw me off. I took "matched" to mean "successfully validated."

(I think this has come up somwhere in a conversation between @karenetheridge and me.)

handrews commented 2 years ago

So that means that properties produces an annotation of the intersection of the properties it declares and the properties the instance contains, regardless of whether the property's value passes the associated subschema.

Correct.

I think it was the "matched by this keyword" that threw me off. I took "matched" to mean "successfully validated."

No, the word "matched" was used for properties and patternProperties instead of "passed" on purpose. We could use "present in the instance" for properties, I guess. [EDIT: In retrospect, "matched" reads well for patternProperties but is ambiguous for properties].

This would also be solved by having annotation output tests, because if I've learned anything in spec writing its that no matter what words you choose, someone will read them differently.

handrews commented 2 years ago

Looking into the wording now. To fix this properly, we also need to revamp the introduction of "A Vocabulary for Unevaluated Locations" which does not clearly distinguish between evaluation via dynamic subscopes of in-place applicators (which must succeed to be considered evaluated) or evaluation by adjacent keywords (which, as noted here, are considered evaluated regardless of outcome).

karenetheridge commented 2 years ago

So that means that properties produces an annotation of the intersection of the properties it declares and the properties the instance contains, regardless of whether the property's value passes the associated subschema.

That is correct. We had this conversation in March, and I fixed my implementation as a consequence -- it cleared up a large portion of the complaints I had about unevaluated* producing too many redundant (and therefore confusing) errors. Some of the discussion should be in other github issues somewhere. If I recall, there was a single test in the test suite that flipped its expected value as well.

The specific notes I have on this (in my commit messages) are:

Keywords always propagate annotations from subschemas, even if its own evaluation was false. The false result of the containing subschema will eventually cause the annotations to be discarded, but in the meantime sibling keywords should be allowed to see these annotations.

Note that this means an applicator keyword such as properties can produce both an annotation and an error at the same time, from the same schema location + data location tuple. This may have implications on the design of output formats.

edit: https://github.com/json-schema-org/json-schema-spec/pull/1154 looks to cover some of this.

gregsdennis commented 2 years ago

We had this conversation in March

I remember having this discussion, but I drew a different conclusion (which seems to have resulted in a similar-enough outcome). The change I made to my implementation was merely to drop annotations at the schema object level.

What I didn't do was update so that annotations at the keyword level were still produced.

So for

{
  "properties": {
    "foo": true,
    "bar": false
  },
  "additionalProperties": false
}

{ "foo": 1, "bar": 1 }

My new understanding would have bar in the annotation result from properties even though its subschema failed, whereas my previous understanding simply wouldn't produce that annotation (because its subschema failed). I interpreted "matched" as "subschema passes." I would carry on to drop all of the annotations at the schema object (which in this case is the root).

json-schema-org / json-schema-spec