RepreZen / KaiZen-OpenApi-Parser

High-performance Parser, Validator, and Java Object Model for OpenAPI 3.x

Validations may be executed multiple times for the same object #204

Open andylowry opened 6 years ago

andylowry commented 6 years ago

This happens, for example, when a value is incorporated through one or more references, possibly in addition to appearing as an inline element.

It's not enough to skip reference targets in validation, since some values may appear only via reference.

Probably what's needed is an IdentityHashSet of validated objects, maintained during validation.

This is actually much subtler than it appears. For example, perhaps there's a case where the same numeric value must be non-negative in one context but not in another. Clearly, if the more constrained context is encountered later than the other, it would be a mistake to suppress the non-negative check just because the number has already been validated. Thought is needed.
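For illustration, here is a minimal sketch of the identity-based set mentioned above. The JDK has no class named IdentityHashSet, so this assumes one is built from `IdentityHashMap` via `Collections.newSetFromMap`. The class and method names (`IdentitySetSketch`, `newIdentitySet`) are hypothetical and not part of the KaiZen parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class IdentitySetSketch {
    // Hypothetical helper: a set whose membership test uses ==, not equals().
    static <T> Set<T> newIdentitySet() {
        return Collections.newSetFromMap(new IdentityHashMap<>());
    }

    public static void main(String[] args) {
        Set<Object> validated = newIdentitySet();
        List<String> a = new ArrayList<>(Arrays.asList("x"));
        List<String> b = new ArrayList<>(Arrays.asList("x")); // equals(a), but a distinct object

        System.out.println(validated.add(a)); // true: first time this object is seen
        System.out.println(validated.add(a)); // false: same object, already recorded
        System.out.println(validated.add(b)); // true: equal content, different identity
    }
}
```

Identity semantics matter here because two distinct model objects can be equal by value and still each need validation. As noted above, such a set only records that an object was seen, so by itself it cannot tell whether a later context imposes stricter rules.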

tedepstein commented 6 years ago

@andylowry, Some thoughts:

> It's not enough to skip reference targets in validation, since some values may appear only via reference. Probably what's needed is an IdentityHashSet of validated objects, maintained during validation.

Agreed, we do need to validate the referenced objects, and we don't want to repeat that part of it.

> This is actually much subtler than it appears. For example, perhaps there's a case where the same numeric value must be non-negative in one context but not in another.

I think we have to distinguish between validation contexts. The first time we see a reference to a given object:

  1. Validate the referent object in its own context. If it's a Schema Object, for example, it needs to be a valid Schema, independent of any referrer.
  2. Add it to the IdentityHashSet so we don't have to validate it again (in its own context).
  3. If the referring context imposes constraints on the object (e.g. the referrer needs the Schema Object to be primitive-typed), validate those rules as well, and mark any failure as an error on the referrer, probably on the line containing the $ref property, or maybe on its immediate container.
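The three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: `Schema`, `visit`, `IS_PRIMITIVE`, and the referrer labels are hypothetical stand-ins, not KaiZen types or APIs, and the single intrinsic rule shown (a `minLength`, if present, must be non-negative) stands in for the full set of Schema Object rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

class ReferenceValidationSketch {
    /** Hypothetical stand-in for a parsed Schema Object. */
    static class Schema {
        final String type;
        final Integer minLength;
        Schema(String type, Integer minLength) { this.type = type; this.minLength = minLength; }
    }

    /** Example of a referrer-imposed constraint: the schema must be primitive-typed. */
    static final Predicate<Schema> IS_PRIMITIVE =
        s -> Arrays.asList("string", "number", "integer", "boolean").contains(s.type);

    final Set<Schema> validated = Collections.newSetFromMap(new IdentityHashMap<>());
    final List<String> errors = new ArrayList<>();
    int intrinsicRuns = 0; // counts how often the intrinsic rules actually ran

    /** Visits one occurrence of a schema; referrerRule may be null if the context adds no constraint. */
    void visit(Schema target, String referrer, Predicate<Schema> referrerRule) {
        // Steps 1 and 2: intrinsic rules run only the first time this object is seen.
        if (validated.add(target)) {
            intrinsicRuns++;
            if (target.minLength != null && target.minLength < 0) {
                errors.add("schema: minLength must be non-negative");
            }
        }
        // Step 3: referrer-specific rules run on every occurrence and are reported on the referrer.
        if (referrerRule != null && !referrerRule.test(target)) {
            errors.add(referrer + ": referenced schema violates a constraint of this context");
        }
    }

    public static void main(String[] args) {
        ReferenceValidationSketch v = new ReferenceValidationSketch();
        Schema shared = new Schema("object", null);
        v.visit(shared, "inline definition", null);
        v.visit(shared, "parameter A", IS_PRIMITIVE);
        v.visit(shared, "parameter B", IS_PRIMITIVE);
        System.out.println(v.intrinsicRuns); // 1
        System.out.println(v.errors.size()); // 2
    }
}
```

In this sketch the shared schema is checked against its own rules once, while each referrer that requires a primitive type gets its own error entry, which matches the idea of attaching the failure to the referring location rather than to the referent.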

> Clearly, if the more constrained context is encountered later than the other, it would be a mistake to suppress the non-negative check just because the number has already been validated. Thought is needed.

I am assuming we need to validate the referent object in every context where it's referenced, but only if the referrer has constraints that apply to that object, and only against those constraints. We should not need to validate the object in its own context more than once. In the above example, once we've verified that the referent Schema Object follows the rules that apply intrinsically to all Schema Objects, we should not need to re-evaluate those rules.