During validation, the jsonschema library produces a very rich, complicated ValidationError object that represents a tree of error causes. Hologram wraps this in validate, which in part calls jsonschema.exceptions.best_match(validator.iter_errors(data)). In 99% of cases this works great, but it falls apart for Unions, where the "best match" is frequently not helpful. As the author of a complicated JsonSchemaMixin, I sometimes have my own heuristic I'd like to use (for instance: if the error is about this key, it's the least likely to be the real issue).
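To make the Union problem concrete, here is a minimal sketch of the error tree jsonschema builds for an anyOf. The two schemas are hypothetical, hand-written stand-ins for what hologram might generate for a two-member Union; only the jsonschema calls themselves are real.

```python
import jsonschema

# Hypothetical stand-ins for the schemas of two Union members,
# differentiated by the "strategy" key.
CHECK = {
    "type": "object",
    "properties": {"strategy": {"enum": ["check"]}},
    "required": ["strategy", "check_cols"],
}
TIMESTAMP = {
    "type": "object",
    "properties": {"strategy": {"enum": ["timestamp"]}},
    "required": ["strategy", "updated_at"],
}
UNION_SCHEMA = {"anyOf": [CHECK, TIMESTAMP]}

# A timestamp config that is only missing "updated_at".
data = {"strategy": "timestamp"}

validator = jsonschema.Draft7Validator(UNION_SCHEMA)
top_level = list(validator.iter_errors(data))

# The single top-level error is the failed anyOf. The causes from every
# branch sit in its .context, and best_match has to pick one of them.
for error in top_level:
    print(error.validator, "->", error.message)
    for cause in error.context:
        print("   ", cause.validator, list(cause.path), cause.message)
```

The enum complaint from the check branch is the kind of message I'd want ranked last, since the user clearly meant the timestamp branch.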
jsonschema.exceptions.best_match actually accepts a key function that is used to prioritize the errors. I would like to be able to override that key in my class's validate, ideally without explicitly reaching into jsonschema itself. Here's an example of what I have written currently:
from dataclasses import dataclass, field
from typing import Any, Union

import hologram
import jsonschema

# ParsedNode, NodeType and the *SnapshotConfig classes come from my own project.


def _relevance_without_strategy(error: jsonschema.ValidationError):
    # calculate the 'relevance' of an error the normal jsonschema way, except
    # if the validator is in the 'strategy' field and it's conflicting with the
    # 'enum'. This suppresses "'timestamp' is not one of ['check']" and such
    if 'strategy' in error.path and error.validator in {'enum', 'not'}:
        length = 1
    else:
        length = -len(error.path)
    validator = error.validator
    return length, validator not in {'anyOf', 'oneOf'}


@dataclass
class ParsedSnapshotNode(ParsedNode):
    resource_type: NodeType = field(metadata={'restrict': [NodeType.Snapshot]})
    # this is a union of 3 types that are differentiated by the "strategy" key:
    # "check", "timestamp", or "anything else"
    config: Union[
        CheckSnapshotConfig,
        TimestampSnapshotConfig,
        GenericSnapshotConfig,
    ]

    @classmethod
    def validate(cls, data: Any):
        schema = hologram._validate_schema(cls)
        validator = jsonschema.Draft7Validator(schema)
        error = jsonschema.exceptions.best_match(
            validator.iter_errors(data),
            key=_relevance_without_strategy,
        )
        if error is not None:
            raise hologram.ValidationError.create_from(error) from error
That's gross! The only thing I actually wanted to override was the key function passed to best_match. A nice interface might be this on JsonSchemaMixin:
@classmethod
def _best_match_key(cls) -> Callable[[jsonschema.ValidationError], Any]:
    # this is the default
    return jsonschema.exceptions.relevance
Then JsonSchemaMixin.validate would pass key=cls._best_match_key() along to jsonschema.exceptions.best_match.