python-jsonschema / check-jsonschema

A CLI and set of pre-commit hooks for jsonschema validation with built-in support for GitHub Workflows, Renovate, Azure Pipelines, and more!
https://check-jsonschema.readthedocs.io/en/stable

Validating a deeply nested schema using custom "strict" draft 2020 metaschema can be very slow #264

Closed cweiske closed 11 months ago

cweiske commented 1 year ago

I am using check-jsonschema 0.23.0 with jsonschema 4.17.3 to validate a JSON schema against a strict JSON Schema metaschema.

When I run check-jsonschema, it runs for at least a minute before I kill it. It looks like an infinite loop somewhere. Validating against the "normal" JSON Schema metaschema works.

$ check-jsonschema --schemafile ~/dev/tools/json-schema-schema/json-schema.schema.json api-rest-connect-stick-stick-xxx.schema.json 
ok -- validation done

$ check-jsonschema --schemafile ~/dev/tools/json-schema-schema/strict.json api-rest-connect-stick-stick-xxx.schema.json 
(ctrl-c pressed after 1 minute)
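
For a repeatable version of the manual ctrl-c above, the run can be wrapped in a timeout. This is only a minimal sketch: `run_with_timeout` and `check_command` are hypothetical helper names, the 60-second limit is an arbitrary choice, and the only CLI option used is the `--schemafile` flag shown in the commands above.

```python
import subprocess
import time


def check_command(schemafile, instancefile):
    """Build the same check-jsonschema invocation used in the commands above."""
    return ["check-jsonschema", "--schemafile", schemafile, instancefile]


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (returncode, elapsed_seconds).

    returncode is None if the command was still running after timeout_s
    seconds and had to be killed.
    """
    start = time.perf_counter()
    try:
        completed = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
        returncode = completed.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        returncode = None
    return returncode, time.perf_counter() - start


# Example (not executed here; the file names are placeholders):
#   code, elapsed = run_with_timeout(
#       check_command("strict.json", "my.schema.json"), timeout_s=60
#   )
#   print("timed out" if code is None else f"exit {code}", f"after {elapsed:.1f}s")
```

Running both schema files through the same helper gives two comparable timings rather than "ok" versus "killed by hand".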

The schema I am validating against is this: https://github.com/orgs/json-schema-org/discussions/380#discussioncomment-5711007

The schema file I am writing (which was generated by genson) is attached here: api-rest-connect-stick-stick-xxx.schema.json.gz

(I originally opened this against jsonschema but was told to file it here: https://github.com/python-jsonschema/jsonschema/issues/1097)

sirosen commented 1 year ago

I think this might have been a slight miscommunication on the jsonschema ticket in question, so I've replied there, where it will be more visible to the jsonschema maintainers. But I'm going to leave this open until we understand the cause a bit better. It may be that check-jsonschema can and should optimize its usage of jsonschema (in which case the ticket belongs here), or this may be an inherently slow case for jsonschema, or some other issue.

On my machine, I was able to validate your example document, but it did take 5 minutes 58 seconds wall time.

cweiske commented 11 months ago

Fixed in https://github.com/python-jsonschema/jsonschema/issues/1097

sirosen commented 11 months ago

It's only 80% of the way fixed, I would say. :sweat_smile: check-jsonschema isn't using the new (faster) implementation from jsonschema yet. But it's an active work in progress.

I plan to check this example after the updates are done and see how it performs.

So please, keep an eye out for a future release which refers to the new referencing implementation!