python-jsonschema / check-jsonschema

A CLI and set of pre-commit hooks for jsonschema validation with built-in support for GitHub Workflows, Renovate, Azure Pipelines, and more!
https://check-jsonschema.readthedocs.io/en/stable
Other
207 stars 40 forks source link

Ref resolve failure when using custom metaschema #293

Closed stenbein closed 1 year ago

stenbein commented 1 year ago

On July 12th it was noticed that our github action using check-jsonschema was failing. No update had been applied to the code so it was believed to be a build time/runtime issue. The following error appears:

Failure resolving $ref within schema

_RefResolutionError: Expecting value: line 1 column 1 (char 0)
  in "/root/.cache/pre-commit/repoef89ztua/py_env-python3.9/lib/python3.9/site-packages/check_jsonschema/checker.py", line 77
  >>> result = self._build_result()

  caused by

  JSONDecodeError: Expecting value: line 1 column 1 (char 0)
    in "/root/.cache/pre-commit/repoef89ztua/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 1095
    >>> document = self.resolve_remote(url)

    caused by

    JSONDecodeError: Expecting value: line 1 column 1 (char 0)
      in "/root/.cache/pre-commit/repoef89ztua/py_env-python3.9/lib/python3.9/site-packages/requests/models.py", line 971
      >>> return complexjson.loads(self.text, **kwargs)

      caused by

      StopIteration: 0
        in "/usr/local/lib/python3.9/json/decoder.py", line 353
        >>> obj, end = self.scan_once(s, idx)

        caused by

        KeyError: 'https://json-schema.org/draft/2020-12/meta/meta/applicator'
          in "/root/.cache/pre-commit/repoef89ztua/py_env-python3.9/lib/python3.9/site-packages/jsonschema/validators.py", line 1092
          >>> document = self.store[url]

In 'https://json-schema.org/draft/2020-12/meta/meta/applicator' the second meta doesn't exist. If you monkey patch the code to replace 'meta/meta/' with 'meta' the code will proceed to fail on meta/meta/meta/. Eventually it will run correctly if you repeat this several times.

Back ground: We have a slightly modified Draft202012 metaschema with an added "unevaluatedProperties": false to catch typos. This is run as a github action using the following pre-commit config.

 - repo: https://github.com/python-jsonschema/check-jsonschema
    rev: 0.20.0
    hooks:
      - id: check-jsonschema
        files: ^src/common/schemas/.*\.json$
        args: [--schemafile, src/common/config/meta_schema.json]

EDIT: Using the meta schema directly with jsonschema does not cause an error. The error occurs when using check-jsonschema v 0.8 or later.

sirosen commented 1 year ago

Hi, thanks for reporting this! It sounds like a relative URI resolution issue -- possibly (from some very cursory poking around) a dynamicRef problem.

I'm actually in the midst of updating ref resolution (#289), so there's a body of work which may fix this issue. But I'd like to have a test case to confirm the fix. Are you able to share your modified metaschema? Or, barring that, can you describe how, if at all, it differs from the one provided by json-schema.org here: https://json-schema.org/draft/2020-12/meta/unevaluated ?

I'll try to look into this bug in the coming days, but I can't promise any specific timeline yet.

stenbein commented 1 year ago

@sirosen it is exactly this schema https://json-schema.org/draft/2020-12/schema with two added properties:

        "$$target": {
            "$comment": "\"$$target\" is supported for use with sphinx \"$$target\". The definition is lifted from https://json-schema.org/draft/2020-12/meta/core",
            "$ref": "meta/core#/$defs/uriReferenceString",
            "deprecated": false
        }
    },
    "unevaluatedProperties": false

However, using the default schema alone also fails. Instead it will pass (though of dubious utility to do so) if these are removed.

    "allOf": [
        {"$ref": "meta/core"},
        {"$ref": "meta/applicator"},
        {"$ref": "meta/unevaluated"},
        {"$ref": "meta/validation"},
        {"$ref": "meta/meta-data"},
        {"$ref": "meta/format-annotation"},
        {"$ref": "meta/content"}
    ],

I'm open to the idea that we're doing it wrong™ as well. Just trying to understand what is happening before we decide what we want to do about it.

sirosen commented 1 year ago

I've just been able to circle back to this and add a test case based on the information you provided which reproduces the issue.

The test passes on the work in #289, so I feel quite confident that this is/was a bug in reference resolution which is resolved by that change.

The new test is now part of that branch, so we can be confident that this will be fixed when that's released. I'm still working through some additional tests which I need to add, after which I think it will be safe to merge. Until then, I'm not sure that there's a good workaround possible within check-jsonschema.

stenbein commented 1 year ago

Since this broke without us updating the check-jsonschema version. After jsonschema updated, with the requirements unpinned for that library in the build... is it possible that update caused this?

sirosen commented 1 year ago

Sorry to be slow to get back to you; yes, it's definitely possible that something broke in an upstream update and percolated down to you. July 12th correlates with a few bugfixes which were released in jsonschema, specifically with the now-legacy RefResolver.

I've just merged the change to use the new implementation, and should have a check-jsonschema release out shortly. So I'm going to feel comfortable closing this soon. If you want a deeper analysis, I can help get this reported upstream to jsonschema in a format which will be digestible there, but I don't think it's strictly necessary.

sirosen commented 1 year ago

check-jsonschema v0.24.0 has been released. Please let me know right away if it doesn't resolve this problem or if you see other issues!

stenbein commented 1 year ago

@sirosen looks like it works! Or at least the error is new. Will take it from here, I think that solved the issue.

electriquo commented 1 year ago

check-jsonschema v0.24.0 has been released. Please let me know right away if it doesn't resolve this problem or if you see other issues!

@sirosen upgrading from 0.23.3 to 0.24.0 caused this error to occur. for instance, it happens with:

i think that the issue needs to be reopened. what do you advise?

sirosen commented 1 year ago

My concern here is that these schemas may contain errors, and that by changing the implementation to be more correct, those errors are now apparent. The facts that the old implementation worked on them and other validators like AJV might work doesn't necessarily prove that the schemas are well-formed. $ref resolution is a little bit subtle.

This definitely merits investigation, but I'd prefer to do it in a separate issue rather than reopening this one. The causes of these errors are different and I'd like to keep the threads separate.

sirosen commented 1 year ago

Hm. I just reread what I thought was an error in one of those schemas and I was mistaken; it looks fine. I'll need to look more deeply to get a clear idea of what's happening.

(But expect me to file a new issue shortly.)

electriquo commented 1 year ago

@sirosen

expect me to file a new issue shortly

will be waiting closely. thank you :)

sirosen commented 1 year ago

@foolioo, could you file a new issue with details on what you're seeing? Including not just a reference to the source schema but whatever error you're seeing? I just tried the pre-commit-hooks schema and this worked fine:

$ check-jsonschema --schemafile https://json.schemastore.org/pre-commit-hooks.json <(echo '[{"id":"foo","name":"bar","entry":"qux","language":"r"}]')    
ok -- validation done

I jumped the gun earlier by starting to look for an explanation before confirming that I could reproduce the issue.

electriquo commented 1 year ago

@sirosen #297