python-jsonschema / check-jsonschema

A CLI and set of pre-commit hooks for jsonschema validation with built-in support for GitHub Workflows, Renovate, Azure Pipelines, and more!
https://check-jsonschema.readthedocs.io/en/stable

Version 0.24.0 Failure resolving $ref within schema #297

Closed electriquo closed 11 months ago

electriquo commented 11 months ago

After upgrading from version 0.23.3 to version 0.24.0, check-jsonschema fails with "Failure resolving $ref within schema"; see the traceback below.

Failure resolving $ref within schema

_WrappedReferencingError: Unresolvable: https://json.schemastore.org/pre-commit-hooks.json#/definitions/file_types
  in "/Users/foolioo/.cache/pre-commit/repoavw68h25/py_env-python3.11/lib/python3.11/site-packages/check_jsonschema/checker.py", line 78
  >>> result = self._build_result()

  caused by

  Unresolvable: https://json.schemastore.org/pre-commit-hooks.json#/definitions/file_types
    in "/Users/foolioo/.cache/pre-commit/repoavw68h25/py_env-python3.11/lib/python3.11/site-packages/jsonschema/validators.py", line 446
    >>> resolved = self._resolver.lookup(ref)

    caused by

    Unretrievable: 'https://json.schemastore.org/pre-commit-hooks.json'
      in "/Users/foolioo/.cache/pre-commit/repoavw68h25/py_env-python3.11/lib/python3.11/site-packages/referencing/_core.py", line 586
      >>> retrieved = self._registry.get_or_retrieve(uri)

      caused by

      UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte
        in "/Users/foolioo/.cache/pre-commit/repoavw68h25/py_env-python3.11/lib/python3.11/site-packages/referencing/_core.py", line 336
        >>> resource = registry._retrieve(uri)

The execution is via pre-commit; here are more details:

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/python-jsonschema/check-jsonschema
    rev: 0.24.0
    hooks:
      - id: check-jsonschema
        name: check-jsonschema pre-commit
        files: \.pre-commit-config\.yaml
        types:
          - yaml
        args:
          - --schemafile
          - https://json.schemastore.org/pre-commit-config.json

sirosen commented 11 months ago

What is the usage which causes this? I haven't been able to reproduce, so I suspect there's more to this than meets the eye.

I'd also be curious to hear what platform (Linux, macOS, Windows) and python version you're using, in case it's related.

electriquo commented 11 months ago

@sirosen the issue was updated with more details

electriquo commented 11 months ago

I haven't been able to reproduce

@sirosen were you able to reproduce given all the (updated) details in the issue's description?

sirosen commented 11 months ago

Yes; just this morning I found that it is possible, but I needed to use an input that would cause traversal of one of the problematic $refs. I still haven't had enough free time to look into why this fails, but using

# example-pcc.yaml
repos:
- repo: https://github.com/sirosen/nosuchrepo
  rev: 100
  hooks:
    - id: spam
      types: [text]

the following command fails with the above error:

$ check-jsonschema --schemafile https://json.schemastore.org/pre-commit-config.json example-pcc.yaml

The key thing is that types: ... is specified, which causes a lookup against another remote schema, which somehow fails.

sirosen commented 11 months ago

I've tracked it down to a use of .raw vs .content from requests. I'm still trying to decide if I'm embarrassed or not, but it's a very simple mistake in the changes for this past release.

.raw shows the verbatim response, even if it's gzipped or otherwise encoded. .content shows bytes after gzip inflation, etc.
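A minimal sketch of that difference, using only the standard library so it runs without network access. It does not show the actual check-jsonschema code or fix. The schema dict is a made-up stand-in, and gzip.compress is assumed to approximate what a server sending "Content-Encoding: gzip" puts on the wire:

```python
import gzip
import json

# Hypothetical stand-in for the remote schema; not the real schemastore content.
schema = {"definitions": {"file_types": {"type": "string"}}}

# Roughly what a gzip-encoded HTTP response body looks like on the wire.
wire_bytes = gzip.compress(json.dumps(schema).encode("utf-8"))

# The ".raw"-style mistake: treating still-compressed bytes as UTF-8 text.
raw_error = None
try:
    wire_bytes.decode("utf-8")
except UnicodeDecodeError as err:
    raw_error = err

# The ".content"-style path: inflate first, then decode and parse.
decoded = json.loads(gzip.decompress(wire_bytes).decode("utf-8"))

print(raw_error)
print(decoded == schema)
```

A gzip stream starts with the magic bytes 0x1f 0x8b. The first is valid ASCII and the second is not a valid UTF-8 start byte, which is consistent with the "can't decode byte 0x8b in position 1" line in the traceback above.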

Using the wrong one leads to a failing parse, but the underlying cause is a little bit inscrutable because of

I've got the fix queued up and should be able to release it today.

sirosen commented 11 months ago

I was just able to circle back on this and push it out as v0.24.1.

As always, just let me know if you see issues; and thanks for reporting this and providing all of the detail!

electriquo commented 11 months ago

@sirosen 0.24.1 now works flawlessly