python-jsonschema / jsonschema

An implementation of the JSON Schema specification for Python
https://python-jsonschema.readthedocs.io
MIT License

After updating to 4.18.1 earlier code fails with `jsonschema.exceptions._RefResolutionError: 'bytes' object has no attribute 'timeout'` #1124

Closed Alexander-Serov closed 1 year ago

Alexander-Serov commented 1 year ago

Here is the traceback. I believe something changed in today's 4.18.1 release, because our CI pipeline stopped working although our code has not changed.

I cannot provide a minimal working example, since jsonschema is not our direct dependency but is used by Great Expectations. Still, the traceback clearly suggests that something changed in how jsonschema is implemented.

We downgraded back to 4.17.* and it fixes the problem.

Traceback (most recent call last):
  File "/builds/repo/venv/lib/python3.8/site-packages/jsonschema/validators.py", line 1087, in resolve_from_url
    document = self.store[url]
  File "/builds/repo/venv/lib/python3.8/site-packages/jsonschema/_utils.py", line 20, in __getitem__
    return self.store[self.normalize(uri)]
KeyError: b''
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/builds/repo/venv/lib/python3.8/site-packages/jsonschema/validators.py", line 1090, in resolve_from_url
    document = self.resolve_remote(url)
  File "/builds/repo/venv/lib/python3.8/site-packages/jsonschema/validators.py", line 1194, in resolve_remote
    with urlopen(uri) as url:
  File "/var/lang/lib/python3.8/urllib/request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "/var/lang/lib/python3.8/urllib/request.py", line 515, in open
    req.timeout = timeout
AttributeError: 'bytes' object has no attribute 'timeout'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/builds/repo/repo/expectations.py", line 8, in <module>
    from great_expectations.checkpoint import SimpleCheckpoint
  File "/builds/repo/venv/lib/python3.8/site-packages/great_expectations/__init__.py", line 6, in <module>
    from great_expectations.data_context.migrator.cloud_migrator import CloudMigrator
  File "/builds/repo/venv/lib/python3.8/site-packages/great_expectations/data_context/__init__.py", line 1, in <module>
    from great_expectations.data_context.data_context import (
  File "/builds/repo/venv/lib/python3.8/site-packages/great_expectations/data_context/data_context/__init__.py", line 1, in <module>
    from great_expectations.data_context.data_context.abstract_data_context import (
  File "/builds/repo/venv/lib/python3.8/site-packages/great_expectations/data_context/data_context/abstract_data_context.py", line 119, in <module>
    from great_expectations.rule_based_profiler.data_assistant.data_assistant_dispatcher import (
  File "/builds/repo/venv/lib/python3.8/site-packages/great_expectations/rule_based_profiler/data_assistant/__init__.py", line 1, in <module>
    from .data_assistant import DataAssistant
  File "/builds/repo/venv/lib/python3.8/site-packages/great_expectations/rule_based_profiler/data_assistant/data_assistant.py", line 22, in <module>
    from great_expectations.rule_based_profiler.data_assistant_result import (
  File "/builds/repo/venv/lib/python3.8/site-packages/great_expectations/rule_based_profiler/data_assistant_result/__init__.py", line 1, in <module>
    from .data_assistant_result import DataAssistantResult
  File "/builds/repo/venv/lib/python3.8/site-packages/great_expectations/rule_based_profiler/data_assistant_result/data_assistant_result.py", line 48, in <module>
    from great_expectations.rule_based_profiler.altair import AltairDataTypes, AltairThemes
  File "/builds/repo/venv/lib/python3.8/site-packages/great_expectations/rule_based_profiler/altair/__init__.py", line 1, in <module>
    from .encodings import AltairDataTypes
  File "/builds/repo/venv/lib/python3.8/site-packages/great_expectations/rule_based_profiler/altair/encodings.py", line 6, in <module>
    class AltairDataTypes(Enum):
  File "/builds/repo/venv/lib/python3.8/site-packages/great_expectations/rule_based_profiler/altair/encodings.py", line 8, in AltairDataTypes
    QUANTITATIVE = alt.StandardType("quantitative")
  File "/builds/repo/venv/lib/python3.8/site-packages/altair/vegalite/v4/schema/core.py", line 15771, in __init__
    super(StandardType, self).__init__(*args)
  File "/builds/repo/venv/lib/python3.8/site-packages/altair/utils/schemapi.py", line 177, in __init__
    self.to_dict(validate=True)
  File "/builds/repo/venv/lib/python3.8/site-packages/altair/utils/schemapi.py", line 338, in to_dict
    self.validate(result)
  File "/builds/repo/venv/lib/python3.8/site-packages/altair/utils/schemapi.py", line 443, in validate
    return jsonschema.validate(
  File "/builds/repo/venv/lib/python3.8/site-packages/jsonschema/validators.py", line 1295, in validate
    error = exceptions.best_match(validator.iter_errors(instance))
  File "/builds/repo/venv/lib/python3.8/site-packages/jsonschema/exceptions.py", line 441, in best_match
    best = next(errors, None)
  File "/builds/repo/venv/lib/python3.8/site-packages/jsonschema/validators.py", line 359, in iter_errors
    for error in errors:
  File "/builds/repo/venv/lib/python3.8/site-packages/jsonschema/_validators.py", line 284, in ref
    yield from validator._validate_reference(ref=ref, instance=instance)
  File "/builds/repo/venv/lib/python3.8/site-packages/jsonschema/validators.py", line 452, in _validate_reference
    scope, resolved = resolve(ref)
  File "/builds/repo/venv/lib/python3.8/site-packages/jsonschema/validators.py", line 1076, in resolve
    return url, self._remote_cache(url)
  File "/builds/repo/venv/lib/python3.8/site-packages/jsonschema/validators.py", line 1092, in resolve_from_url
    raise exceptions._RefResolutionError(exc)
jsonschema.exceptions._RefResolutionError: 'bytes' object has no attribute 'timeout'
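
The innermost failure in the traceback above can be reproduced in isolation, without jsonschema. `urllib.request.urlopen` accepts either a `str` URL or a `Request` object, and treats anything else as a `Request`, so a `bytes` value fails when the opener sets `timeout` on it. A minimal sketch using only the standard library (the helper name `open_error` is made up for illustration); it says nothing about why jsonschema ended up passing a `bytes` URL:

```python
import urllib.request


def open_error(url):
    """Return the AttributeError message raised by urlopen(url), or None.

    Hypothetical helper, for illustration only.
    """
    try:
        urllib.request.urlopen(url)
    except AttributeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # An empty bytes URL, like the b'' key in the KeyError above.
    # No network access happens: the failure occurs before any request is sent.
    print(open_error(b""))
```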

syvillegas commented 1 year ago

Same as @Alexander-Serov here, literally same problem and same temporary solution for now.

Julian commented 1 year ago

I can't fix or address anything without some code.

ocefpaf commented 1 year ago

Looks like this commit is the cause. If I remove the "REMOVEME" code in validators.py, things seem to behave as expected. (Still investigating, though.)

Julian commented 1 year ago

That is essentially the only commit in the release -- but come on folks, it's bug filing 101 -- you need to include some sort of code to reproduce what you're reporting here -- all tests in this package pass, so whatever you're running needs an example.

ocefpaf commented 1 year ago

> That is essentially the only commit in the release

Made it easy to find ;-p

> That is essentially the only commit in the release -- but come on folks, it's bug filing 101

Sure. Like I mentioned above, still investigating. It is hard to dig into 5 levels of dependencies, in my case.

syvillegas commented 1 year ago

With `pip install great-expectations==0.15.50 jsonschema==4.18.0`, the line `from great_expectations.checkpoint import SimpleCheckpoint` works. With `pip install great-expectations==0.15.50 jsonschema==4.18.1`, that same line fails with `jsonschema.exceptions._RefResolutionError: 'bytes' object has no attribute 'timeout'`.
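
A quick way to tell whether an environment has the release reported as failing here is to compare the installed version against it. The sketch below is only illustrative: the set of versions is an assumption taken from the reports in this thread, not an official compatibility list, and the helper names are hypothetical. It handles plain numeric versions such as `4.18.1` only.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: taken from the reports in this thread, not an official list.
REPORTED_FAILING = {(4, 18, 1)}


def parse_version(text):
    """Turn '4.18.1' into (4, 18, 1). Plain numeric versions only."""
    return tuple(int(part) for part in text.split(".")[:3])


def is_reported_failing(version_text):
    """True if version_text is one of the releases reported as failing."""
    return parse_version(version_text) in REPORTED_FAILING


def installed_jsonschema_version():
    """Return the installed jsonschema version string, or None if absent."""
    try:
        return version("jsonschema")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    installed = installed_jsonschema_version()
    if installed is None:
        print("jsonschema is not installed")
    else:
        print(installed, "reported failing:", is_reported_failing(installed))
```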

jpmckinney commented 1 year ago

Fine on 4.18.0. Breaks on 4.18.1.

from jsonschema import Draft4Validator, RefResolver

class CustomRefResolver(RefResolver):
    def __init__(self, *args, **kw):
        super().__init__(*args, **kw)

    def resolve_remote(self, uri):
        return {}

schema = {
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "object",
  "properties": {
    "x": {
        "$ref": "#/definitions/x"
    },
    "y": {
        "$ref": "https://standard.open-contracting.org/schema/1__1__5/release-schema.json"
    }
  },
  "definitions": {
    "x": {"type": "string"}
  }
}

list(Draft4Validator(schema, resolver=CustomRefResolver("", schema)).iter_errors({"x": 1}))

yields

jsonschema.exceptions._RefResolutionError: Unresolvable JSON pointer: 'definitions/x'
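
For comparison, `#/definitions/x` in the example above is a JSON Pointer into the schema itself, and under RFC 6901 it should resolve to `{"type": "string"}`. A minimal standalone sketch of that lookup follows. It is not jsonschema's implementation, and `resolve_pointer` is a hypothetical name; the schema is a trimmed copy of the one above.

```python
from urllib.parse import unquote


def resolve_pointer(document, fragment):
    """Walk a JSON Pointer fragment such as '/definitions/x' through document."""
    fragment = fragment.lstrip("/")
    if not fragment:
        return document
    node = document
    for part in fragment.split("/"):
        # Undo percent-encoding, then JSON Pointer escaping:
        # '~1' means '/', '~0' means '~'.
        part = unquote(part).replace("~1", "/").replace("~0", "~")
        if isinstance(node, list):
            node = node[int(part)]
        else:
            node = node[part]
    return node


schema = {
    "type": "object",
    "properties": {"x": {"$ref": "#/definitions/x"}},
    "definitions": {"x": {"type": "string"}},
}

if __name__ == "__main__":
    print(resolve_pointer(schema, "/definitions/x"))
```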

Alexander-Serov commented 1 year ago

> That is essentially the only commit in the release -- but come on folks, it's bug filing 101 -- you need to include some sort of code to reproduce what you're reporting here -- all tests in this package pass, so whatever you're running needs an example.

As I said, we would love to, but this is not our direct dependency :) So I was hoping others would contribute.

💪🏻 teamwork

Julian commented 1 year ago

Great, thanks for the example, should be fixed momentarily in 4.18.2.

vincentsarago commented 1 year ago

Hi, as reported by @jpmckinney in https://github.com/python-jsonschema/jsonschema/issues/1124#issuecomment-1632574249 I'm also getting

jsonschema.exceptions._RefResolutionError: Unresolvable JSON pointer: 'definitions/assets'

The error occurs in a fairly large FastAPI application, but I'll try to dig in and create a reproducible example (https://github.com/developmentseed/titiler/actions/runs/5529545166/jobs/10087758840)

This is happening for 4.18.1 and for the newest 4.18.2

Edit: This is all coming from one of our dependencies: https://github.com/stac-utils/pystac/issues/1186

BTOdell commented 1 year ago

Also running into the "Unresolvable JSON pointer" error for 4.18.2. I'm using a custom yaml ref resolver and validating using the Draft202012Validator.

self = <func.utils.YamlRefResolver object at 0x7f4215e01720>
document = {'$id': 'http://json-schema.org/draft-07/schema#', '$schema': 'http://json-schema.org/draft-07/schema#', 'default': Tr...'type': 'array'}, 'simpleTypes': {'enum': ['array', 'boolean', 'integer', 'null', 'number', 'object', ...]}, ...}, ...}
fragment = 'components/schemas/PipelineElement'

    def resolve_fragment(self, document, fragment):
        """
        Resolve a ``fragment`` within the referenced ``document``.

        Arguments:

            document:

                The referent document

            fragment (str):

                a URI fragment to resolve within it
        """
        fragment = fragment.lstrip("/")

        if not fragment:
            return document

        if document is self.referrer:
            find = self._find_in_referrer
        else:

            def find(key):
                yield from _search_schema(document, _match_keyword(key))

        for keyword in ["$anchor", "$dynamicAnchor"]:
            for subschema in find(keyword):
                if fragment == subschema[keyword]:
                    return subschema
        for keyword in ["id", "$id"]:
            for subschema in find(keyword):
                    f"Unresolvable JSON pointer: {fragment!r}",
                )
E               jsonschema.exceptions._RefResolutionError: Unresolvable JSON pointer: 'components/schemas/PipelineElement'

What's strange is that it seems to be searching for the fragment in the Draft-7 metaschema document instead of my local OpenAPI spec document, and I'm not even using Draft-7.
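
One way a fragment can end up being looked up in a different document: a `$ref` is resolved against a base URI, and the result is then split into a document URL and a fragment. The sketch below uses only `urllib.parse`. Whether the metaschema's `$id` was actually the base URI in the case above is an assumption, not something confirmed in this thread, and `split_ref` is a hypothetical helper.

```python
from urllib.parse import urldefrag, urljoin


def split_ref(base_uri, ref):
    """Resolve ref against base_uri and split it into (document URL, fragment)."""
    url, fragment = urldefrag(urljoin(base_uri, ref))
    return url, fragment


if __name__ == "__main__":
    ref = "#/components/schemas/PipelineElement"
    # Assumed base URIs, for illustration only.
    for base in ("", "http://json-schema.org/draft-07/schema#"):
        print(repr(base), "->", split_ref(base, ref))
```

If the base URI is the metaschema's identifier, the same `$ref` points into the metaschema rather than the local document, which would be consistent with the `document` value shown in the pytest output.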