python-jsonschema / jsonschema

An implementation of the JSON Schema specification for Python
https://python-jsonschema.readthedocs.io
MIT License

Unable to resolve a relative `$ref` from a schema with `$id` pointing to a subschema in `$defs` with `$id` #1195

Closed: rijenkii closed this issue 10 months ago

rijenkii commented 10 months ago

Schema in question:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "schema:/main",
  "$ref": "./child",
  "$defs": {
    "child": {
      "$id": "schema:/child",
      "type": "string"
    }
  }
}

Exception:

>>> jsonschema.validate("123", schema)
Traceback (most recent call last):
  File "/usr/lib/python3.12/site-packages/referencing/_core.py", line 336, in get_or_retrieve
    resource = registry._retrieve(uri)
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/jsonschema/validators.py", line 108, in _warn_for_remote_retrieve
    request = Request(uri, headers=headers)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/urllib/request.py", line 318, in __init__
    self.full_url = url
    ^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/urllib/request.py", line 344, in full_url
    self._parse()
  File "/usr/lib64/python3.12/urllib/request.py", line 373, in _parse
    raise ValueError("unknown url type: %r" % self.full_url)
ValueError: unknown url type: './child'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.12/site-packages/referencing/_core.py", line 586, in lookup
    retrieved = self._registry.get_or_retrieve(uri)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/referencing/_core.py", line 343, in get_or_retrieve
    raise exceptions.Unretrievable(ref=uri)
referencing.exceptions.Unretrievable: './child'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.12/site-packages/jsonschema/validators.py", line 446, in _validate_reference
    resolved = self._resolver.lookup(ref)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/referencing/_core.py", line 590, in lookup
    raise exceptions.Unresolvable(ref=ref)
referencing.exceptions.Unresolvable: ./child

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.12/site-packages/jsonschema/validators.py", line 1305, in validate
    error = exceptions.best_match(validator.iter_errors(instance))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/jsonschema/exceptions.py", line 444, in best_match
    best = next(errors, None)
           ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/jsonschema/validators.py", line 368, in iter_errors
    for error in errors:
  File "/usr/lib/python3.12/site-packages/jsonschema/_keywords.py", line 284, in ref
    yield from validator._validate_reference(ref=ref, instance=instance)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/jsonschema/validators.py", line 448, in _validate_reference
    raise exceptions._WrappedReferencingError(err)
jsonschema.exceptions._WrappedReferencingError: Unresolvable: ./child

Package versions:

>>> importlib.metadata.version("jsonschema")
'4.20.0'
>>> importlib.metadata.version("referencing")
'0.31.0'

Julian commented 10 months ago

That's expected behavior, at least for the moment, given that the URL implementation we use is the one in Python's standard library.

You're defining a custom scheme of your own (`schema:`), and there's not really a way for Python to know whether you intend that scheme to support relative paths or not.
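
A minimal sketch of the difference, using only the standard library and the URIs from this issue. It assumes the reference is joined with something equivalent to `urllib.parse.urljoin`, which is what the `./child` in the traceback suggests; the variable names are just for illustration.

```python
from urllib.parse import urljoin

# Custom scheme: the standard library has no relative-resolution rules for
# "schema", so urljoin hands the reference back unchanged.
custom = urljoin("schema:/main", "./child")
print(custom)  # ./child

# Well-known scheme: "https" is known to support relative references,
# so the reference is resolved against the base URI.
known = urljoin("https://schemas.example.org/main", "./child")
print(known)  # https://schemas.example.org/child
```

The unresolved `./child` in the first case matches the `unknown url type: './child'` error in the report.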

There are workarounds you can google which effectively involve touching global state in the urllib module, but really the "right" solution is to use either some scheme with well-defined semantics or else to use absolute references.
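
For illustration only, one such workaround might look like the sketch below. It assumes the global state in question is the `urllib.parse.uses_relative` list, which is not named in this thread, and the helper `join_with_registered_scheme` is hypothetical. It registers the scheme only for the duration of one join and then restores the list.

```python
import urllib.parse


def join_with_registered_scheme(base, ref, scheme):
    """Join ref onto base while scheme is temporarily treated as relative-capable.

    Hypothetical helper: it mutates urllib.parse.uses_relative, which is
    process-wide state, and undoes the change afterwards.
    """
    added = scheme not in urllib.parse.uses_relative
    if added:
        urllib.parse.uses_relative.append(scheme)
    try:
        return urllib.parse.urljoin(base, ref)
    finally:
        # Restore the list only if this call was the one that changed it.
        if added:
            urllib.parse.uses_relative.remove(scheme)


resolved = join_with_registered_scheme("schema:/main", "./child", "schema")
print(resolved)
```

To affect validation, the scheme would presumably have to stay registered while the validator runs, which changes URL joining for every other user of `urllib.parse` in the process. That side effect is why a well-defined scheme or absolute references are the safer choice.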

(But this behavior may change if/when referencing uses some other URL library as it may make different assumptions about custom schemes.)

rijenkii commented 10 months ago

Yeah, checked -- using http(s) works:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://schemas.example.org/main",
  "$ref": "./child",
  "$defs": {
    "child": {
      "$id": "https://schemas.example.org/child",
      "type": "string"
    }
  }
}

>>> print(jsonschema.validate("123", schema))
None

Feel free to close this issue, or move it to the `referencing` repo as a feature request.