thephpleague / json-guard

Validation of json-schema.org compliant schemas.
http://json-guard.thephpleague.com/
MIT License

Internal references question #96

Closed Magomogo closed 7 years ago

Magomogo commented 7 years ago

Hello everybody,

I'm having trouble validating against a schema that has internal references:

{
    "id": "http://foo.bar/baz.json#",
    "definitions": {
        "entity": {
            "type": "object",
            "properties": {
                "name": {"type": "string"}
            }
        },
        "entities": {
            "type": "array",
            "items": {"$ref": "#/definitions/entity"}
        }
    }
}

When I dereference http://foo.bar/baz.json#/definitions/entities, the resulting schema has no http://foo.bar/baz.json#/definitions/entity definition. Is this a bug or a feature?

matt-allan commented 7 years ago

Hello,

When I do dereferencing of http://foo.bar/baz.json#/definitions/entities the resulting schema has no http://foo.bar/baz.json#/definitions/entity definition.

Do you mean the dereferenced schema at /definitions/entities/items has no /definitions/entity or do you mean it's dropped from where it was originally in the schema?

When the schema is dereferenced it should basically look like this:

{
    "id": "http://foo.bar/baz.json#",
    "definitions": {
        "entity": {
            "type": "object",
            "properties": {
                "name": {"type": "string"}
            }
        },
        "entities": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"}
                }
            }
        }
    }
}

But we can't fully resolve it because the schema might have circular references. Instead we replace it with a reference object, which is a proxy. You should see this:

>>> $schema = $deref->dereference(json_decode(file_get_contents(__DIR__ . '/schema.json')))
=> {#213
     +"id": "http://foo.bar/baz.json#",
     +"definitions": {#219
       +"entity": {#197
         +"type": "object",
         +"properties": {#218
           +"name": {#205
             +"type": "string",
           },
         },
       },
       +"entities": {#220
         +"type": "array",
         +"items": League\JsonGuard\Reference {#222},
       },
     },
   }

...which can resolve to the referenced schema:

>>> $schema->definitions->entities->items->resolve()
=> {#197
     +"type": "object",
     +"properties": {#218
       +"name": {#205
         +"type": "string",
       },
     },
   }
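
To illustrate why a proxy avoids the circular-reference problem, here is a minimal sketch in plain PHP. LazyReference is a hypothetical class written for this example, not the library's League\JsonGuard\Reference, and the self-referencing "node" schema is made up:

```php
<?php
// Hypothetical stand-in for a reference proxy; not the library's class.
final class LazyReference
{
    /** @var callable */
    private $resolver;

    public function __construct(callable $resolver)
    {
        $this->resolver = $resolver;
    }

    // Look up the target only when asked, so cycles are never expanded eagerly.
    public function resolve()
    {
        return call_user_func($this->resolver);
    }
}

// A made-up schema whose "next" property refers back to the schema itself.
$schema = json_decode('{
    "definitions": {
        "node": {
            "type": "object",
            "properties": {
                "next": {"$ref": "#/definitions/node"}
            }
        }
    }
}');

// Replace the circular "$ref" with a proxy that points back at the node schema.
$node = $schema->definitions->node;
$node->properties->next = new LazyReference(function () use ($node) {
    return $node;
});

// Resolving follows the cycle one step at a time instead of recursing forever.
$resolved = $schema->definitions->node->properties->next->resolve();
var_dump($resolved === $node); // bool(true)
```

Inlining the target instead would never terminate for a schema like this one, which is why the proxy is left in place until something asks for it.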

Hopefully that helps. Let me know if there are still issues.

Magomogo commented 7 years ago

The problem is that I'd like to validate data not against http://foo.bar/baz.json but against http://foo.bar/baz.json#/definitions/entities. Following your example I do:

$schema = $deref->dereference('file://' . __DIR__ . '/schema.json#/definitions/entities');

As a result I get only a part of the initial schema. As I understand it, this piece of code does this. That part contains a Reference object, but it points to the missing #/definitions/entity.

Validation gives me this exception:


exception 'League\JsonGuard\Pointer\InvalidPointerException' with message 'The pointer referenced a value that does not exist.  The value was: "definitions"' in /srv/www/rest/vendor/league/json-guard/src/Pointer/InvalidPointerException.php:31

matt-allan commented 7 years ago

Oh ok, I see what you mean now.

The dereferencer resolves internal references under the assumption that the referenced schema will exist in the current document. Since you resolve the pointer first and only that fragment is loaded, that isn't the case.

We need to resolve internal references just like external references to make this work. First we make the reference absolute by prepending the current uri and get file://some/path/schema.json#/definitions/entity (Dereferencer@makeReferenceAbsolute). Then we use the loader to load it.

If we do this for every internal reference the validator will get a lot slower since it will be doing unnecessary file system reads and/or network requests. We can probably check if the uri matches the uri of the current schema before loading it.
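
Roughly, the two steps could look like the sketch below. This is plain PHP with hypothetical helper names (makeAbsolute, isSameDocument); it is not the body of Dereferencer@makeReferenceAbsolute, and it only handles references that start with "#":

```php
<?php
// Hypothetical helper: turn a fragment-only reference into an absolute URI.
function makeAbsolute(string $ref, string $currentUri): string
{
    // Drop any fragment from the current URI, keeping only the document part.
    $base = explode('#', $currentUri, 2)[0];

    // A reference starting with "#" points into the current document.
    if ($ref !== '' && $ref[0] === '#') {
        return $base . $ref;
    }

    // Anything else is returned unchanged in this sketch (no relative-path resolution).
    return $ref;
}

// Hypothetical helper: true when both URIs name the same document, ignoring fragments.
function isSameDocument(string $a, string $b): bool
{
    return explode('#', $a, 2)[0] === explode('#', $b, 2)[0];
}

$current  = 'file://some/path/schema.json#/definitions/entities';
$absolute = makeAbsolute('#/definitions/entity', $current);

echo $absolute, PHP_EOL; // file://some/path/schema.json#/definitions/entity
var_dump(isSameDocument($absolute, $current)); // bool(true)
```

When isSameDocument returns true, the already loaded document can be reused instead of going through the loader again.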

Anyway, in the meantime you can dereference the schema before resolving the pointer:

$schema = $deref->dereference('file://' . __DIR__ . '/schema.json');
$schema = (new League\JsonGuard\Pointer($schema))->get('/definitions/entities');

Magomogo commented 7 years ago

Thanks for the workaround, I'll try it.

RoboSparrow commented 7 years ago

+1

just polishing @yuloh's great workaround

$schema = (new League\JsonGuard\Dereferencer())->dereference('file://' . __DIR__ . '/schema.json');
$schema = (new League\JsonGuard\Pointer($schema))->get('/definitions/entities');

itsjavi commented 7 years ago

Thanks for the workaround, it also works for me. I hope the library gets smarter about this, and that it gains some kind of caching mechanism, if there isn't one already, so it performs better.

matt-allan commented 7 years ago

I just tagged 1.0.0-alpha which includes a fix for this. The reference attempts to use a local pointer but falls back to using the dereferencer.
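
The idea is roughly the following. This is a sketch in plain PHP, not the code from the release: getByPointer and resolveReference are hypothetical names, the loader is a stand-in closure, and the pointer walk ignores arrays and the ~0/~1 escapes:

```php
<?php
// Hypothetical pointer lookup: walk "/a/b" through nested objects, or throw.
function getByPointer($document, string $pointer)
{
    $current = $document;
    foreach (array_filter(explode('/', $pointer), 'strlen') as $segment) {
        if (!is_object($current) || !property_exists($current, $segment)) {
            throw new OutOfBoundsException("Missing segment: $segment");
        }
        $current = $current->$segment;
    }
    return $current;
}

// Try the local document first; fall back to a loader only when the target is absent.
function resolveReference($localDocument, string $pointer, callable $fallbackLoader)
{
    try {
        return getByPointer($localDocument, $pointer);
    } catch (OutOfBoundsException $e) {
        return $fallbackLoader($pointer);
    }
}

$full = json_decode('{"definitions": {"entity": {"type": "object"}, "entities": {"type": "array"}}}');

// What is left when the pointer is resolved before dereferencing.
$fragment = $full->definitions->entities;

// Stand-in for loading the whole document again; here it just reads from $full.
$loader = function (string $pointer) use ($full) {
    return getByPointer($full, $pointer);
};

$entity = resolveReference($fragment, '/definitions/entity', $loader);
echo $entity->type, PHP_EOL; // object
```

The local lookup fails on "definitions" for the fragment, which is the same segment the exception above complained about, and the fallback then finds the target in the full document.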

I'm going to close this issue but feel free to re-open it if you encounter this issue again.