java-json-tools / json-schema-validator

A JSON Schema validation implementation in pure Java, which aims for correctness and performance, in that order
http://json-schema-validator.herokuapp.com/

How to run hyper-schema JSON validation against a Base64 image properly #122

Open lehmannkai opened 9 years ago

lehmannkai commented 9 years ago

I am trying to run a JSON hyper-schema check against a Base64 image definition using json-schema-validator. I am using the Java version of json-schema-validator, version 2.2.5.

My schema is:

{
    "$schema": "http://json-schema.org/draft-04/hyper-schema#",
    "title": "User object",
    "description": "A user representation",
    "type": "object",
    "properties": {
        "email": {
            "description": "The user's email address",
            "format": "email",
            "maxLength": 255
        },
        "picture": {
            "description": "The user's picture",
            "type": "string",
            "media": {
                "binaryEncoding": "base64",
                "type": "image/png"
            }
        }
    }
}

My json object is:

{"email":"k@w.de",
"picture":"ABCDE"}

The validate() method returns success. However, "ABCDE" is not a valid Base64 string. There should be an error entry in the validation report.

Also, would

"maxLength": 1024

be an appropriate way to limit the Base64 size (and thus the image size), or is there a better way of defining this?
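
On the maxLength question, it may help to keep in mind that maxLength counts characters of the encoded string, not bytes of the image: Base64 turns every 3 raw bytes into 4 characters. The following is a minimal sketch of that arithmetic using only plain Java; the class and method names (Base64Size, encodedLength, maxRawBytes) are made up for illustration and are not part of json-schema-validator.

```java
final class Base64Size {
    /** Characters needed to encode rawBytes bytes as padded Base64. */
    static long encodedLength(long rawBytes) {
        // Every started group of 3 bytes becomes 4 characters.
        return 4 * ((rawBytes + 2) / 3);
    }

    /** Largest number of raw bytes that can fit into maxChars Base64 characters. */
    static long maxRawBytes(long maxChars) {
        long fullGroups = maxChars / 4;
        long rest = maxChars % 4;
        // An unpadded tail of 2 chars carries 1 byte, 3 chars carry 2 bytes;
        // a single leftover character cannot carry a whole byte.
        long tailBytes = rest >= 2 ? rest - 1 : 0;
        return fullGroups * 3 + tailBytes;
    }

    public static void main(String[] args) {
        System.out.println(maxRawBytes(1024));          // 768
        System.out.println(encodedLength(1024 * 1024)); // 1398104
    }
}
```

So "maxLength": 1024 would cap the image at 768 bytes, and allowing a 1 MiB image would need a maxLength of about 1.4 million characters.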

fge commented 9 years ago

OK, I see the problem.

The media keyword is defined by JSON hyper schema, not JSON schema proper; as such, it is not validated by the implementation.

Not sure what the intent of the spec is here, since this scenario is not covered by either the validation spec or the hyper-schema spec... @geraintluff, any thoughts?

As a workaround, since you are using my implementation, you can define a new keyword which mimics hyper-schema's "media"; but since your use case is not defined by the specs, it will be a one-off solution...
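
For illustration, here is a rough sketch of the check such a custom keyword could delegate to for a "media" value of base64 / image/png. It deliberately uses only the JDK (java.util.Base64) and does not show the library's keyword registration API; the class name MediaCheck and the method isBase64Png are hypothetical, and comparing only the 8-byte PNG signature is an assumption about what "valid image" should mean here.

```java
import java.util.Arrays;
import java.util.Base64;

final class MediaCheck {
    // The fixed 8-byte signature that every PNG file starts with.
    private static final byte[] PNG_SIGNATURE = {
        (byte) 0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'
    };

    /** Returns true if value decodes as Base64 and the result starts with the PNG signature. */
    static boolean isBase64Png(String value) {
        byte[] decoded;
        try {
            decoded = Base64.getDecoder().decode(value);
        } catch (IllegalArgumentException e) {
            return false; // not decodable as Base64
        }
        return decoded.length >= PNG_SIGNATURE.length
            && Arrays.equals(Arrays.copyOf(decoded, PNG_SIGNATURE.length), PNG_SIGNATURE);
    }

    public static void main(String[] args) {
        System.out.println(isBase64Png("ABCDE"));        // false
        System.out.println(isBase64Png("iVBORw0KGgo=")); // true: just the PNG signature
    }
}
```

A real keyword would report the failure into the validation report rather than return a boolean, and might want to parse the image fully instead of checking the signature only.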

geraintluff commented 9 years ago

There's nothing stopping you from validating using the media keyword, but it's not mandated as it's not part of the validation spec. A validator that takes heed of hyper-schema keywords like this is sometimes called a "hyper-validator".

Any validation using hyper-schema keywords should be done in a second pass, after schema-assignment has already taken place, otherwise you might mess with oneOf/not clauses etc. - this is the same stage at which you would do hyperlink inference if you were interested in that. (@fge, does your library support schema assignment?)

(also, off-topic: "ABCDE" might well pass anyway, as the =-padding is often considered optional)
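
As a quick way to try the padding point out, here is a small sketch using the JDK's java.util.Base64 decoder (other decoders may be stricter or more lenient; the class name PaddingCheck is made up). With this decoder, unpadded input is accepted, but a five-character string such as "ABCDE" leaves a single dangling character that cannot encode a whole byte, so it is rejected with or without padding rules.

```java
import java.util.Base64;

final class PaddingCheck {
    /** Returns true if the JDK's basic Base64 decoder accepts the string. */
    static boolean decodes(String s) {
        try {
            Base64.getDecoder().decode(s);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(decodes("QUI="));  // true: padded form of the bytes "AB"
        System.out.println(decodes("QUI"));   // true: same data, padding omitted
        System.out.println(decodes("ABCDE")); // false: one leftover character
    }
}
```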