pboettch / json-schema-validator

JSON schema validator for JSON for Modern C++
Other
466 stars 134 forks source link

Binary support #114

Closed andrejlevkovitch closed 3 years ago

pboettch commented 3 years ago

Thanks for the PR.

I checked it out and did some modifications - especially I created a content-checker-callback.

Regarding the binary-type extension, we must be very careful. I'd like to stick to json-schema and there is no "type": "binary" in draft7. This cannot be added to the validator.

But, nlohmann::json supports this binary-type, so why not used in some way.

Here's my proposition (close to your idea I think).


We cannot use explicitly in a schema: {"type": "binary"} or "type": ["binary", "number"] we have to be implicit.

For a schema, (if no type is given OR type is string OR string is in type-list) AND "contentEncoding" is set to "binary", an instance of type json::value_t::binary is accepted.

The contentEncoding-callback has to be provided and is called accordingly.


Without starting to code (I already changed things partly) - What do you think of this formulation?

andrejlevkovitch commented 3 years ago

I agree with you. This will be great!

pboettch commented 3 years ago

I tried to implement something sticking to my proposition... it's a mess.

It would be so much simplier to just add a "type": "binary"-specific kludge to this library. And not use contentEncoding at all.

It could be justified by the nlohmann-json-type binary that this library supports that as it is completely based on nlohmann's JSON.

andrejlevkovitch commented 3 years ago

@pboettch add binary keyword is simpler, but json-schema (nowadays) don't support this keyword (and, I think, it will never be add).

pboettch commented 3 years ago

At the same time, a schema using type=binary or contentEncoding=binary would not work with any other validator.

Even an JSON-instance with binary would not even exist with other json-libraries.

So what you would want to do is totally limited to this validator and Nlohmann's JSON.

Why don't we simply validate binary-jsons (json::value_t::binary) as string? And you add a contentEncoding=binary for more investigation optionally?

andrejlevkovitch commented 3 years ago

At the same time, a schema using type=binary or contentEncoding=binary would not work with any other validator.

I don't think so. If you use type=binary then, of cause, it makes impassible to use any other validator. But, contentEncoding is acceptable field - if validator don't has implementation for this field it must just ignore it. Of course, you can't validate binary json (like bson for example) by this validator, but! you can transform binary data to string by base64 algorithm or just get light json from binary-json where all binary data will be changed to empty string (or some default value).

Why don't we simply validate binary-jsons (json::value_t::binary) as string?

Because it can has unexpected results, for example if you add pattern or format option. Of course, if add several checks about contentEncoding and binary type we can avoid this problem.

pboettch commented 3 years ago

Why should the validator then simply not ignore binary-values?

JSON-schema is permissive: if you do not specify a behavior it is accepted by the validator.

andrejlevkovitch commented 3 years ago

@pboettch sorry I didn't understand about binary-values ...

pboettch commented 3 years ago

binary json instances. The validator could simply ignore a binary-value.

What is your use-case where a binary value needs validation?

andrejlevkovitch commented 3 years ago

binary json instances. The validator could simply ignore a binary-value.

You mean "type": "binary"? Yes, it can ignore, but! if uses many json schemas then for checking uses more strong meta-schema. For example: if you type: "addtionalProperties": false - then it's can be too complicated to find this mistype without strong meta-schema. So, if add new value for type it means that meta-schema also must be fixed

What is your use-case where a binary value needs validation?

I use the validator for validate some bson objects that contains images. So I also add in validator checking contentMediaType for binary objects

pboettch commented 3 years ago

Could you please check my branch https://github.com/pboettch/json-schema-validator/tree/andrejlevkovitch-binary_support to see whether this implementation is OK for you?

andrejlevkovitch commented 3 years ago

I think here is a several problems (IMHO):

  1. If you set contentEncoding and/or contentMediaType and don't set checker, then you get error at validation time, but, I think, it must produce error while parse schema (the sooner the better).

  2. contentEncoding with binary value broke array of types - I think this isn't right behaviour

  3. when contentEncoding is binary and instance is a string or binary then we get 2 errors: about unexpected type and from checker. Otherwise we get one error (about unexpected type). It confused.

andrejlevkovitch commented 3 years ago

I create another pull request #116 with changes that can fix described problems

pboettch commented 3 years ago

You could have push-forced on this pull-request (via your branch). Please close it here otherwise.

andrejlevkovitch commented 3 years ago

@pboettch ok update this pullrequest

pboettch commented 3 years ago

I had to select your commits manually because you merged them with an old branch. Could you please check the master branch if all of your changes have arrived there?

Thanks for your contribution.

andrejlevkovitch commented 3 years ago

I don't see any problems - all work as expected