jhthorsen / json-validator

:cop: Validate data against a JSON schema
https://metacpan.org/release/JSON-Validator
56 stars 58 forks source link

Use YAML::PP #205

Closed perlpunk closed 3 years ago

perlpunk commented 4 years ago

Note: This is a draft.

Summary

Use YAML::PP instead of YAML::XS. Make it a hard requirement (since it's pureperl).

Motivation

YAML::PP supports YAML 1.2, while YAML::XS only supports a subset/mixture of YAML 1.1/1.2. Also it's pureperl.

Because YAML::PP is much slower than YAML::XS, it might be useful to support both via an environment variable or option.

Also YAML::PP::LibYAML could be supported (a bit faster than YAML::PP, slower than YAML::XS, but supports YAML 1.2 (regarding loading integers/booleans etc.))

For reference about YAML 1.2 I made this table: https://perlpunk.github.io/YAML-PP-p5/schemas.html Default is YAML 1.2 Core Schema.

karenetheridge commented 4 years ago

I thought there was a PR that made YAML::XS a hard requirement (to use it in data_checksum, which will be much faster than Data::Dumper), but now I can't find it... hmmm...

perlpunk commented 4 years ago

Hm, a hard requirement on an XS module has the disadvantage that you can't use it everywhere, and for example can't create a standalone version with Fatpacker.

Anyway, it would still be nice to be able to use YAML::PP and YAML 1.2 here.

karenetheridge commented 4 years ago

(Ah, it was #202.)

Anyway, it would still be nice to be able to use YAML::PP and YAML 1.2 here.

Agreed! đź‘Ť on your work here.

perlpunk commented 4 years ago

I should note though, that YAML::XS will probably be able to read most JSON Schema YAML files, even in YAML 1.2. I'm not sure what the spec says, but the OpenAPI spec says, that files should be written according to the YAML 1.2 JSON Schema, that means no TRUE or True, always true (and similar rules for other types). I could imagine that JSON Schemas are usually written according to these rules too.

Since YAML::XS mostly implements that subset, it should be fine in most cases. (It only misses hexadecimal and octal values and .inf/.nan, but I would guess that these are rare in JSON Schema files). It currently dies when it sees %YAML 1.2 at the top, but that will be fixed soon, I hope. See this table for a pretty complete list of special values: https://perlpunk.github.io/YAML-PP-p5/schema-examples.html

jhthorsen commented 4 years ago

It is very nice to see that YAML::PP also have proper boolean support! I haven’t had any complaints about requiring YAML::XS so far, so I’m leaning toward keeping (adding) that as a hard dependency.

perlpunk commented 4 years ago

It is very nice to see that YAML::PP also have proper boolean support!

YAML::PP had this boolean support from the beginning, and I implemented it in YAML::XS after YAML::PP ;-)

jhthorsen commented 4 years ago

I'm not quite sure why this PR is work in progress. I don't mind adding this feature but please add support for YAML::XS in the same (or similar) way as I do with Sereal in b68e7cdae06123477234c5a91867254d34b7610b.

perlpunk commented 4 years ago

It's work in progress because I simply replaced YAML::XS with YAML::PP to show how it would look like. I think your commit b68e7cd didn't exist yet :)

But looking at it - if you want the same logic here, I'm guessing you would want to always use YAML::XS if it's available, and only YAML::PP if it's not. I think it could be useful to be able to choose what YAML Loader to use.

jhthorsen commented 4 years ago

I'm still not sure if YAML::PP has any advantages over YAML::XS. I do understand that it might be easier to install, but until someone complains I think YAML::XS will be what most people want since it's faster.

I'll review this branch again after "Draft" is removed.

perlpunk commented 4 years ago

The idea was to create a Proof of Concept how YAML::PP support would look like, not to replace YAML::XS. Ideally the user would be able to choose what module to use, and personally I don't care what the default is.

But I don't know in which way you would want to make that configurable (if at all). Via an environment variable? Or some constructor argument?

One advantage would be to be able to fatpack a script using JSON::Validator. I have a use case for that, and currently I can't use the validation functionality of that script because of YAML::XS.

Another advantage is that YAML::PP implements the YAML 1.2 Core schema correctly, and YAML::XS doesn't. The cases where a YAML file would be loaded differently are probably rare in the OpenAPI and JSON Schema context, and I listed the differences here: https://perlpunk.github.io/YAML-PP-p5/schema-examples.html Just saying that YAML::PP is more correct.

jhthorsen commented 3 years ago

Thanks for all the pointers. This should be fixed in master now.