jhthorsen / json-validator

:cop: Validate data against a JSON schema
https://metacpan.org/release/JSON-Validator
56 stars 58 forks source link

'id' keyword is used to reference the schema instead of '$id' #114

Closed merkys closed 6 years ago

merkys commented 6 years ago

Throughout the code 'id' keyword is used instead of '$id' keyword to reference schemata, for example:

https://github.com/jhthorsen/json-validator/blob/8fedf04dd62be6c7d5533454a6fb06fdf482fa25/lib/JSON/Validator.pm#L360

Draft 07 mandates the usage of '$id'.

jhthorsen commented 6 years ago

I think we need to accept both id and $id, instead of just changing everything to $id. But I think the order should be:

$id = $schema->{'$id'} // $schema->{id} // ''; 
merkys commented 6 years ago

I see your point with backwards compatibility. Indeed id was replaced by $id in Draft 06 (changelog), therefore I support your proposition to accept both of them for now.

IMO this issue is a bug. Schemata corresponding to Draft 07 are not able to be correctly validated against.

jhthorsen commented 6 years ago

Unfortunately, it's a bit more complicated than this: If the incoming $id is relative, then we need to convert that into an absolute URL, relative to the parent.

Also...

Here is something I started on:

diff --git a/lib/JSON/Validator.pm b/lib/JSON/Validator.pm
index 160d1f2..b8e1ebc 100644
--- a/lib/JSON/Validator.pm
+++ b/lib/JSON/Validator.pm
@@ -41,6 +41,15 @@ sub S { Mojo::Util::md5_sum(Data::Dumper->new([@_])->Sortkeys(1)->Useqq(1)->Dump
 has cache_paths => sub { [split(/:/, $ENV{JSON_VALIDATOR_CACHE_PATH} || ''), $BUNDLED_CACHE_DIR] };
 has formats => sub { shift->_build_formats };

+sub version {
+  my $self = shift;
+  return $self->{version} // 4 unless @_;
+  $self->{version} = shift;
+  $self->{id_key} = $self->{version} < 7 ? 'id' : '$id';
+  warn "[JSON::Validator] Using version $self->{version}.\n" if DEBUG;
+  return $self;
+}
+
 has ua => sub {
   require Mojo::UserAgent;
   my $ua = Mojo::UserAgent->new;
@@ -172,7 +181,9 @@ sub joi {
 sub load_and_validate_schema {
   my ($self, $spec, $args) = @_;
   $spec = $self->_resolve($spec);
-  my @errors = $self->new(%$self)->schema($args->{schema} || SPECIFICATION_URL)->validate($spec);
+  my $schema = $args->{schema} || SPECIFICATION_URL;
+  $self->version($1) if $schema =~ /draft-0+(\w+)/;
+  my @errors = $self->new(%$self)->schema($schema)->validate($spec);
   confess join "\n", "Invalid JSON specification $spec:", map {"- $_"} @errors if @errors;
   $self->{schema} = Mojo::JSON::Pointer->new($spec);
   $self;
@@ -207,13 +218,14 @@ sub validate_json {

 sub _build_formats {
   return {
-    'date-time' => \&_is_date_time,
-    'email'     => \&_is_email,
-    'hostname'  => VALIDATE_HOSTNAME ? \&Data::Validate::Domain::is_domain : \&_is_domain,
-    'ipv4'      => VALIDATE_IP ? \&Data::Validate::IP::is_ipv4 : \&_is_ipv4,
-    'ipv6'      => VALIDATE_IP ? \&Data::Validate::IP::is_ipv6 : \&_is_ipv6,
-    'regex'     => \&_is_regex,
-    'uri'       => \&_is_uri,
+    'date-time'     => \&_is_date_time,
+    'email'         => \&_is_email,
+    'hostname'      => VALIDATE_HOSTNAME ? \&Data::Validate::Domain::is_domain : \&_is_domain,
+    'ipv4'          => VALIDATE_IP ? \&Data::Validate::IP::is_ipv4 : \&_is_ipv4,
+    'ipv6'          => VALIDATE_IP ? \&Data::Validate::IP::is_ipv6 : \&_is_ipv6,
+    'regex'         => \&_is_regex,
+    'uri'           => \&_is_uri,
+    'uri-reference' => \&_is_uri_reference,
   };
 }

@@ -354,13 +366,14 @@ sub _report_schema {
 # resolve all the $ref's that we find inside JSON Schema specification.
 sub _resolve {
   my ($self, $schema) = @_;
+  my $id_key = $self->{id_key} //= 'id';
   my ($id, $resolved, @refs);

   local $self->{level} = $self->{level} || 0;
   delete $_[0]->{schemas}{''} unless $self->{level};

   if (ref $schema eq 'HASH') {
-    $id = $schema->{id} // '';
+    $id = $schema->{$id_key} // '';
     return $resolved if $resolved = $self->{schemas}{$id};
   }
   elsif ($resolved = $self->{schemas}{$schema // ''}) {
@@ -368,11 +381,11 @@ sub _resolve {
   }
   else {
     ($schema, $id) = $self->_load_schema($schema);
-    $id = $schema->{id} if $schema->{id};
+    $id = $schema->{$id_key} if $schema->{$id_key};
   }

   unless ($self->{level}) {
-    my $rid = $schema->{id} // $id;
+    my $rid = $schema->{$id_key} // $id;
     if ($rid) {
       confess "Root schema cannot have a fragment in the 'id'. ($rid)" if $rid =~ /\#./;
       confess "Root schema cannot have a relative 'id'. ($rid)"
@@ -397,8 +410,8 @@ sub _resolve {
     elsif (UNIVERSAL::isa($topic, 'HASH')) {
       push @refs, [$topic, $base] and next if $topic->{'$ref'} and !ref $topic->{'$ref'};

-      if ($topic->{id} and !ref $topic->{id}) {
-        my $fqn = Mojo::URL->new($topic->{id});
+      if ($topic->{$id_key} and !ref $topic->{$id_key}) {
+        my $fqn = Mojo::URL->new($topic->{$id_key});
         $fqn = $fqn->to_abs($base) unless $fqn->is_abs;
         $self->_register_schema($topic, $fqn->to_string);
       }
@@ -773,7 +786,7 @@ sub _validate_type_object {
     # Special case used internally when validating schemas: This module adds "id"
     # on the top level which might conflict with very strict schemas, so we have to
     # remove it again unless there's a rule.
-    local $rules{id} = 1 if !$path and exists $data->{id};
+    local $rules{$self->{id_key}} = 1 if !$path and exists $data->{$self->{id_key}};

     if (my @k = grep { !$rules{$_} } @dkeys) {
       local $" = ', ';
@@ -978,6 +991,19 @@ sub _is_uri {
   return 1;
 }

+sub _is_uri_reference {
+  warn "_is_uri_reference @_";
+  return unless defined $_[0];
+  return unless $_[0] =~ m!^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?!;
+
+  my ($scheme, $auth_host, $path, $query, $fragment) = map { $_ // '' } ($2, $4, $5, $7, $9);
+
+  return _invalid('Path cannot not start with //.') if $path =~ m!^//!;
+  return 1 if length $path;
+  return _is_uri($_[0]);
+  return 1;
+}
+
 sub _merge_errors {
   join ' ', map {
     my $e = $_;
jhthorsen commented 6 years ago

I've tried to implement my suggestion above in bdff5805b457c75264bf0ff835009592c370d373

Could you try it out and see if that works for you?

cpanm https://github.com/jhthorsen/json-validator/archive/master.tar.gz
jhthorsen commented 6 years ago

Please re-open if this is not fixed.

I'm still not sure if JSON::Validator fully supports draft-07 (or even -06?), but I don't have time at the moment to dive in. Please open a new issue if there's some missing pieces.

merkys commented 6 years ago

This feature seems to work fine as of f34c6ffca88c191cefdb745bfc377e1789c84f70, thank you for a fix! I will let you know should I encounter other missing features of draft-07.