WordPress / blueprints-library

JSON Validation Library #75

Closed adamziel closed 8 months ago

adamziel commented 8 months ago

Blueprints need to be validated against a JSON schema. Let's choose a PHP 7.0–compliant JSON Schema validation library.

Existing JSON Schema validation libraries for PHP

(skip to the bottom for the tests I ran on these libraries)

We can't use any of these libraries as is, and none of them appears to be a major, well-tested library.

Here are the three things we can do:

Test results for each library

The next four comments discuss the limitations of each library when tested with the following data and schema. To make the code snippets work, you'll need to create these files in your filesystem:

data.json

{
    "name": "John Doe",
    "age": 31,
    "email": "john@example.com",
    "website": null,
    "location": {
        "country": "US",
        "address": "Sesame Street, no. 5"
    },
    "available_for_hire": true,
    "interests": ["php", "html", "css", "javascript", "programming", "web design"],
    "skills": [
        {
            "name": "HTML",
            "value": 100
        },
        {
            "name": "PHP",
            "value": 55
        },
        {
            "name": "CSS",
            "value": 99.5
        },
        {
            "name": "JavaScript",
            "value": 75
        }
    ]
}

schema.json

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "$id": "http://api.example.com/profile.json#",
  "type": "object",
  "properties": {
    "name": {
      "type": "string",
      "minLength": 1,
      "maxLength": 64,
      "pattern": "^[a-zA-Z0-9\\-]+(\\s[a-zA-Z0-9\\-]+)*$"
    },
    "age": {
      "type": "integer",
      "minimum": 18,
      "maximum": 100
    },
    "email": {
      "type": "string",
      "maxLength": 128,
      "format": "email"
    },
    "website": {
      "type": [
        "string",
        "null"
      ],
      "maxLength": 128,
      "format": "hostname"
    },
    "location": {
      "type": "object",
      "properties": {
        "country": {
          "enum": [
            "US",
            "CA",
            "GB"
          ]
        },
        "address": {
          "type": "string",
          "maxLength": 128
        }
      },
      "required": [
        "country",
        "address"
      ],
      "additionalProperties": false
    },
    "available_for_hire": {
      "type": "boolean"
    },
    "interests": {
      "type": "array",
      "minItems": 3,
      "maxItems": 100,
      "uniqueItems": true,
      "items": {
        "type": "string",
        "maxLength": 64
      }
    },
    "skills": {
      "type": "array",
      "maxItems": 100,
      "uniqueItems": true,
      "items": {
        "type": "object",
        "properties": {
          "name": {
            "type": "string",
            "minLength": 1,
            "maxLength": 64
          },
          "value": {
            "type": "number",
            "minimum": 0,
            "maximum": 100,
            "multipleOf": 0.25
          }
        },
        "required": [
          "name",
          "value"
        ],
        "additionalProperties": false
      }
    }
  },
  "required": [
    "name",
    "age",
    "email",
    "location",
    "available_for_hire",
    "interests",
    "skills"
  ],
  "additionalProperties": false
}

adamziel commented 8 months ago

Opis v1 test

<?php

require __DIR__ . '/../vendor/autoload.php';

use Opis\JsonSchema\{
    Validator, ValidationResult, ValidationError, Schema
};

$data = json_decode( file_get_contents( 'data.json' ) );
$schema = Schema::fromJsonString( file_get_contents( 'schema.json' ) );

$validator = new Validator();

/** @var ValidationResult $result */
$result = $validator->schemaValidation( $data, $schema );

if ( $result->isValid() ) {
    echo '$data is valid', PHP_EOL;
} else {
    /** @var ValidationError $error */
    $error = $result->getFirstError();
    echo '$data is invalid', PHP_EOL;
    echo "Error: ", $error->keyword(), PHP_EOL;
    echo json_encode( $error->keywordArgs(), JSON_PRETTY_PRINT ), PHP_EOL;
}

Output on PHP 8:

Deprecated: filter_var(): Passing null to parameter #3 ($options) of type array|int is deprecated in vendor/opis/json-schema/src/Formats/AbstractFormat.php on line 43
$data is valid

adamziel commented 8 months ago

Opis v2 test

<?php

require __DIR__ . '/../vendor/opis/json-schema/autoload.php';

use Opis\JsonSchema\Validator;
use Opis\JsonSchema\ValidationResult;
use Opis\JsonSchema\Errors\ErrorFormatter;

$data = file_get_contents( 'data.json' );

// Create a new validator
$validator = new Validator();

// Register our schema
$validator->resolver()->registerFile(
    'http://api.example.com/profile.json',
    __DIR__ . '/schema.json'
);

// Decode $data
$data = json_decode( $data );

/** @var ValidationResult $result */
$result = $validator->validate( $data, 'http://api.example.com/profile.json' );

if ( $result->isValid() ) {
    echo "Valid", PHP_EOL;
} else {
    // Print errors
    print_r( ( new ErrorFormatter() )->format( $result->error() ) );
}

Output on PHP 7.0:

Parse error: syntax error, unexpected 'SchemaLoader' (T_STRING), expecting variable (T_VARIABLE) in vendor/opis/json-schema/src/Validator.php on line 27
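
This parse error is consistent with typed class properties, which PHP only accepts from 7.4 onward. Assuming that is what the failing line contains, a PHP 7.0-compatible equivalent moves the property type into a docblock. The sketch below uses hypothetical class names (ExampleLoader, ExampleValidator) and is not copied from Opis:

```php
<?php
// Hypothetical classes for illustration only; these are not Opis classes.
class ExampleLoader {}

class ExampleValidator {
    // PHP 7.4+ syntax that PHP 7.0 cannot parse:
    //     protected ExampleLoader $loader;
    // PHP 7.0-compatible form: untyped property, type documented in a docblock.
    /** @var ExampleLoader */
    protected $loader;

    public function __construct( ExampleLoader $loader ) {
        $this->loader = $loader;
    }

    public function loader(): ExampleLoader {
        return $this->loader;
    }
}

$validator = new ExampleValidator( new ExampleLoader() );
```

Parameter and return type declarations are still available on PHP 7.0, so only the property declaration has to change in this sketch.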

adamziel commented 8 months ago

justinrainbow/json-schema

<?php

require __DIR__ . '/../vendor/autoload.php';

$data = json_decode( file_get_contents( __DIR__ . '/data.json' ) );

// Validate
$validator = new JsonSchema\Validator;
$validator->validate( $data, (object) [ '$ref' => 'file://' . realpath( __DIR__ . '/schema.json' ) ] );

if ( $validator->isValid() ) {
    echo "The supplied JSON validates against the schema.\n";
} else {
    echo "JSON does not validate. Violations:\n";
    foreach ( $validator->getErrors() as $error ) {
        printf( "[%s] %s\n", $error['property'], $error['message'] );
    }
}

Output on PHP 7.0 and 8.3:

The supplied JSON validates against the schema.

However, when I tried it with a different schema that uses discriminators, I got this error:

JSON does not validate. Violations:
[] Failed to match exactly one schema

The data was:

{
  "type": "Car",
  "brand": "Toyota",
  "model": "Corolla"
}

And the schema was:

{
  "type": "object",
  "required": [
    "type"
  ],
  "discriminator": {
    "propertyName": "type"
  },
  "oneOf": [
    {
      "additionalProperties": false,
      "required": [
        "brand",
        "model"
      ],
      "properties": {
        "type": {
          "const": "Car"
        },
        "brand": {
          "type": "string"
        },
        "model": {
          "type": "string"
        }
      }
    },
    {
      "additionalProperties": false,
      "required": [
        "brand",
        "gear"
      ],
      "properties": {
        "type": {
          "const": "Bike"
        },
        "brand": {
          "type": "string"
        },
        "gear": {
          "type": "integer",
          "minimum": 1
        }
      }
    }
  ]
}
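
For reference, here is a hand-rolled check of how many oneOf branches the data above should match. It is a minimal sketch, not a validator: it only looks at required, additionalProperties: false, and the const on type, and it skips the per-property type checks. The discriminator keyword comes from OpenAPI rather than JSON Schema itself, so the sketch ignores it. The helper name branch_matches is made up for this example.

```php
<?php
$data = json_decode( '{"type":"Car","brand":"Toyota","model":"Corolla"}', true );

// The two oneOf branches, reduced to the keywords this sketch checks.
$branches = [
    [ 'const' => 'Car',  'required' => [ 'brand', 'model' ], 'properties' => [ 'type', 'brand', 'model' ] ],
    [ 'const' => 'Bike', 'required' => [ 'brand', 'gear' ],  'properties' => [ 'type', 'brand', 'gear' ] ],
];

// Hypothetical helper: true when $data satisfies the reduced branch.
function branch_matches( array $data, array $branch ): bool {
    // "required": every listed key must be present.
    $has_required = count( array_diff( $branch['required'], array_keys( $data ) ) ) === 0;
    // "additionalProperties": false, so no keys outside "properties".
    $no_extras = count( array_diff( array_keys( $data ), $branch['properties'] ) ) === 0;
    // "const" on "type" only applies when the key is present.
    $type_matches = ! array_key_exists( 'type', $data ) || $data['type'] === $branch['const'];
    return $has_required && $no_extras && $type_matches;
}

$matching = array_values( array_filter( $branches, function ( $branch ) use ( $data ) {
    return branch_matches( $data, $branch );
} ) );

echo count( $matching ), ' matching branch(es)', PHP_EOL;
```

Under these reduced rules only the "Car" branch matches, because the "Bike" branch requires gear. That is why the data would be expected to satisfy oneOf.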

adamziel commented 8 months ago

Swaggest

<?php

require __DIR__ . '/../vendor/autoload.php';

use Swaggest\JsonSchema\Schema;

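// Discriminator data from the previous comment.
$data = json_decode( '{
  "type": "Car",
  "brand": "Toyota",
  "model": "Corolla"
}' );
// To test the original data instead, load data.json here and schema.json below:
// $data = json_decode( file_get_contents( __DIR__ . '/data.json' ) );
$schema = json_decode( file_get_contents( __DIR__ . '/simple_discriminator_schema.json' ) );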
$schema = Schema::import( $schema );

$schema->in( $data ); // Throws if $data doesn't match $schema

It works for the original JSON data and schema posted in the issue, but it fails for the discriminator data/schema from the last comment:

Fatal error: Uncaught Swaggest\JsonSchema\Exception\LogicException: More than 1 valid result for oneOf: 2/2 valid results for oneOf

reimic commented 8 months ago

Is the validator more robust than the mapper? Are you certain reimplementing is not the way?

adamziel commented 8 months ago

@reimic they are two different tools:

adamziel commented 8 months ago

Using Opis v1.0 with PHP 8 compatibility patches

Opis v1 emits the following deprecation notice when run on PHP 8.0:

Deprecated: filter_var(): Passing null to parameter #3 ($options) of type array|int is deprecated in vendor/opis/json-schema/src/Formats/AbstractFormat.php on line 43
$data is valid

Adding the following class to autoload -> files in composer.json fixed the problem:

{
  "autoload": {
     "files": ["src/Opis/JsonSchema/Formats/AbstractFormat.php"]
  }
}
<?php
namespace Opis\JsonSchema\Formats;

use Opis\JsonSchema\IFormat;

abstract class AbstractFormat implements IFormat {

    // ...

-   protected function validateFilter( $data, int $filter, $options = null ): bool {
+   protected function validateFilter( $data, int $filter, $options = 0 ): bool {
        return filter_var( $data, $filter, $options ) !== false;
    }

}
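
The patched default can be tried in isolation. Below is a minimal sketch that wraps the same filter_var() call in a free function (validate_filter is a stand-in name, not part of Opis). Since 0 is already filter_var()'s own default for $options, swapping null for 0 should not change validation results:

```php
<?php
// Stand-in for AbstractFormat::validateFilter(); the function name is hypothetical.
function validate_filter( $data, int $filter, $options = 0 ): bool {
    // filter_var() returns false when validation fails.
    return filter_var( $data, $filter, $options ) !== false;
}

$valid_email   = validate_filter( 'john@example.com', FILTER_VALIDATE_EMAIL );
$invalid_email = validate_filter( 'not an email', FILTER_VALIDATE_EMAIL );

var_dump( $valid_email, $invalid_email );
```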

Opis v1 correctly handles the discriminator schema example, but the error structure it produces requires more massaging to be useful than the one produced by Opis v2:

// Opis v1:
$error = $result->getFirstError();
echo "Error: ", $error->keyword(), PHP_EOL;
var_dump( $error->subErrors()[1]->errorArgs() );
echo json_encode( $error->keywordArgs() ), PHP_EOL;

// Error: oneOf
// array(1) {
//  ["missing"]=>
//  string(4) "gear"
// }

// Opis v2:
print_r( ( new ErrorFormatter() )->format( $result->error() ) );
// [/] => Array
// (
//     [0] => The required properties (model) are missing
//     [1] => The required properties (gear) are missing
// )

Inferring which schema path an error relates to also doesn't seem trivial. Perhaps the data path would be good enough for developer-friendly error messages.

A deal breaker for Opis 1: incomplete support for default values and discriminators

When run with the following data and schema, Opis 1 evaluates the oneOf options one by one. Unfortunately, when it evaluates the "Car" vehicle type, it mutates the stdClass object it validates and adds a model key with the value "test". This modification is not discarded afterwards, so when Opis evaluates the "Bike" vehicle type, the validated object contains an extra model property.
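
The failure mode can be reproduced without Opis. The sketch below is a simplified model of it, not Opis code: the branch definitions, matches_branch(), and count_matches() are made up for illustration, and the "Bike" sample data is an assumption. Each branch applies its defaults in place and then enforces additionalProperties: false. Validating every branch against a clone keeps the default from leaking into the next branch.

```php
<?php
// Hypothetical stand-ins for the two oneOf branches; "model" defaults to "test" on Car.
$branches = [
    'Car'  => [ 'allowed' => [ 'type', 'brand', 'model' ], 'defaults' => [ 'model' => 'test' ] ],
    'Bike' => [ 'allowed' => [ 'type', 'brand', 'gear' ],  'defaults' => [] ],
];

// Applies defaults IN PLACE, then checks the type and rejects unknown properties.
function matches_branch( stdClass $data, string $type, array $branch ): bool {
    foreach ( $branch['defaults'] as $key => $value ) {
        if ( ! property_exists( $data, $key ) ) {
            $data->$key = $value; // Mutates the object the caller passed in.
        }
    }
    $extra = array_diff( array_keys( get_object_vars( $data ) ), $branch['allowed'] );
    return $data->type === $type && count( $extra ) === 0;
}

// Counts matching branches; with $isolate each branch sees its own copy of the data.
function count_matches( stdClass $data, array $branches, bool $isolate ): int {
    $matches = 0;
    foreach ( $branches as $type => $branch ) {
        $candidate = $isolate ? clone $data : $data;
        if ( matches_branch( $candidate, $type, $branch ) ) {
            $matches++;
        }
    }
    return $matches;
}

$bike = json_decode( '{"type":"Bike","brand":"Trek","gear":21}' );

$isolated_matches = count_matches( $bike, $branches, true );  // Each branch gets a clone.
$shared_matches   = count_matches( $bike, $branches, false ); // Branches share one object.

echo "isolated: $isolated_matches, shared: $shared_matches", PHP_EOL;
```

With isolation the "Bike" branch matches. Without it, the model default written by the "Car" branch trips additionalProperties on the "Bike" branch, so nothing matches.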

Unfortunately, that's a deal breaker.

We could fork Opis 1 and spend some time on fixing the issue, but I don't want to do that when Opis 2 handles the same scenario flawlessly.

Therefore, I'm going to transpile Opis 2 to PHP 7.0 using Rector and adopt it as the Blueprint validation library.

adamziel commented 8 months ago

Solved in https://github.com/WordPress/blueprints-library/pull/76