romaricdrigon / MetaYaml

A powerful schema validator!
MIT License

MetaYaml

A [put your file type here] schema validator using [put another file type here] files.
At the moment, the file type can be Json, Yaml, or XML. It can also generate documentation about the schema, or an XSD file (experimental).

The name comes from the fact that it was initially made to implement a pseudo-schema for Yaml files.

  1. Installation
  2. Basic usage
  3. How to write a schema
  4. Documentation generator
  5. Notes on XML support
  6. XSD generator
  7. Test
  8. Extending
  9. Thanks

Installation

It is a standalone component. To install it via Composer, run composer require romaricdrigon/metayaml

Basic usage

You have to create a MetaYaml object, then pass it both the schema and your data as multidimensional PHP arrays:

use RomaricDrigon\MetaYaml\MetaYaml;

// create object, load schema from an array
// ($schema stays the raw array; the validator gets its own variable)
$metaYaml = new MetaYaml($schema);

/*
    you can optionally validate the schema
    it can take some time (up to a second for a few hundred lines)
    so do it only once, and maybe only in development!
*/
$metaYaml->validateSchema(); // returns true or throws an exception

// you could also have done this at init
$metaYaml = new MetaYaml($schema, true); // will load AND validate the schema

// finally, validate your data array according to the schema
$metaYaml->validate($data); // returns true or throws an exception

You can use any of the provided loaders to obtain these arrays (yep, you can validate XML using a schema from a Yaml file!).

Some loader examples:

use RomaricDrigon\MetaYaml\MetaYaml;
use RomaricDrigon\MetaYaml\Loader\JsonLoader;
use RomaricDrigon\MetaYaml\Loader\YamlLoader;
use RomaricDrigon\MetaYaml\Loader\XmlLoader;

// create one loader object
$loader = new JsonLoader(); // Json (will use php json_decode)
$loader = new YamlLoader(); // Yaml using Symfony Yaml component
$loader = new XmlLoader(); // Xml (using php SimpleXml)

// usage is then the same for every loader
$array = $loader->load('SOME STRING...');
// or you can load from a file
$array = $loader->loadFromFile('path/to/file');
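
To give an idea of the kind of array a loader hands back, here is a minimal sketch using plain json_decode, which the Json loader is said to rely on. The JSON string is made up for illustration, and it is an assumption that the loader decodes to associative arrays in exactly this way.

```php
<?php
// Sample JSON document (hypothetical content, for illustration only)
$json = '{"flowers": {"rose": "a rose", "violet": "a violet flower"}}';

// Second argument true: decode objects as associative arrays, not stdClass
$array = json_decode($json, true);

if ($array === null) {
    // json_decode returns null when the input is not valid JSON
    throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
}

echo $array['flowers']['rose'], "\n"; // prints "a rose"
```

The result is an ordinary nested PHP array, which is the shape MetaYaml expects for both the schema and the data.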

How to write a schema

Introduction

A schema file will define the array structure (which elements are allowed, where), some attributes (required, can be empty...) and the possible values for these elements (or their type).

Here's a simple example of a schema, using Yaml syntax:

root: # root is always required (note no prefix here)
    _type: array # each element must always have a '_type'
    _children: # array nodes have a '_children' node, defining their children
        flowers:
            _type: array
            _required: true # optional, default false
            _children:
                rose:
                    _required: true
                    _type: text
                violet:
                    _type: text
                # -> only rose and violet are allowed children of flowers

And a valid Yaml file:

flowers:
    rose: "a rose"
    violet: "a violet flower"
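
To make the two rules stated in the schema comments concrete (required children, and no children other than the declared ones), here is a hand-rolled sketch in plain PHP. It is not MetaYaml's validator: the function checkNode, its error messages and the $invalid sample are made up for illustration, and leaf types such as text are not checked at all.

```php
<?php
// Illustration only: a tiny check of '_required' and of undeclared children.
function checkNode(array $schemaNode, $data, $path = 'root')
{
    if ($schemaNode['_type'] !== 'array') {
        return array(); // leaf types are not checked in this sketch
    }
    if (!is_array($data)) {
        return array("$path should be an array");
    }

    $errors = array();

    // every declared child: report it if required and missing, else recurse
    foreach ($schemaNode['_children'] as $name => $child) {
        if (!array_key_exists($name, $data)) {
            if (!empty($child['_required'])) { // '_required' defaults to false
                $errors[] = "$path.$name is required";
            }
            continue;
        }
        $errors = array_merge($errors, checkNode($child, $data[$name], "$path.$name"));
    }

    // every key present in the data must be declared in the schema
    foreach (array_keys($data) as $name) {
        if (!isset($schemaNode['_children'][$name])) {
            $errors[] = "$path.$name is not allowed";
        }
    }

    return $errors;
}

// The schema and the valid file from above, written as PHP arrays
$schema = array('root' => array(
    '_type' => 'array',
    '_children' => array(
        'flowers' => array(
            '_type' => 'array',
            '_required' => true,
            '_children' => array(
                'rose' => array('_required' => true, '_type' => 'text'),
                'violet' => array('_type' => 'text'),
            ),
        ),
    ),
));
$valid = array('flowers' => array('rose' => 'a rose', 'violet' => 'a violet flower'));
$invalid = array('flowers' => array('violet' => 'a violet flower', 'tulip' => 'oops'));

print_r(checkNode($schema['root'], $valid));   // empty array: no errors
print_r(checkNode($schema['root'], $invalid)); // rose is missing, tulip is not declared
```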

We will continue with Yaml examples; if you're not familiar with the syntax, you may want to take a look at its Wikipedia page. Of course the same structure is possible with Json or XML, because the core is the same. Take a look at the examples in the test/data/ folder.

Schema structure

A schema file must have a root node, which will describe the first-level content. You can optionally define a prefix; by default it is _ (_type, _required...).

You have to define a partials node if you want to use this feature (learn more about it below).

A basic schema file:

root:
    # here put the elements that will be in the file
    # note that root can have any type: an array, a number, a prototype...
prefix: my_ # so it's gonna be 'my_type', 'my_required', 'my_children'...
partials:
    block:
        # here I define a partial called block

Schema nodes

Each node in the schema must have a _type attribute. Here I define a node called paragraph whose content is some text:

paragraph:
    _type: text

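The following types are available, each of them used in the comprehensive example below: text, number, boolean, enum (accepted values listed in _values), pattern (regular expression given in _pattern), array (children defined in _children), prototype (children's type given in _prototype), choice (alternatives listed in _choices) and partial (name of the partial given in _partial).

You can specify additional attributes, such as _required, _not_empty, _strict or _description.

Here's a comprehensive example: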

root:
    _type: array
    _children:
        SomeText:
            _type: text
            _not_empty: true # so !== ''
        SomeEnum:
            _type: enum
            _values:
                - windows
                - mac
                - linux
        SomeNumber:
            _type: number
            _strict: true
        SomeBool:
            _type: boolean
        SomePrototypeArray:
            _type: prototype
            _prototype:
                _type: array
                _children:
                    SomeOtherText:
                        _type: text
                        _required: true # can't be null
        SomeParagraph:
            _type: partial
            _partial: aBlock # cf 'partials' below
        SomeChoice:
            _type: choice
            _choices:
                1:
                    _type: enum
                    _values:
                        - windows
                        - linux
                2:
                    _type: number
                # so our node must be either #1 or #2
        SomeRegex:
            _type: pattern
            _pattern: /e/
partials:
    aBlock:
        _type: array
        _children:
            Line1:
                _type: text

More information

For more examples, look inside the test/data folder. In each folder, you have a .yml file and its schema. There's also an XML example.

If you're curious about advanced usage, you can check data/MetaSchema.json: schema files are validated using this schema (and yep, the schema successfully validates itself!)

Documentation generator

Each node can have a _description attribute, containing some human-readable text. You can retrieve the documentation about a node (its type, description, other attributes...) like this:

// it's recommended to validate the schema before reading documentation
$schema = new MetaYaml($schema, true);

// get documentation about root node
$schema->getDocumentationForNode();

// get documentation about a child node 'test' in an array 'a_test' under root
$schema->getDocumentationForNode(array('a_test', 'test'));

// finally, if you want to unfold (follow) all partials, set second argument to true
$schema->getDocumentationForNode(array('a_test', 'test'), true);
// watch out: make sure there is no loop inside partials!

It returns an associative array formatted like this:

array(
    'name' => 'test', // name of current node, root for first node
    'node' => array(
        '_type' => 'array',
        '_children' => ... // and so on
    ),
    'prefix' => '_'
)

If the targeted node is inside a choice, the result will differ slightly:

array(
    'name' => 'test', // name of current node, from the choice key in the schema
    'node' => array(
        '_is_choice' => 'true', // important: so we know next keys are choices
        0 => array(
            '_type' => 'array' // and so on, for first choice
        ),
        1 => array(
            '_type' => 'text' // and so on, for second choice
        ),
        // ...
    ),
    'prefix' => '_'
)

This behavior allows us to handle nested choices without losing data: you have an array level for each choice level, and you can check the _is_choice flag.
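
The nesting described above can be walked recursively. The following is a small sketch in plain PHP that works on an array shaped like the return value shown earlier. The helper listNodeTypes and the sample $doc (including its nested choice) are hypothetical, and the default _ prefix is assumed.

```php
<?php
// Collect every possible '_type' of a documented node, following choices.
function listNodeTypes(array $node)
{
    if (!isset($node['_is_choice'])) {
        return array($node['_type']); // plain node: a single type
    }

    $types = array();
    foreach ($node as $choice) {
        if (is_array($choice)) { // skips the '_is_choice' flag itself
            $types = array_merge($types, listNodeTypes($choice));
        }
    }

    return $types;
}

// Made-up documentation result: a choice whose second option is itself a choice
$doc = array(
    'name' => 'test',
    'node' => array(
        '_is_choice' => 'true',
        0 => array('_type' => 'array'),
        1 => array(
            '_is_choice' => 'true',
            0 => array('_type' => 'text'),
            1 => array('_type' => 'number'),
        ),
    ),
    'prefix' => '_',
);

print_r(listNodeTypes($doc['node'])); // array, text, number
```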

If you pass an invalid path (e.g. no node with the name you gave exists), it will throw an exception.

Notes on XML support

In XML, you can store a value in a node within a child element, or using an attribute. This is not possible in an array; the only way is to use a child.

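Thus, the XML loader enforces a few conventions, which the example below illustrates: elements and attributes are both stored as children; when a node has both attributes and text content, the text is stored under the _value key; when several child nodes share the same name, only the last one is kept, unless they carry a _key attribute, which is then used as the array key.

Let's take an example: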

<fleurs>
    <roses couleur="rose">
        <opera>une rose</opera>
        <sauvage>
            <des_bois>une autre rose</des_bois>
            <des_sous_bois sauvage="oui">encore</des_sous_bois>
        </sauvage>
    </roses>
    <tulipe>je vais disparaitre !</tulipe>
    <tulipe>deuxieme tulipe</tulipe>
    <fleur couleur="violette" sauvage="false" _key="violette">une violette</fleur>
</fleurs>

will give us this array:

array('fleurs' => array(
    'roses' => array(
        'couleur' => 'rose',
        'opera' => 'une rose',
        'sauvage' => array(
            'des_bois' => 'une autre rose',
            'des_sous_bois' => array(
                'sauvage' => 'oui',
                '_value' => 'encore'
            )
        )
    ),
    'tulipe' => 'deuxieme tulipe',
    'violette' => array(
        'couleur' => 'violette',
        'sauvage' => 'false',
        '_value' => 'une violette'
    )
))
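
The conventions visible in this example can be mimicked with PHP's SimpleXML. The following is a rough sketch, not the library's XmlLoader: xmlNodeToArray is a hypothetical helper, and namespaces and empty nodes are not given any special treatment.

```php
<?php
// Hypothetical helper: converts a SimpleXML node following the conventions above.
function xmlNodeToArray(SimpleXMLElement $node)
{
    $result = array();

    // attributes are stored like children; '_key' only renames the node
    foreach ($node->attributes() as $name => $value) {
        if ($name !== '_key') {
            $result[$name] = (string) $value;
        }
    }

    // child elements: a later sibling with the same key overwrites an earlier one
    foreach ($node->children() as $name => $child) {
        $key = isset($child['_key']) ? (string) $child['_key'] : $name;
        $result[$key] = xmlNodeToArray($child);
    }

    $text = trim((string) $node); // direct text content only
    if ($text === '') {
        return $result;
    }
    if (count($result) === 0) {
        return $text;             // plain text node: store the string itself
    }
    $result['_value'] = $text;    // text next to attributes goes under '_value'

    return $result;
}

$xml = <<<'XML'
<fleurs>
    <roses couleur="rose">
        <opera>une rose</opera>
        <sauvage>
            <des_bois>une autre rose</des_bois>
            <des_sous_bois sauvage="oui">encore</des_sous_bois>
        </sauvage>
    </roses>
    <tulipe>je vais disparaitre !</tulipe>
    <tulipe>deuxieme tulipe</tulipe>
    <fleur couleur="violette" sauvage="false" _key="violette">une violette</fleur>
</fleurs>
XML;

$root = new SimpleXMLElement($xml);
$array = array($root->getName() => xmlNodeToArray($root));

print_r($array);
```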

XSD generator

Please note this feature is still experimental!

MetaYaml can try to generate an XML Schema Definition from a MetaYaml schema. You may want to use this file to pre-validate XML input, or to use in another context (client-side...). The same conventions (cf. above) will be used.

Usage example:

use RomaricDrigon\MetaYaml\XsdGenerator;

// create an XsdGenerator object (requires PHP XMLWriter from libxml, enabled by default)
$generator = new XsdGenerator();

// $schema is the source schema, php array
// second parameter to soft-indent generated XML (default true)
$my_xsd_string = $generator->build($schema, true);

A few limitations, some of them related to XML Schema, apply.

Test

The project is fully tested using atoum. To launch the tests, run ./bin/test -d test in a shell.

Extending

You may want to write your own loader to support another file format.
Take a look at any class in the Loader/ folder; it's pretty easy: you have to implement the LoaderInterface, and you may want to extend the Loader class (so you don't have to write loadFromFile()).
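
As an illustration, a loader for INI content could look roughly like the sketch below. IniLoader is a hypothetical class, written standalone so that it runs without the library. In a real project it would implement LoaderInterface or extend the Loader class, in which case loadFromFile() would come from the parent; the exact signatures of the interface are assumed here from the usage shown earlier.

```php
<?php
// Hypothetical loader, not shipped with MetaYaml.
class IniLoader
{
    // Turn an INI string into a multidimensional array (sections as sub-arrays)
    public function load($string)
    {
        $array = parse_ini_string($string, true);

        if ($array === false) {
            throw new InvalidArgumentException('Unable to parse INI content');
        }

        return $array;
    }

    // Only needed because this sketch does not extend the library's Loader class
    public function loadFromFile($path)
    {
        return $this->load(file_get_contents($path));
    }
}

$loader = new IniLoader();
$array = $loader->load("[flowers]\nrose = \"a rose\"\nviolet = \"a violet flower\"\n");

echo $array['flowers']['violet'], "\n"; // prints "a violet flower"
```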

Thanks

Thanks to Riad Benguella and Julien Bianchi for their help & advice.
