phax / phive

Generic business document validation engine
Apache License 2.0

PHIVE - Integrative Validation Engine

A generic business document validation engine originally developed for Peppol but now also supporting many other document types.

"phive" is an abbreviation of "Philip Helger Integrative Validation Engine" and is pronounced exactly like the digit 5: [ˈfaɪv].

This project only contains the validation engine - all the preconfigured rules are available in a separate repository at https://github.com/phax/phive-rules

This project is licensed under the Apache 2 license.

A live version of this engine can be found on Peppol Practical and at ecosio.

This project has the following sub-modules:

Note: this library does NOT include an EDIFACT validation. It's a placeholder for local implementations.

Note: please see README v5, README v6 and README v7/v8 for previous documentation.

Usage guide

This library wraps XML Schemas and Schematrons, applies them in a defined order and under defined constraints, and validates XML documents against the resulting rules.

Validation executor set identification

Every set of validation artefacts is uniquely identified by a VESID. For example, the "Peppol BIS Billing UBL Invoice release May 2023" has the group ID eu.peppol.bis3, the artefact ID invoice and the version number 2023.5 (representing "May 2023"), without a classifier. Another example is "SimplerInvoicing 1.2 invoice", which has the group ID org.simplerinvoicing, the artefact ID invoice and the version number 1.2 (also without a classifier).
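
To make the structure of such an identifier concrete, the following is a minimal sketch of a value class with the four parts named above. The class VesCoordinatesSketch and the colon-separated text form are assumptions made for illustration only; this is not the library's own VESID class.

```java
import java.util.Objects;

// Hypothetical stand-in for illustration; NOT the library's VESID class.
final class VesCoordinatesSketch {
  final String groupId;
  final String artifactId;
  final String version;
  // null when the identifier has no classifier
  final String classifier;

  VesCoordinatesSketch (String groupId, String artifactId, String version, String classifier) {
    this.groupId = Objects.requireNonNull (groupId, "groupId");
    this.artifactId = Objects.requireNonNull (artifactId, "artifactId");
    this.version = Objects.requireNonNull (version, "version");
    this.classifier = classifier;
  }

  // Parses "group:artifact:version" or "group:artifact:version:classifier".
  // The colon-separated layout is an assumption of this sketch.
  static VesCoordinatesSketch parse (String s) {
    final String [] parts = s.split (":", -1);
    if (parts.length < 3 || parts.length > 4)
      throw new IllegalArgumentException ("Expected 3 or 4 colon-separated parts: " + s);
    for (final String part : parts)
      if (part.isEmpty ())
        throw new IllegalArgumentException ("Empty part in: " + s);
    return new VesCoordinatesSketch (parts[0], parts[1], parts[2], parts.length == 4 ? parts[3] : null);
  }

  // Joins the parts back into a single string, omitting a missing classifier.
  String asSingleId () {
    final String base = groupId + ":" + artifactId + ":" + version;
    return classifier == null ? base : base + ":" + classifier;
  }

  public static void main (String [] args) {
    System.out.println (parse ("eu.peppol.bis3:invoice:2023.5").asSingleId ());
    System.out.println (parse ("org.simplerinvoicing:invoice:1.2").asSingleId ());
  }
}
```

Both examples from the text parse into three parts with no classifier, and formatting them again yields the same string.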

How to validate documents with programmatic rules

At least the phive-xml project and one library with rule sets (e.g. phive-rules-peppol from https://github.com/phax/phive-rules) are needed in your application. See the section on usage in a Maven project below. All available VES must be registered in an instance of class ValidationExecutorSetRegistry (which can simply be created via new). Depending on the domain-specific libraries used, initialization calls for registration into the registry must be performed. Example for registering (only) Peppol validation artefacts:

final ValidationExecutorSetRegistry <IValidationSourceXML> aVESRegistry = new ValidationExecutorSetRegistry<> ();
PeppolValidation.initStandard (aVESRegistry);

The instance of class ValidationExecutorSetRegistry can be kept as a (static) singleton - it is thread-safe. Therefore the registration process needs to be performed only once.
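
One common way to get "register once, share everywhere" in Java is the initialization-on-demand holder idiom. The sketch below shows the idiom with a plain ConcurrentHashMap standing in for the registry, so that it runs without any dependencies; the class SharedRegistrySketch, the map and its single entry are hypothetical placeholders, not part of this library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical holder; the Map is a stand-in for the real registry instance.
final class SharedRegistrySketch {
  // Counts how often registration ran, only to make the "once" property observable
  static final AtomicInteger INIT_COUNT = new AtomicInteger ();

  private SharedRegistrySketch () {}

  // The JVM initializes this nested class lazily, exactly once, and in a
  // thread-safe way on the first call to getRegistry ().
  private static final class Holder {
    static final Map <String, String> REGISTRY = create ();
  }

  private static Map <String, String> create () {
    final Map <String, String> ret = new ConcurrentHashMap <> ();
    // Stand-in for the real registration calls into the registry
    ret.put ("eu.peppol.bis3:invoice:2023.5", "Peppol BIS Billing UBL Invoice release May 2023");
    INIT_COUNT.incrementAndGet ();
    return ret;
  }

  static Map <String, String> getRegistry () {
    return Holder.REGISTRY;
  }
}
```

In a real application the holder field would have the registry type shown above, and create () would contain the registration calls of the rule libraries in use.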

Validating a business document requires a few more steps.

  1. Access to the registry is needed.
  2. A specific VESID instance is needed (e.g. PeppolValidation2023_05.VID_OPENPEPPOL_INVOICE_UBL_V3) - there are constants available for all VES identifiers defined in this project.
  3. The ValidationExecutionManager is an in-between class that can be used to customize the execution. It is cheap to create, so there is no harm in creating it on the fly every time.
  4. An instance of class ValidationSourceXML identifies the document to be validated. Class ValidationSourceXML has factory methods for the default cases (having an org.w3c.dom.Node or having a com.helger.commons.io.resource.IReadableResource).
  5. The validation results are stored in an instance of class ValidationResultList. This class is a list of ValidationResult instances - each ValidationResult represents the result of a single level of validation.
  6. Your application logic then needs to define what to do with the results.

    // Resolve the VES ID
    final IValidationExecutorSet<IValidationSourceXML> aVES = aVESRegistry.getOfID (aVESID);
    if (aVES != null) {
      // What to validate?
      IValidationSourceXML aValidationSource = ...;

      // Build execution manager 
      final ValidationExecutionManager<IValidationSourceXML> aVEM = new ValidationExecutionManager<> (aVES);

      // Main execution of rules on validation source
      final ValidationResultList aValidationResult = aVEM.executeValidation (aValidationSource);
      if (aValidationResult.containsAtLeastOneError ()) {
        // errors found ...
      } else {
        // no errors (but maybe warnings) found ...
      }
    }

Since v6 the following simpler code can be used instead:

    // Resolve the VES ID
    final IValidationExecutorSet<IValidationSourceXML> aVES = aVESRegistry.getOfID (aVESID);
    if (aVES != null) {
      // What to validate?
      IValidationSourceXML aValidationSource = ...;

      // Shortcut introduced in v6
      final ValidationResultList aValidationResult = ValidationExecutionManager.executeValidation (aVES, aValidationSource);
      if (aValidationResult.containsAtLeastOneError ()) {
        // errors found ...
      } else {
        // no errors (but maybe warnings) found ...
      }
    }

How to validate documents with programmatic rules

TODO The description of this section needs to be written. Please have patience until everything is ready and setup.

Maven usage

Add the following to your pom.xml to use this artifact, replacing x.y.z with the latest version:

<dependency>
  <groupId>com.helger.phive</groupId>
  <artifactId>phive-xml</artifactId>
  <version>x.y.z</version>
</dependency>

If you are interested in the validation result transformation you need to also include this artefact.

<dependency>
  <groupId>com.helger.phive</groupId>
  <artifactId>phive-result</artifactId>
  <version>x.y.z</version>
</dependency>

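Alternate usage as a Maven BOM (an import-scoped dependency like this belongs inside the dependencyManagement section of your pom.xml):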

<dependency>
  <groupId>com.helger.phive</groupId>
  <artifactId>phive-parent-pom</artifactId>
  <version>x.y.z</version>
  <type>pom</type>
  <scope>import</scope>
</dependency>

Potential issues

Please ensure that your stack size is at least 1 MB (for Saxon). Using the Oracle runtime, this can be achieved by passing -Xss1m on the command line. This only seems to be a problem when running 32-bit Java. With 64-bit Java, the default stack size of the Oracle JVM is already 1 MB.
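
If the JVM command line cannot be changed, an alternative is to run the stack-hungry work on a dedicated thread that requests a larger stack through the standard java.lang.Thread constructor. The following is a minimal sketch of that approach; the class and method names are hypothetical, and the requested size is only a hint that some platforms may ignore.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.FutureTask;

// Hypothetical helper, not part of this library.
final class LargeStackRunnerSketch {
  // 1 MB, the minimum stack size mentioned in the text
  static final long STACK_SIZE_BYTES = 1024L * 1024L;

  private LargeStackRunnerSketch () {}

  // Runs the task on a new thread that requests the given stack size and
  // blocks until the result is available. Failures inside the task surface
  // as java.util.concurrent.ExecutionException.
  static <T> T callWithStackSize (Callable <T> task, long stackSizeBytes) throws Exception {
    final FutureTask <T> future = new FutureTask <> (task);
    final Thread thread = new Thread (null, future, "validation-worker", stackSizeBytes);
    thread.start ();
    return future.get ();
  }
}
```

The validation call itself would be passed as the task, for example a lambda that invokes the execution manager and returns the result list.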

News and noteworthy


My personal Coding Styleguide | It is appreciated if you star the GitHub project if you like it.