serverlessworkflow / specification

Contains the official specification for the Serverless Workflow Domain Specific Language. It provides detailed guidelines and standards for defining, executing, and managing workflows in serverless environments, ensuring consistency and interoperability across implementations.
http://serverlessworkflow.io
Apache License 2.0

Schema format validation inconsistencies #1017

Closed cdavernas closed 1 month ago

cdavernas commented 1 month ago

What seems off:

It is impossible to consistently validate workflows using the v1.0.0-alpha3 schema because of validation inconsistencies and/or missing format validation, which result in multiple schema matches.

What you expected to be:

Valid workflows should validate properly and consistently against the DSL schema.

How to reproduce:

Given a simplified JSON schema:

"$schema": https://json-schema.org/draft/2020-12/schema
"$defs":
  runtimeExpression:
    type: string
    title: RuntimeExpression
    description: A runtime expression.
    pattern: "^\\s*\\$\\{.+\\}\\s*$"
  endpoint:
    title: Endpoint
    description: Represents an endpoint.
    oneOf:
    - "$ref": "#/$defs/runtimeExpression"
    - title: LiteralEndpoint
      type: string
      format: uri-template
    - title: EndpointConfiguration
      type: object
      unevaluatedProperties: false
      properties:
        uri:
          title: EndpointUri
          description: The endpoint's URI.
          oneOf:
          - "$ref": "#/$defs/runtimeExpression"
            title: ExpressionEndpointURI
            description: An expression based endpoint's URI.
          - title: LiteralEndpointURI
            description: The literal endpoint's URI.
            type: string
            format: uri-template
      required:
      - uri
additionalProperties: false
properties:
  address:
    "$ref": "#/$defs/endpoint"
required:
- address

It's impossible to consistently validate the following values, even though they are all supposed to be valid:

address: 'https://www.google.com/'
address: '${ .foobar }'
address: 'https://www.petstore.com/getById/{petId}'

| Validator | `https://www.google.com/` | `${ .foobar }` | `https://www.petstore.com/getById/{petId}` | Non URI value |
| --- | --- | --- | --- | --- |
| jschon.dev | valid | invalid | valid | valid |
| json-everything.net | invalid | valid | invalid | invalid |
| jsonschemavalidator.net | valid | valid | valid | valid |
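
As a quick sanity check outside any JSON Schema validator, the runtimeExpression pattern can be tried on the sample values directly. This is a minimal sketch using Python's standard `re` module; the helper name `is_runtime_expression` is illustrative and not part of the specification, and it assumes Python's regex dialect agrees with the validators' for this simple pattern.

```python
import re

# The pattern from the schema after YAML unescaping ("\\s" in YAML is "\s" in the regex).
RUNTIME_EXPRESSION = re.compile(r"^\s*\$\{.+\}\s*$")


def is_runtime_expression(value: str) -> bool:
    """Return True when the value matches the runtimeExpression pattern."""
    # JSON Schema's "pattern" is a search, not a full match; the anchors do the rest.
    return RUNTIME_EXPRESSION.search(value) is not None


samples = [
    "https://www.google.com/",
    "${ .foobar }",
    "https://www.petstore.com/getById/{petId}",
]

for value in samples:
    print(f"{value!r}: {is_runtime_expression(value)}")
```

Only `${ .foobar }` matches, so the pattern on its own separates the sample values cleanly.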

Proposed fix:

Remove runtimeExpression and all references to it from the schema, replacing the oneOf alternatives that include it with plain string properties that have no format.

cdavernas commented 1 month ago

@ricardozanini @JBBianchi @matthias-pichler Wanna take a look/kick in before I start the PR?

JBBianchi commented 1 month ago

It's a shame to lose control over the formats, but I agree that the proposal is necessary, as the validators can't seem to agree on what is what.

ricardozanini commented 1 month ago

As discussed in private, +1

matthias-pichler commented 1 month ago

@cdavernas I reproduced your results. First I transformed your YAML schema using https://onlineyamltools.com/convert-yaml-to-json

arriving at:

JSON Schema

```json
{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$defs": {
    "runtimeExpression": {
      "type": "string",
      "title": "RuntimeExpression",
      "description": "A runtime expression.",
      "pattern": "^\\s*\\$\\{.+\\}\\s*$"
    },
    "endpoint": {
      "title": "Endpoint",
      "description": "Represents an endpoint.",
      "oneOf": [
        {
          "$ref": "#/$defs/runtimeExpression"
        },
        {
          "title": "LiteralEndpoint",
          "type": "string",
          "format": "uri-template"
        },
        {
          "title": "EndpointConfiguration",
          "type": "object",
          "unevaluatedProperties": false,
          "properties": {
            "uri": {
              "title": "EndpointUri",
              "description": "The endpoint's URI.",
              "oneOf": [
                {
                  "$ref": "#/$defs/runtimeExpression",
                  "title": "ExpressionEndpointURI",
                  "description": "An expression based endpoint's URI."
                },
                {
                  "title": "LiteralEndpointURI",
                  "description": "The literal endpoint's URI.",
                  "type": "string",
                  "format": "uri-template"
                }
              ]
            }
          },
          "required": [
            "uri"
          ]
        }
      ]
    }
  },
  "additionalProperties": false,
  "properties": {
    "address": {
      "$ref": "#/$defs/endpoint"
    }
  },
  "required": [
    "address"
  ]
}
```

| Validator | `https://www.google.com/` | `${ .foobar }` | `https://www.petstore.com/getById/{petId}` | `foo :&"!§ö` |
| --- | --- | --- | --- | --- |
| jschon.dev | | ❌ (matched uri-template & runtime expression) | | |
| json-everything.net (with "validate format") | | | | |
| jsonschemavalidator.net | | | | |

From the looks of it, the main problem does not seem to be the runtime expression regex but rather how the different validators implement format: uri-template.

  1. https://www.google.com/ should be valid against uri-template in json-everything.net but isn't
  2. ${ .foobar } shouldn't match uri-template in jschon.dev
  3. https://www.petstore.com/getById/{petId} should be valid against uri-template in json-everything.net but isn't
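
One possible reading of the jschon.dev result: in JSON Schema draft 2020-12, `format` is an annotation by default and is only asserted when the validator opts in. If a validator does not assert it, the LiteralEndpoint branch accepts every string, so a runtime expression matches two `oneOf` branches and is rejected. The sketch below models only the two string branches of the endpoint definition (the EndpointConfiguration object branch is left out). `URI_TEMPLATE` is a deliberately simplified, hypothetical stand-in for RFC 6570 checking, and the function names are illustrative.

```python
import re

RUNTIME_EXPRESSION = re.compile(r"^\s*\$\{.+\}\s*$")

# Simplified approximation of RFC 6570 (an assumption, not a complete implementation):
# literal characters without whitespace, quotes or braces, interleaved with
# {expressions} made of an optional operator and variable characters.
URI_TEMPLATE = re.compile(
    r"^(?:[^\s\"'<>\\^`{|}]|\{[+#./;?&]?[A-Za-z0-9_.,%:*]+\})*$"
)


def matching_branches(value: str, assert_format: bool) -> list:
    """Return the names of the string branches of the oneOf that accept the value."""
    branches = []
    if RUNTIME_EXPRESSION.search(value):
        branches.append("runtimeExpression")
    # Without format assertion, "type: string" is the only constraint on this branch.
    if not assert_format or URI_TEMPLATE.search(value):
        branches.append("LiteralEndpoint")
    return branches


def is_valid(value: str, assert_format: bool) -> bool:
    """oneOf semantics: exactly one branch must accept the value."""
    return len(matching_branches(value, assert_format)) == 1


samples = [
    "https://www.google.com/",
    "${ .foobar }",
    "https://www.petstore.com/getById/{petId}",
    'foo :&"!§ö',
]

for assert_format in (False, True):
    print(f"format asserted: {assert_format}")
    for value in samples:
        print(f"  {value!r}: {is_valid(value, assert_format)}")
```

With format not asserted, the sketch gives valid, invalid, valid, valid, which is the same pattern as the jschon.dev row reported earlier in this thread. With format asserted, `${ .foobar }` is accepted and the non-URI value is rejected. Whether jschon.dev actually behaves this way is an assumption.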

It might make more sense to replace format: uri-template with a plain type: string. Optionally we might add a regex such as: https://stackoverflow.com/a/61645285 or https://regex101.com/library/dL2vY6

matthias-pichler commented 1 month ago

So my proposed solution might be to use a "custom" uri-template type, such as:

$schema: https://json-schema.org/draft/2020-12/schema
$defs:
  runtimeExpression:
    type: string
    title: RuntimeExpression
    description: A runtime expression.
    pattern: "^\\s*\\$\\{.+\\}\\s*$"
  uriTemplate:
    title: UriTemplate
    anyOf:
      - title: LiteralUriTemplate
        type: string
        format: uri-template
        pattern: "^http(s?)://.*" # make uris absolute, honestly the only thing that makes sense
      - title: LiteralUri
        type: string
        format: uri
        pattern: "^http(s?)://.*" # make uris absolute, honestly the only thing that makes sense
  endpoint:
    title: Endpoint
    description: Represents an endpoint.
    oneOf:
      - $ref: "#/$defs/runtimeExpression"
      - $ref: "#/$defs/uriTemplate"
      - title: EndpointConfiguration
        type: object
        unevaluatedProperties: false
        properties:
          uri:
            title: EndpointUri
            description: The endpoint's URI.
            oneOf:
              - "$ref": "#/$defs/runtimeExpression"
                title: ExpressionEndpointURI
                description: An expression based endpoint's URI.
              - title: LiteralEndpointURI
                description: The literal endpoint's URI.
                type: string
                format: uri-template
        required:
          - uri
additionalProperties: false
properties:
  address:
    "$ref": "#/$defs/endpoint"
required:
  - address

This yields the following results:

| Validator | `https://www.google.com/` | `${ .foobar }` | `https://www.petstore.com/getById/{petId}` | `foo :&"!§ö` |
| --- | --- | --- | --- | --- |
| jschon.dev | | | | |
| json-everything.net (with "validate format") | | | | |
| jsonschemavalidator.net | | | | |
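
The `^http(s?)://.*` pattern in the proposed uriTemplate definition also makes the two string branches of the endpoint `oneOf` mutually exclusive, whether or not a validator asserts `format`: a string cannot both start with `http://` or `https://` and start with optional whitespace followed by `${`. The following minimal sketch checks this on the sample values using the patterns alone. It ignores `format` and the object branch, and the function name is illustrative.

```python
import re

RUNTIME_EXPRESSION = re.compile(r"^\s*\$\{.+\}\s*$")
ABSOLUTE_HTTP = re.compile(r"^http(s?)://.*")


def matching_branches(value: str) -> list:
    """String branches of the proposed endpoint oneOf, judged by pattern only."""
    branches = []
    if RUNTIME_EXPRESSION.search(value):
        branches.append("runtimeExpression")
    if ABSOLUTE_HTTP.search(value):
        branches.append("uriTemplate")
    return branches


samples = [
    "https://www.google.com/",
    "${ .foobar }",
    "https://www.petstore.com/getById/{petId}",
    'foo :&"!§ö',
]

for value in samples:
    print(f"{value!r}: {matching_branches(value)}")
```

Each sample matches at most one branch, and the non-URI value matches none. Asserting `format` can only narrow the uriTemplate branch further, so under these assumptions it cannot reintroduce a double match.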