jddf / spec

The JSON Data Definition Format specification and official test suite

[PROPOSAL] union without tags #7

Closed epoberezkin closed 4 years ago

epoberezkin commented 4 years ago

why no JSL for schema itself?

Are you asking why there's no "meta-schema"? Because it's not very necessary. For the purposes of writing a formal specification, CDDL is a much more powerful tool, and is already formalized as an RFC, unlike any other options.

epoberezkin commented 4 years ago

I disagree. Implementations should be able to validate schemas one way or another. Without a meta-schema, schema documents would need a separate validation implementation, which seems absurd to me. The addition I propose is very simple, and it would bring schemas into the scope of supported JSON documents that can be validated, marshalled, unmarshalled and converted to types in all languages that have some way to express polymorphism (most, if not all, modern languages).

The proposal is to add one more form, "forms". It should be limited to defining multiple shapes of structs, without either reducing it to "discriminator" simplicity (which is not what people usually do for polymorphism) or extending it to the complexity and ambiguity of JSON Schema's oneOf. "forms" would be a map of schemas whose keys have informational meaning only (anyone who wants to reference these schemas should put them inside definitions). Every schema in the map must be of the "properties" form, which removes the ambiguity and complexity of matching data to a given form.

Without writing a formal definition (you're much better at it :), I can show it with a simple example:

{
  "forms": {
    "foo": {
      "properties": {
        "foo": {}
      }
    },
    "bar": {
      "properties": {
        "bar": {}
      }
    }
  }
}

The above schema would allow objects that have a "foo" property and objects that have a "bar" property. This is a fairly common type/data design pattern.
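A minimal Python sketch of how a validator might match an instance against the proposed "forms" keyword. The helper name `matching_forms` is hypothetical, and the matching rule is an assumption: a form matches when every key listed under its "properties" is present in the instance. Validation of the nested schemas and handling of extra properties are left out, and the proposal does not say what should happen when more than one form matches.

```python
from typing import Any, Dict, List


def matching_forms(schema: Dict[str, Any], instance: Any) -> List[str]:
    """Return the names of all forms whose required properties are present.

    Assumption: "forms" is the proposed keyword (not part of JDDF), and a
    form matches on property presence only; nested schemas are not checked.
    """
    if not isinstance(instance, dict):
        return []
    matches = []
    for name, form in schema["forms"].items():
        required = form.get("properties", {})
        if all(key in instance for key in required):
            matches.append(name)
    return matches


schema = {
    "forms": {
        "foo": {"properties": {"foo": {}}},
        "bar": {"properties": {"bar": {}}},
    }
}

print(matching_forms(schema, {"foo": 1}))      # ['foo']
print(matching_forms(schema, {"bar": "x"}))    # ['bar']
print(matching_forms(schema, {"baz": True}))   # []
```

An instance such as `{"foo": 1, "bar": 2}` matches both forms under this rule, so a real definition would have to decide whether that is valid.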

With "forms" the meta-schema for the current JDDF would be:

{
  "definitions": {
    "def: properties": {
      "properties": {
        "properties": {
          "values": {"ref": "hmmm... self? or just empty string? it's commonly needed"}
        }
      },
      "optionalProperties": {
        "optionalProperties": {
          "values": {"ref": "self"}
        }
      }
    },
    "def: optionalProperties": {
      "properties": {
        "optionalProperties": {
          "values": {"ref": "self"}
        }
      },
      "optionalProperties": {
        "properties": {
          "values": {"ref": "self"}
        }
      }
    },
    "form: properties": {
      "forms": {
        "properties": {"ref": "def: properties"},
        "optionalProperties": {"ref": "def: optionalProperties"}
      }
    }
  },

  "forms": {
    "form: empty": {
      "properties": {}
    },
    "form: ref": {
      "properties": {
        "ref": {"type": "string"}
      }
    },
    "form: type": {
      "properties": {
        "type": {"enum": ["string", "number", "etc"]}
      }
    },
    "form: enum": {
      "elements": {}
    },
    "form: elements": {
      "ref": "self"
    },
    "form: properties": {
      "ref": "form: properties"
    },
    "form: values": {
      "ref": "self"
    },
    "form: discriminator": {
      "properties": {
        "tag": {"type": "string"},
        "mapping": {
          "values": {"ref": "form: properties"}
        }
      }
    },
    "form: forms": {
      "properties": {
        "forms": {
          "values": {"ref": "form: properties"}
        }
      }
    }
  }
}

There will be another proposal to replace the "type" form with, e.g., a "value" form for scalar values (that is the term used by JSON for scalar values, but it is easily confused with "values"; I'm not sure how best to reconcile them, maybe "scalar"). That proposal would make "type" a required property of the "value" form and an optional property in the other forms. The meta-schema above is for the current JDDF plus the "forms" form.

That is it. With the "forms" form, one can validate both polymorphic structs and JDDF schemas themselves.

What do you think?

epoberezkin commented 4 years ago

This meta-schema requires some adaptation to allow definitions only at the top level, but that is quite straightforward.

epoberezkin commented 4 years ago

Re #5 - do we think we need a union without a tag? When I spoke with one developer about JDDF, the first question was: how do you do an [untagged] union? That seems like a pattern people want to have, but in a more manageable way than oneOf and not limited to discriminator.

A tagged union cannot always be used instead: it's not natural when the types in a union are very different (as in the case of JDDF schema forms) or pre-existing. Tags are usually used for different kinds of the same thing in a union.

ucarion commented 4 years ago

I think this ticket is mis-named. We're not talking about untagged unions -- what in TypeScript looks like string | number. We're talking about some forms keyword that lets you do something like discriminator, but using the set of properties in the data, instead of the value of a particular "tag" property.
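To make the distinction concrete, here is a rough Python sketch of the three ways of telling alternatives apart. All function names are hypothetical illustrations; none of them comes from JDDF or any implementation of it.

```python
from typing import Any, Dict, Union

# An untagged union in the TypeScript sense (string | number): the
# alternatives are told apart by the runtime type of the value itself.
StringOrNumber = Union[str, float]


def describe_scalar(value: StringOrNumber) -> str:
    if isinstance(value, str):
        return "string"
    return "number"


def pick_by_tag(instance: Dict[str, Any], tag: str) -> str:
    """Discriminate on the value of a tag property, as "discriminator" does."""
    return str(instance[tag])


def pick_by_properties(instance: Dict[str, Any]) -> str:
    """Discriminate on which properties are present, as "forms" would."""
    if "foo" in instance:
        return "foo"
    if "bar" in instance:
        return "bar"
    raise ValueError("no form matches")
```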

My two initial challenges to this proposal are:

1) I don't know of any precedent for this sort of thing in any type system. Can you map this concept of "discriminate-on-what-properties-are-there" to an existing type system?

2) In the examples you've given so far, there are no cases where any of the different forms have different types for the same property. That's pretty common in my experience: even when you're relying on the presence of different properties to decide what kind of data you're dealing with, some foo property will be the same type regardless of what other properties are or are not present.

So long as that's the case, your example:

{
  "forms": {
    "foo": {
      "properties": {
        "foo": {}
      }
    },
    "bar": {
      "properties": {
        "bar": {}
      }
    }
  }
}

Can at least be represented as:

{
  "properties": {
    "foo": {},
    "bar": {}
  }
}

And will have pretty adequate code generation equivalents in lots of languages. If you have combinations-of-keywords semantics in your application, you'd have to implement those by hand. I don't view this as a major missing feature -- there are always gonna be structural patterns which JDDF can't support.
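As a rough illustration of what such generated code plus a hand-written check could look like in Python: the class name `FooOrBar` and the `form` method are hypothetical, and the sketch assumes both fields are generated as optional and that exactly one of them should be set.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FooOrBar:
    # Assumption of this sketch: both fields are optional, and None stands
    # for "absent" (so an explicit JSON null cannot be told apart from it).
    foo: Optional[Any] = None
    bar: Optional[Any] = None

    def form(self) -> str:
        """Hand-written check for the combination-of-properties semantics."""
        has_foo = self.foo is not None
        has_bar = self.bar is not None
        if has_foo == has_bar:
            raise ValueError("exactly one of 'foo' or 'bar' must be set")
        return "foo" if has_foo else "bar"
```

The type itself does not prevent both fields (or neither) from being set; only the hand-written `form` check enforces that.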

epoberezkin commented 4 years ago

I think this ticket is mis-named. We're not talking about untagged unions -- what in TypeScript looks like string | number. We're talking about some forms keyword that lets you do something like discriminator, but using the set of properties in the data, instead of the value of a particular "tag" property.

I agree that we should consider a union rather than matching on the properties that are present. Most languages have some way to support unions, at least via untyped pointers or interfaces, so defining a property (or other value) that can be one of several types is not uncommon. Should we maybe open another issue?

epoberezkin commented 4 years ago

Closed, see #27 instead