Exploring support for anyOf / oneOf

We currently don't support the oneOf or anyOf constructs in JSON Schema.

We are using these in the Beneficial Ownership Data Standard, and they crop up when trying to re-use other schema contents such as GeoJSON.

We should identify whether flatten-tool can support these, or whether we have to document these as unsupported, and tailor schema design accordingly.

Worked example for discussion

As a simple example to start thinking about this, take the following schema:

{
  "properties": {
    "food": {
      "oneOf": [
        {
          "type": "object",
          "title": "3-Course Menu",
          "properties": {
            "firstCourse": {
              "type": "string"
            },
            "mainCourse": {
              "type": "string"
            },
            "desert": {
              "type": "string"
            }
          }
        },
        {
          "type": "string"
        }
      ]
    }
  }
}

The following are valid JSON objects:

{
  "food":{
    "firstCourse":"Soup",
    "mainCourse":"Nut Roast",
    "desert":"Ice cream"
  } 
}

and

{
  "food":"Cheese"
}

These would flatten into the following tables:

food
Cheese

food/firstCourse	food/mainCourse	food/desert
Soup	Nut Roast	Ice cream
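
A minimal sketch of how those two tables come about, in Python. This is not flatten-tool's actual implementation; the `flatten` helper is hypothetical and only handles nested objects (no arrays), joining keys with "/" as in the column headings above.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into {"path/to/field": value}, joining keys with "/"."""
    row = {}
    for key, value in obj.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into objects so their properties become separate columns.
            row.update(flatten(value, path))
        else:
            row[path] = value
    return row


menu = {"food": {"firstCourse": "Soup", "mainCourse": "Nut Roast", "desert": "Ice cream"}}
cheese = {"food": "Cheese"}

menu_row = flatten(menu)      # columns: food/firstCourse, food/mainCourse, food/desert
cheese_row = flatten(cheese)  # single column: food
```

The two branches of the oneOf produce disjoint sets of columns, which is why a combined template needs both.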

So what should a template look like?

It would likely need to include both options:

food	food/firstCourse	food/mainCourse	food/desert
	Soup	Nut Roast	Ice cream
Cheese

This would, of course, cause errors if the user incorrectly entered:

food	food/firstCourse	food/mainCourse	food/desert
Cheese	Soup	Nut Roast	Ice cream

as this would put both a string and an object in the same property.
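
To make the failure concrete, here is a rough sketch of an unflattening step that notices the clash. The `unflatten` function is a hypothetical illustration, not flatten-tool's code; it assumes empty cells are simply skipped and that columns use "/" separated paths.

```python
def unflatten(row):
    """Rebuild a nested dict from {"a/b": value} cells, skipping empty cells.

    Raises ValueError if the same path is used both as a value and as an object.
    """
    result = {}
    for path, value in row.items():
        if value in ("", None):
            continue
        parts = path.split("/")
        target = result
        for part in parts[:-1]:
            child = target.setdefault(part, {})
            if not isinstance(child, dict):
                # A scalar was already stored where we now need an object.
                raise ValueError(f"'{path}' conflicts with a value at '{part}'")
            target = child
        leaf = parts[-1]
        if isinstance(target.get(leaf), dict):
            # An object was already built where we now want to store a scalar.
            raise ValueError(f"'{path}' conflicts with an object at '{leaf}'")
        target[leaf] = value
    return result


good_row = {"food": "", "food/firstCourse": "Soup",
            "food/mainCourse": "Nut Roast", "food/desert": "Ice cream"}
bad_row = {"food": "Cheese", "food/firstCourse": "Soup",
           "food/mainCourse": "Nut Roast", "food/desert": "Ice cream"}

menu = unflatten(good_row)
try:
    unflatten(bad_row)
    clash_detected = False
except ValueError:
    clash_detected = True
```

Whether such a row should be a hard error or just surface in a validation report is a separate design question.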

Other scenarios

The more common pattern we might encounter would be a property that can contain multiple kinds of objects (as in the case of BODS, where an interestedParty might be an entityStatement, a personStatement or a nullStatement), or an array that can contain multiple kinds of objects.

I think in these cases we could just write all the objects out to a template, but we would need to group them somehow (possibly an editorial job when curating templates). There would still be a risk that a user enters properties from a mix of the potential objects, so that we end up building an invalid object. I'm not sure how big a problem this is, as long as validation reports can pick it up effectively.
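
As a rough sketch of how a mixed row might be spotted before an object is built: the branch names and property names below are hypothetical placeholders (not taken from BODS), and the check assumes each branch of the oneOf/anyOf has its own distinct set of properties.

```python
def branches_used(filled_fields, branches):
    """Return the names of the branches that own at least one filled-in field."""
    filled = set(filled_fields)
    return sorted(name for name, props in branches.items() if props & filled)


# Hypothetical branches of a oneOf, each described by its property names.
branches = {
    "entity": {"entityName", "jurisdiction"},
    "person": {"personName", "birthDate"},
}

clean = branches_used(["entityName", "jurisdiction"], branches)
mixed = branches_used(["entityName", "birthDate"], branches)
```

A row that touches more than one branch (`mixed` here) is the case that would otherwise produce an invalid object.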

@timgdavies another problem we have with anyOf and oneOf is the obscure validation error messages they give, i.e. they just say something like "{this whole object} is not valid in any subschema", and the object can be huge.

I think we can support oneOf and anyOf, but only for a subset of the possible subschemas they can contain, and I'm not sure we should. Your examples above cover "a string or an object" and "an object or another object", but they could contain many more types of subschema. For example, you could have a pathological case where an anyOf contains: "a list of strings or a list of objects or a different list of objects or a string or an object or a different object or a number or a list of different types". So I do not think it will be possible to cover all cases of anyOf or oneOf, and we need to decide on a subset if we are to support them.

My instinct is not to support them unless they do not contain types. I think it is generally bad data modelling to allow two different types under the same field name. It's hard to put into a database and hard to use as a data analyst.

I would prefer just two separate fields (with distinct names) and a way to check that only one of them is used, or that any of them are used (with a preference for one if both are). You could use this pattern to do the validation.
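
A rough sketch of that kind of check in plain Python. The field names `foodMenu` and `foodName` are hypothetical, chosen only to mirror the menu example; empty strings and missing keys are both treated as "not used".

```python
def fields_used(record, fields):
    """Return the subset of `fields` that actually hold a value in `record`."""
    return [f for f in fields if record.get(f) not in (None, "")]


def exactly_one(record, fields):
    """oneOf-style check: exactly one of the fields is used."""
    return len(fields_used(record, fields)) == 1


def at_least_one(record, fields):
    """anyOf-style check: one or more of the fields is used."""
    return len(fields_used(record, fields)) >= 1


options = ["foodMenu", "foodName"]  # hypothetical field names
only_name = {"foodName": "Cheese"}
both = {"foodName": "Cheese", "foodMenu": {"firstCourse": "Soup"}}

name_ok = exactly_one(only_name, options)
both_ok = exactly_one(both, options)
```

With distinct field names, each column keeps a single type, and the "only one of them" rule becomes a simple check that can produce a short, readable error.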

OpenDataServices / flatten-tool

Exploring support for anyOf / oneOf #182
