Open timgdavies opened 6 years ago
@timgdavies another problem we have with anyOf and oneOf is the obscure validation error messages they produce, i.e. they just say something like "{this whole object} is not valid in any subschema", and the object can be huge.
I think we can support oneOf and anyOf, but only for a subset of the subschemas they can contain, and I am not sure we should. Your examples above cover "a string or an object" and "an object or another object", but there can be many more kinds of subschema. For example, a pathological anyOf could contain: "a list of strings, or a list of objects, or a different list of objects, or a string, or an object, or a different object, or a number, or a list of different types". So I do not think it will be possible to cover every case of anyOf or oneOf, and we need to decide on a subset if we are to support them.
My instinct is to support them only when the subschemas do not contain types. I think it is generally bad data modelling to allow two different types under the same field name. It's hard to put into a database and hard to use as a data analyst.
I would prefer just two separate fields (with distinct names) and a way to check that only one of them is used, or that any of them is used (with a preference for one if both are present). You could use this pattern to do the validation.
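To make the "two separate fields" idea concrete, here is a minimal sketch of the check in plain Python. The function name and the field names `addressText` and `addressObject` are hypothetical, invented for illustration; this is not how flatten-tool or any validator currently does it.

```python
def check_alternatives(record, fields, mode="oneOf"):
    """Return a list of problems for a group of alternative fields.

    mode="oneOf": exactly one of the fields must be filled in.
    mode="anyOf": at least one of the fields must be filled in.
    """
    if mode not in ("oneOf", "anyOf"):
        raise ValueError("mode must be 'oneOf' or 'anyOf'")
    # Treat missing keys, None and empty strings as "not filled in".
    present = [f for f in fields if record.get(f) not in (None, "")]
    problems = []
    if not present:
        problems.append("none of " + ", ".join(fields) + " is set")
    elif mode == "oneOf" and len(present) > 1:
        problems.append(
            "only one of " + ", ".join(fields)
            + " may be set, but found " + ", ".join(present)
        )
    return problems


# Hypothetical field names: one field per type instead of one mixed field.
fields = ["addressText", "addressObject"]
print(check_alternatives({"addressText": "1 Main St"}, fields))
print(check_alternatives(
    {"addressText": "1 Main St", "addressObject": {"street": "Main St"}},
    fields,
))
```

Because each field keeps a single type, the error message can name the offending fields instead of echoing the whole object.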
We currently don't support the oneOf or anyOf constructs in JSON Schema. We are using these in the Beneficial Ownership Data Standard, and they crop up when trying to re-use other schema contents such as GeoJSON.
We should identify whether flatten-tool can support these, or whether we have to document these as unsupported, and tailor schema design accordingly.
Worked example for discussion
As a simple example, with a schema, just to start thinking on this:
The following are valid JSON objects:
and
These would flatten into the following tables:
or
So what should a template look like?
It would likely need to include both options:
Which would of course cause errors if the user incorrectly input:
as this would mean we have a string, and then an object in the same property.
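A rough sketch of that failure mode, assuming a template with slash-separated column headings and a hypothetical `address` property that may be either a string or an object. The `unflatten` function below is written only for this illustration and is not flatten-tool's actual implementation.

```python
def unflatten(row, sep="/"):
    """Rebuild a nested dict from one flat row, refusing string/object clashes."""
    result = {}
    for path, value in row.items():
        if value in ("", None):
            continue  # empty cells contribute nothing
        parts = path.split(sep)
        node = result
        for part in parts[:-1]:
            child = node.setdefault(part, {})
            if not isinstance(child, dict):
                raise ValueError(
                    f"{part!r} already holds a plain value, so it cannot also be an object"
                )
            node = child
        leaf = parts[-1]
        if isinstance(node.get(leaf), dict):
            raise ValueError(
                f"{leaf!r} already holds an object, so it cannot also be a plain value"
            )
        node[leaf] = value
    return result


# Only one of the two alternatives filled in: fine.
print(unflatten({"id": "1", "address": "1 Main St", "address/street": ""}))

# Both alternatives filled in: the same property would be a string and an object.
try:
    unflatten({"id": "1", "address": "1 Main St", "address/street": "Main St"})
except ValueError as err:
    print("error:", err)
```

The point is only that the clash can be detected per row while rebuilding the object, before schema validation runs.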
Other scenarios
The more common pattern we might encounter would be a property that can contain multiple kinds of objects (as in the case of BODS, where an interestedParty might be an entityStatement, a personStatement or a nullStatement), or an array that can contain multiple kinds of objects.
I think in these cases we could just write all the objects out to a template, but we would need to group them somehow (possibly an editorial job when curating templates). We would then still run the risk that if a user enters properties from a mix of the potential objects, we'll end up building an invalid object. I'm not sure how big a problem this is, as long as validation reports can pick it up effectively.
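As a sketch of what "picking it up effectively" could look like: if each alternative object is described by the set of properties it allows, a report can say which alternatives a user has mixed, rather than only that no subschema matched. The property names below are hypothetical placeholders, not the real BODS properties, and the matching rule (every key must belong to a single alternative) is a simplifying assumption.

```python
# Hypothetical property sets for two of the alternatives (not real BODS fields).
BRANCHES = {
    "entityStatement": {"entityType", "name"},
    "personStatement": {"personType", "names"},
}


def matching_branches(obj, branches):
    """Return the alternatives whose property set covers every key in obj."""
    keys = set(obj)
    return [name for name, props in branches.items() if keys and keys <= props]


def explain(obj, branches):
    """Describe in one line how obj relates to the alternatives."""
    matches = matching_branches(obj, branches)
    if len(matches) == 1:
        return "matches " + matches[0]
    if len(matches) > 1:
        return "ambiguous: matches " + ", ".join(matches)
    # No single alternative covers the object: list which keys came from where.
    used = {
        name: sorted(set(obj) & props)
        for name, props in branches.items()
        if set(obj) & props
    }
    if not used:
        return "no recognised properties"
    return "mixes properties from: " + "; ".join(
        f"{name} ({', '.join(keys)})" for name, keys in used.items()
    )


print(explain({"entityType": "registeredEntity", "name": "ACME"}, BRANCHES))
print(explain({"entityType": "registeredEntity", "personType": "knownPerson"}, BRANCHES))
```

A message like the second one points the user at the columns they mixed, which is easier to act on than a whole-object failure.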