sinclairzx81 / typebox

Json Schema Type Builder with Static Type Resolution for TypeScript
Other
4.98k stars 157 forks source link

Support discriminated unions and Composite() over unions to support Ajv's removeAdditional #432

Closed felixfbecker closed 1 year ago

felixfbecker commented 1 year ago

One of the most important uses for JSON schema when building APIs is not only validation, but also sanitizing/stripping any properties that does not appear in the schema, so that no unintended properties are leaked or user input is passed through verbatim. This is possible with Ajv's removeAdditional: "all" option: https://ajv.js.org/guide/modifying-data.html#removing-additional-properties

However, when defining your schema with Typebox, that option currently does not work well due to a few limitations.

The problem with removeAdditional is that any properties for each schema are removed that are not defined in the directly adjacent properties declaration. This is problematic for anyOf and allOf schemas.

In the future, a removeUnevaluatedProperties option that is using the semantics of unevaluatedProperties (from the latest JSON schema draft) instead of additionalProperties would solve this, but Ajv doesn't yet have such an option.

But removeAdditional could also work with Typebox with a few changes.

Union types

For unions, removeAdditional works correctly when using discriminated unions using the discriminator field defined in the OpenAPI spec and oneOf instead of anyOf. I saw that Typebox already has an experimental factory function UnionOneOf(), but no official support.

It would be great to have an official factory function to create a discriminated union (something that TypeScript also has a concept of), either UnionOneOf() where the discriminator can be specified through schemaOptions or e.g. DiscriminatedUnion() with a first-class parameter for specifying the discriminator.

Intersection types

For intersections, removeAdditional works only if properties of all the intersected types are lifted into the parent type. This is actually exactly what Composite() does using KeyResolver, which is awesome!

But there is one limitation, which is that Composite() currently only accepts TObject[], which means it's not possible to compose unions of objects – even though the KeyResolver should handle this fine already under the hood! If that typing limitation could be lifted, Composite() could be a drop-in replacement for Intersect() (for objects).

Additionally, KeyResolver.Visit() would need to support visiting discriminated union members, i.e. iterate oneOf. Visit() is hardcoded and not customizable, so currently even if I try to define my own DiscriminatedUnion() type factory, I can't solve the additionalProperties problem entirely, because Composite() won't properly merge that custom union type.

If these two aspects were fixed, Typebox could be used with removeAdditional: "all".

sinclairzx81 commented 1 year ago

@felixfbecker Hi, sorry for the delay reply on this (have been fairly busy recently, trying to catch up!)

RemoveAddtional

I'm not sure about implementing much here as I'm keen to avoid adding specific types just to accommodate specific features of certain validators (specifically Ajv and its removeAdditional feature), I'm also keen to avoid types / schematics that TypeScript can't directly reason about (this with specific note to discriminator-fields of which TypeScript has no concept (other than through type narrowing using control flow analysis)) and of which JSON Schema has no concept.

You mentioned in the following

The problem with removeAdditional is that any properties for each schema are removed that are not defined in the directly adjacent properties declaration. This is problematic for anyOf and allOf schemas.

To me, this somewhat implies the removeAdditional feature in Ajv is incomplete (or at a minimum, is non-generalizable), as it forces particular schematic representations just to utilize the feature. Also, as it is dependent on the discriminator field to operate, that makes it a very hard sell to support (as this keyword is not known to the JSON Schema vocabulary)

UnionOneOf | DiscriminatedUnion

I do quite often get asked about providing a standard oneOf union representation. My reluctance to make this a standard type primarily stems from TS not having a way to reason about it, JSON Schema not having a concept of discriminators, as well as difficulties asserting that the oneOf type representation is logical.

Consider the following.

const T = Type.Union([Type.String(), Type.String()])      // logical: asserts string

const U = Type.UnionOneOf([Type.String(), Type.String()]) // illogical: will always be false

The problem with U is that multiple sub schemas will match when supplying a string, and the semantics of oneOf will result in validation failure (meaning U is illogical). Typically tho, oneOf is often used in tandem with implicit discriminator fields that prevent this multiple sub schema matching. Consider the following.

// The literal values prevent multiple oneOf sub schema match
const A = Type.Object({ type: Type.Literal('A'), value: Type.String() })
const B = Type.Object({ type: Type.Literal('B'), value: Type.String() })
const U = Type.UnionOneOf([A, B]) // logical: no potential for sub schema match

It's just that, by adding the implicit type discriminator, this makes anyOf work exactly the same as oneOf + the discriminator. (meaning we've not really achieved anything using the oneOf representation)

const A = Type.Object({ type: Type.Literal('A'), value: Type.String() })
const B = Type.Object({ type: Type.Literal('B'), value: Type.String() })
const U = Type.Union([A, B]) // fine

For explicit field discrimination however, (often used in tooling), because the discriminator is non-standard, we could just specify an additional non-standard property on the anyOf (or oneOf) representation to achieve the desired type.

const T = Type.Union([A, B], {  discriminator: 'type' })
// or
const T = Type.UnionOneOf([A, B], {  discriminator: 'type' }) 

It's at this point, we've sorta come full circle (with both types essentially equivalent). The example UnionOneOf is offered as a alternative to Union for users who really need the oneOf representation (there are cases where it's needed (if supporting particular tooling dependent on the keyword)), but it's unlikely to be promoted as a standard type in TypeBox given it's providing no additional value over anyOf

DiscriminatedUnion | JSON Type Definition

While I do certainly understand the need for type discriminator fields, it's difficult to implement them in a way that aligns with the JSON Schema specification (and TypeScript as based on a structural type system). However, there are other specifications available like JSON Type Definition (JTD) that do provide (and even mandate) discriminator fields (which is very useful when working against nominal type systems like C#, Rust, etc)

I've actually started work looking at JTD Schema support in TypeBox, and a starter implementation can be found in the examples here. At this stage I'm uncertain of the overall design (there's a lot of room for community feedback here) but the following implements discriminator fields using this specification. This specification should be supported in Ajv https://ajv.js.org/json-type-definition.html, and there is non-optimized validator support available in TypeBox (as below)

import { Type, Static } from './typedef/typedef'
import { Value } from '@sinclair/typebox/value'

// nominal type systems require named types for differentiation
const A = Type.Struct('A', { x: Type.Float32() })
const B = Type.Struct('B', { x: Type.Float32() })

// unions require additional type discrimination property
const T = Type.Union('type', [A, B])

const R1 = Value.Check(A, { x: 1 })
const R2 = Value.Check(B, { x: 2 })
const R3 = Value.Check(T, { type: 'A', x: 1 })
const R4 = Value.Check(T, { type: 'B', x: 1 })

References:

Summary

Hope this brings a bit of insight into some of the thinking this side with respect to supporting specific validator features, as well as thinking around UnionOneOf. At this stage, TypeBox supports the oneOf representation (either via example of inline Unsafe type), but probably won't support this representation as standard given the issues mentioned above. You also mentioned the new Composite type which I haven't gone into detail here, but there are upstream considerations around Composite which make changes to this type infeasible in the short term (but will be looking to generalize this type to support TSchema in future revisions).

Happy to discuss more if you have any thoughts on the above, so will leave this issue open for while. Again, apologies for the delay in reply. All the best! S

guysegal commented 1 year ago

I'm a bit confused.. is there a non standard way to use something like UnionOneOf with the published package?

sinclairzx81 commented 1 year ago

@guysegal Hi.

Yes, there are numerous ways to express oneOf with the published package. One of the easiest is via Type.Unsafe (if you're using Ajv). The following uses Type.Unsafe to construct the type, with TUnionOneOf handling static inference. This should work with Ajv (and other tools that understand the union discriminator property)

import { Type, Static, TSchema, SchemaOptions, Assert } from '@sinclair/typebox'

export type TUnionOneOf<T extends TSchema[]> = T extends [infer L, ...infer R] 
  ? Static<Assert<L, TSchema>> | TUnionOneOf<Assert<R, TSchema[]>> 
  : never

export const UnionOneOf = <T extends TSchema[]>(oneOf: [...T], options: SchemaOptions = {}) => 
  Type.Unsafe<TUnionOneOf<T>>({ ...options, oneOf })

// --------------------------------------------------------------------
// Usage
// --------------------------------------------------------------------

const A = Type.Object({ type: Type.Literal('A'), value: Type.String() })
const B = Type.Object({ type: Type.Literal('B'), value: Type.String() })

const T = UnionOneOf([A, B], { discriminator: 'type' }) // const T = {
                                                        //   discriminator: 'type',
                                                        //   oneOf: [{
                                                        //     type: 'object',
                                                        //     required: [ 'type', 'value' ]
                                                        //     properties: {
                                                        //       type: { const: 'A', type: 'string' },
                                                        //       value: { type: 'string' }
                                                        //     },
                                                        //   }, {
                                                        //      type: 'object',
                                                        //      required: [ 'type', 'value' ],
                                                        //      properties: {
                                                        //        type: { const: 'B', type: 'string' },
                                                        //        value: { type: 'string' }
                                                        //      },
                                                        //    }]
                                                        // }

type T = Static<typeof T>                               // type T = {
                                                        //   type: "A";
                                                        //   value: string;
                                                        // } | {
                                                        //   type: "B";
                                                        //   value: string;
                                                        // }

There are ways to get UnionOneOf working with the TB compiler, refer to the example here for a more complete implementation that extends the TB Type Builder and implements the validation logic.

Cheers S

sinclairzx81 commented 1 year ago

@felixfbecker Heya, going to close up this issue, but happy to continue to discuss this functionality on this thread.

At this time, still going to hold off on a oneOf implementation, and there are some future plans around composite, but will drop a couple of tags to revisit at a late time.

All the best S

felixfbecker commented 1 year ago

@sinclairzx81 Thanks for the detailed response! I see your argument that discriminated unions are not in the core JSON schema standard and that oneOf is not native to TypeScript (more "automatic").

It's just really unfortunate because making removeAdditional work is such an important feature for being able to use Typebox in production for API schema definition.

Defining a manual helper in userland for UnionOneOf() and simply passing discriminator in schemaOptions would not be a problem at all, if it wasn't for Composite() ignoring oneOf.

Could it maybe be possible to just add traversal of oneOf to the KeyResolver used by Composite(), or to maybe allow the customization of the KeyResolver through a registry like done for some other components of Typebox (without adding the type creator function)? After all it is a popular native keyword to JSON schema.

I tried to patch it together support for it, but I just couldn't figure out how make the types work so that my custom DiscriminatedUnionOneOf() returns an object type that Composite() will accept and merge correctly like an intersection type. Do you have any idea how this could work?

FabianFrank commented 1 year ago

@sinclairzx81 Since this is a real limitation for @felixfbecker and myself (we are using TypeBox generated schemas for request/response validation and stripping of extra properties in a REST API) we have implemented a Type.DiscriminatedUnion(discriminator, members) that produces a oneOf JSON schema with the discriminator property set. I think it bridges an important gap between JSON schema and TypeScript, that is a common challenge when building APIs with runtime type validation. If you look around various libraries like TypeBox or Zod or json-schema-to-ts a common issue is how to map (discriminated) TS unions to JSON schema and back. Have your plans on this evolved in the last few months and is there any interest in upstreaming this, maybe as part of example/experimental?

sinclairzx81 commented 1 year ago

@FabianFrank Hi,

The request for oneOf is generally in the back of my mind (as I do get asked about it a lot) but still not something I can introduce into TB at this time as there are still many challenges expressing multiple schematics that all represent the same concept (i.e. union) (and where composition logic is all tied to anyOf (which is very far reaching and tuned to the semantics of TS unions)). It's a similar case for string enum (where I get asked about this representation also).

In saying this, I have actually been exploring DiscriminatedUnion as part of the JSON Type Definition specification which was mentioned on a previous comment (as this specification has explicit schematics for discriminated structures) You can find the current TypeBox JTD implementation at the links below

Documentation

https://github.com/sinclairzx81/typebox/tree/master/example/typedef#unions

Implementation

https://github.com/sinclairzx81/typebox/blob/master/example/typedef/typedef.ts#L66-L71

I am slowly working towards bringing this specification to TypeBox under @sinclair/typebox/typedef (which would form the basis of TB being able to create multiple schema specifications under the same type system), and by consequence of this (along with the infrastructure to make this work with the validators) should open up the door to being able to express varying representations for union (making room for oneOf and string enum), but all this is still a fair way off, it needs a lot of thought and planning to get right.

For now, I'm open to accepting PR's for experimental types if you would like to submit one (and I do maintain these through TB revisions) as they are all candidates for possible inclusion into the library one day.

Hope that brings an update to where things are at :) All the best S