Applying refinements + general advice on my implementation

ShlokDesai33 commented 4 months ago

Hey there! Firstly, thank you so much for creating and maintaining this library. I understand that this is a fairly low-level schema validation library. However, my use case requires refinements (https://zod.dev/?id=refine). I tried creating a small wrapper to support my use case:

import type { Static, TSchema } from "@sinclair/typebox";
import { ValidationError } from "./errors.server";
import {
  type TypeCheck,
  type ValueErrorType,
  TypeCompiler,
} from "@sinclair/typebox/compiler";

type Refinements<K> = {
  [x in keyof K]?: (val: K[x]) => string;
};

type ErrorMap = {
  [x in ValueErrorType]?: string;
};

type Errors<K> = {
  [x in keyof K]: string;
};

export class Validator<T extends TSchema> {
  typeChecker: TypeCheck<T>;

  constructor(
    readonly schema: T,
    public refinements: Refinements<Static<T>> = {}
  ) {
    this.typeChecker = TypeCompiler.Compile(schema);
  }

  parse(data: unknown, errorMap: ErrorMap = {}) {
    const errors = [...this.typeChecker.Errors(data)]
      .map((e) => {
        return errorMap[e.type]
          ? { ...e, message: errorMap[e.type] }
          : { ...e };
      })
      .reduce(
        (acc, { path, message }) => ({
          ...acc,
          [path.slice(1).split("/").join(".")]: message,
        }),
        {}
      ) as Errors<Static<T>>;

    const keys = Object.keys(this.refinements) as (keyof Static<T>)[];

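    keys.forEach((key) => {
      if (!errors[key]) {
        // @ts-expect-error this will never error because we're looping through
        // this.refinement's keys
        const message: string = this.refinements[key]((data as Static<T>)[key]);
        // Only record a failure; an empty string means the refinement passed
        if (message) errors[key] = message;
      }
    });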

    if (Object.keys(errors).length === 0) {
      return data as Static<T>;
    } else throw new ValidationError(errors);
  }
}

I'm not entirely satisfied with it: I've had to perform a lot of TypeScript gymnastics to get it working. Is there an easier way to go about doing this?
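
For reference, the reduce step in parse() above turns each slash-separated error path into a dotted key. A minimal standalone sketch of that conversion follows, assuming errors carry only a path and a message. The `FieldError` type and the `toErrorMap` name are hypothetical and are not TypeBox exports.

```typescript
// Hypothetical minimal error shape; TypeBox's ValueError has more fields.
interface FieldError {
  path: string
  message: string
}

// Converts "/user/name" style paths to "user.name" keys.
// If several errors share a path, the last message wins, as in the reduce above.
function toErrorMap(errors: Iterable<FieldError>): Record<string, string> {
  const map: Record<string, string> = {}
  for (const { path, message } of errors) {
    map[path.slice(1).split('/').join('.')] = message
  }
  return map
}

const errorMap = toErrorMap([
  { path: '/user/name', message: 'Expected string' },
  { path: '/age', message: 'Expected number' },
  { path: '/age', message: 'Expected number to be positive' },
])
// errorMap: { 'user.name': 'Expected string', age: 'Expected number to be positive' }
```

Assigning into one object avoids re-spreading the accumulator on every error. Note that in this sketch an empty path maps to the empty key.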

sinclairzx81 commented 4 months ago

@ShlokDesai33 Hi!

Is there an easier way to go about doing this?

There are a few ways to implement a Zod refine function in TypeBox, but probably the easiest way is to use a combination of Transform + Decode. I've set up a quick example below that implements both Parse and Refine using various TB functions.

import { Type, TSchema, TTransform, StaticDecode, StaticEncode } from '@sinclair/typebox'
import { Value } from '@sinclair/typebox/value'

// ------------------------------------------------------------------
// Parse
// ------------------------------------------------------------------
export function Parse<T extends TSchema, R = StaticDecode<T>>(schema: T, value: unknown): R {
  const defaulted = Value.Default(schema, value)
  const converted = Value.Convert(schema, defaulted)
  const cleaned = Value.Clean(schema, converted)
  return Value.Decode(schema, cleaned)
}

// ------------------------------------------------------------------
// Refine
// ------------------------------------------------------------------
export type RefineFunction<T extends TSchema> = (value: StaticEncode<T>) => boolean
export type RefineOptions = { message?: string }

export function Refine<T extends TSchema, E = StaticEncode<T>>(schema: T, refine: RefineFunction<T>, options: RefineOptions = {}): TTransform<T, E> {
  const Throw = (options: RefineOptions): never => { throw new Error(options.message ?? 'Refine check failed') }
  const Assert = (value: E): E => refine(value) ? value : Throw(options)
  return Type.Transform(schema).Decode(value => Assert(value as E)).Encode(value => Assert(value))
}

// ------------------------------------------------------------------
// Usage
// ------------------------------------------------------------------
// https://zod.dev/?id=refine
//
// const myString = z.string().refine((val) => val.length <= 255, {
//   message: "String can't be more than 255 characters",
// });

const T = Refine(Type.String(), value => value.length <= 255, {
  message: "String can't be more than 255 characters"
})

try {
  const X = Parse(T, ''.padEnd(255))  // Ok
  const Y = Parse(T, ''.padEnd(256))  // Fail
} catch(error) {
  console.log(error)
}

The above should be a fairly close approximation of Zod's refine function. I've used the Value.* submodule in the example, but if you need a JIT compiled Parse function, the following should achieve this.

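// Uses Value, TSchema and StaticDecode from the imports in the example above
import { TypeCompiler } from '@sinclair/typebox/compiler'

export function CompileParse<T extends TSchema, R = StaticDecode<T>>(schema: T) {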
  const check = TypeCompiler.Compile(schema)
  return (value: unknown): R => {
    const defaulted = Value.Default(schema, value)
    const converted = Value.Convert(schema, defaulted)
    const cleaned = Value.Clean(schema, converted)
    return check.Decode(cleaned) as R
  }
}

Hope this helps! S

ShlokDesai33 commented 4 months ago

This helps a ton! Is there a way to collect all the errors (including the error thrown by Refine) in the schema? Something like:

const errors = [...this.typeChecker.Errors(data)]

sinclairzx81 commented 4 months ago

@ShlokDesai33 Heya

Does the refine function only work with the Parse function you created? I tried to test it, but it's not throwing errors when I validate the schema using Check().

Yes, that's correct.

The Check() function will only check a value against the schematic. It applies no additional runtime processing of a value.
The Decode() function will internally Check() the value, then it applies additional Transform logic to that value.

The Refine() function is dependent on Decode() to run Transform logic against the value. If you use the example provided, you will need to ensure all your values are run through Parse() and not Check().
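
The distinction can be illustrated with a toy model. The sketch below is not TypeBox code: `ToySchema`, `toyCheck`, and `toyDecode` are hypothetical names that only mimic the relationship described above, where the refinement lives in the transform step and is therefore invisible to a check-only call.

```typescript
// Toy model only: these types are hypothetical and are not TypeBox's API.
interface ToySchema<T> {
  check(value: unknown): value is T // structural check only
  transform?(value: T): T           // extra runtime logic, where a refinement would live
}

// Mirrors Check(): looks at structure and nothing else.
function toyCheck<T>(schema: ToySchema<T>, value: unknown): boolean {
  return schema.check(value)
}

// Mirrors Decode(): checks first, then runs the transform logic.
function toyDecode<T>(schema: ToySchema<T>, value: unknown): T {
  if (!schema.check(value)) throw new Error('Schema check failed')
  return schema.transform ? schema.transform(value) : value
}

const shortString: ToySchema<string> = {
  check: (value: unknown): value is string => typeof value === 'string',
  transform: (value: string): string => {
    if (value.length > 255) throw new Error("String can't be more than 255 characters")
    return value
  },
}

const tooLong = ''.padEnd(256)
const checked = toyCheck(shortString, tooLong) // true: the refinement never runs
let decodeError = ''
try {
  toyDecode(shortString, tooLong)
} catch (error) {
  decodeError = error instanceof Error ? error.message : String(error)
}
// decodeError now holds the refinement message
```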

This helps a ton! Is there a way to collect all the errors (including the error thrown by Refine) in the schema? Something like:

Here's an update to obtain all the errors. You can access them on the error.errors property.

import { Type, TSchema, TTransform, StaticDecode, StaticEncode } from '@sinclair/typebox'
import { Value, ValueError, TransformDecodeError } from '@sinclair/typebox/value'

// ------------------------------------------------------------------
// Parse
// ------------------------------------------------------------------
export class ParseError extends Error {
  constructor(message: string, public errors: ValueError[]) {
    super(message)
  }
}
export function Parse<T extends TSchema, R = StaticDecode<T>>(schema: T, value: unknown): R {
  const defaulted = Value.Default(schema, value)
  const converted = Value.Convert(schema, defaulted)
  const cleaned = Value.Clean(schema, converted)
  try {
    return Value.Decode(schema, cleaned)
  } catch(error) {
    return error instanceof TransformDecodeError
      ? (() => { throw new ParseError(error.message, []) })()
      : (() => { throw new ParseError('Schema', [...Value.Errors(schema, cleaned)]) })()
  }
}
// ------------------------------------------------------------------
// Refine
// ------------------------------------------------------------------
export type RefineFunction<T extends TSchema> = (value: StaticEncode<T>) => boolean
export type RefineOptions = { message?: string }

export function Refine<T extends TSchema, E = StaticEncode<T>>(schema: T, refine: RefineFunction<T>, options: RefineOptions = {}): TTransform<T, E> {
  const Throw = (options: RefineOptions): never => { throw new Error(options.message ?? 'Refine check failed') }
  const Assert = (value: E): E => refine(value) ? value : Throw(options)
  return Type.Transform(schema).Decode(value => Assert(value as E)).Encode(value => Assert(value))
}
// ------------------------------------------------------------------
// Usage
// ------------------------------------------------------------------
const T = Refine(Type.String(), value => value.length <= 255, {
  message: "String can't be more than 255 characters"
})

try {
  const X = Parse(T, ''.padEnd(255))  // Ok
  const Y = Parse(T, ''.padEnd(256))  // Refine Error
  const Z = Parse(T, [])              // Schema Error
} catch(error: any) {
  console.log(error)
}

Just be mindful that some types may generate a large number of errors. You may wish to limit the number of errors generated by explicitly enumerating the iterator returned from Value.Errors() up to some finite count, rather than spreading it into an array. This helps to prevent excessive buffering.
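
One way to cap the enumeration is a small helper that stops pulling from the iterator once a limit is reached. This is a sketch only: `takeFirst` is a hypothetical helper name, and the generator is a stand-in for a large error iterator so the example runs without TypeBox.

```typescript
// Collects at most `limit` items from any iterable, stopping iteration early.
function takeFirst<T>(items: Iterable<T>, limit: number): T[] {
  const taken: T[] = []
  if (limit <= 0) return taken
  for (const item of items) {
    taken.push(item)
    if (taken.length >= limit) break
  }
  return taken
}

// Stand-in for a large error iterator; counts how many items were produced.
let produced = 0
function* manyErrors(): Generator<string> {
  for (let i = 0; i < 1000000; i++) {
    produced++
    yield `error ${i}`
  }
}

const firstTen = takeFirst(manyErrors(), 10)
// Only 10 items were ever generated, not 1000000.
```

In the Parse function above, the equivalent change would be to pass the result of Value.Errors(schema, value) to such a helper instead of spreading it.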

Again, hope this helps! S

ShlokDesai33 commented 4 months ago

Got it! Is there any way to combine the TransformDecodeError with the rest of errors? I also don't want an invalid schema to prevent decoding a value. I'm mainly using this library for form validation, and I'd like to show the user as many errors as possible (so they don't have to keep submitting the form).

sinclairzx81 commented 4 months ago

Got it! Is there any way to combine the TransformDecodeError with the rest of errors? I also don't want an invalid schema to prevent decoding a value. I'm mainly using this library for form validation, and I'd like to show the user as many errors as possible (so they don't have to keep submitting the form).

Possibly the following....

import { Type, TSchema, TTransform, StaticDecode, StaticEncode } from '@sinclair/typebox'
import { Value, ValueError, ValueErrorType, TransformDecodeError } from '@sinclair/typebox/value'

// ------------------------------------------------------------------
// Parse
// ------------------------------------------------------------------
export class ParseError extends Error {
  constructor(public readonly errors: ValueError[]) {
    super()
  }
}
export function Parse<T extends TSchema, R = StaticDecode<T>>(schema: T, value: unknown): R {
  const defaulted = Value.Default(schema, value)
  const converted = Value.Convert(schema, defaulted)
  const cleaned = Value.Clean(schema, converted)
  try {
    return Value.Decode(schema, cleaned)
  } catch(error) {
    return error instanceof TransformDecodeError
      ? (() => { throw new ParseError([{
        type: ValueErrorType.Never,
        message: error.message,
        path: error.path,
        schema: error.schema,
        value: error.value
      }]) })()
      : (() => {
        throw new ParseError([...Value.Errors(schema, cleaned)])
      })()
  }
}

The TransformDecodeError contains most of the properties of ValueError, but does not contain the type. A Never ValueError seems reasonable here, as Refine() logic is unassociated with the schematic.

sinclairzx81 commented 4 months ago

@ShlokDesai33 Just keep in mind that there's no way for transform types to continue processing the value once they've encountered an error, as there are no assurances the rest of the value can be processed (so you're only going to get one error out).

If you need a full range of errors (for the purposes of form validation), the only way you're going to get that is by encoding the "Refine" logic in the constraints themselves.

// Use this
const T = Type.String({ maxLength: 255 })

// Not this
const T = Refine(Type.String(), value => value.length <= 255, {
  message: "String can't be more than 255 characters"
})

Whether this works for your library I'm not sure. But alternatively, you could try your luck with Ajv which may be able to produce specialized refinements with custom errors.
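
The difference between a throwing refinement and constraint-style checks comes down to fail-fast versus collecting evaluation. The sketch below shows both side by side, independent of TypeBox. The `Rule` shape, `firstError`, `allErrors`, and the `SignupForm` fields are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical rule shape for illustration; not part of TypeBox.
type Rule<T> = { field: keyof T & string; test: (form: T) => boolean; message: string }

// Fail-fast: like a throwing transform, this reports only the first failure.
function firstError<T>(form: T, rules: Rule<T>[]): string | undefined {
  for (const rule of rules) {
    if (!rule.test(form)) return rule.message
  }
  return undefined
}

// Collecting: evaluates every rule and groups the failures by field.
function allErrors<T>(form: T, rules: Rule<T>[]): Record<string, string[]> {
  const errors: Record<string, string[]> = {}
  for (const rule of rules) {
    if (!rule.test(form)) {
      const messages = errors[rule.field] ?? []
      messages.push(rule.message)
      errors[rule.field] = messages
    }
  }
  return errors
}

interface SignupForm { username: string; password: string; confirm: string }

const rules: Rule<SignupForm>[] = [
  { field: 'username', test: f => f.username.length <= 255, message: "Username can't be more than 255 characters" },
  { field: 'password', test: f => f.password.length >= 8, message: 'Password must be at least 8 characters' },
  { field: 'confirm', test: f => f.confirm === f.password, message: 'Passwords must match' },
]

const form: SignupForm = { username: 'shlok', password: 'short', confirm: 'different' }
const single = firstError(form, rules) // one message: the password length failure
const every = allErrors(form, rules)   // failures for both password and confirm
```

For form validation the collecting variant is usually the one wanted, since the user sees every problem in a single submission.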

sinclairzx81 commented 4 months ago

@ShlokDesai33 Just a quick follow up on this....

... and I'd like to show the user as many errors as possible (so they don't have to keep submitting the form).

Actually, I do think this makes a good case for adding support for Refine() to TypeBox, if only to have refinements run in the Check() and Errors() routines (which would enable multiple refinement errors to be generated during schema checks). For now, I've set up a branch to investigate an implementation. Preliminary documentation can be found at the link below.

https://github.com/sinclairzx81/typebox/tree/refine?tab=readme-ov-file#types-refinement

ShlokDesai33 commented 4 months ago

This sounds amazing, and would save me a lot of effort! I'd offer to contribute if I knew what I was doing, but I'm still very much a noob lmao.

ShlokDesai33 commented 4 months ago

In the new branch, the refine function wraps existing schemas. What if a use case requires more than one refinement? Do we keep nesting refine function calls inside each other? Won't that be bad for code readability?

sinclairzx81 commented 4 months ago

In the new branch, the refine function wraps existing schemas. What if a use case requires more than one refinement? Do we keep nesting refine function calls inside each other? Won't that be bad for code readability?

I'm just working through a design atm. I agree it would be good to support multiple constraints / refinements. Here is one possible design which lines up with the design of Transform types.

const T = Type.Refine(Type.Number())
  .Check(value => value >= 0, 'Value must be greater than or equal to 0')
  .Check(value => value < 255, 'Value must be less than 255')
  .Done()

Thoughts?

ShlokDesai33 commented 4 months ago

This looks pretty good. Looking forward to seeing it in action!
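
To make the chained design proposed above more concrete, here is a standalone sketch of how such a builder could accumulate checks and report every failing message. It is an assumption about how the idea might behave, not the TypeBox implementation: `RefineBuilder`, `Refined`, and the lowercase `refine` function are hypothetical names, and only the `.Check(...)` / `.Done()` call shape is taken from the proposal.

```typescript
// Hypothetical stand-in for the proposed design; not the TypeBox API.
type Predicate<T> = (value: T) => boolean

interface Refined<T> {
  errors(value: T): string[] // every failing message, in declaration order
  check(value: T): boolean   // true only when no refinement fails
}

class RefineBuilder<T> {
  private readonly checks: Array<{ predicate: Predicate<T>; message: string }> = []

  // Registers one refinement and returns the builder so calls can be chained.
  Check(predicate: Predicate<T>, message: string): this {
    this.checks.push({ predicate, message })
    return this
  }

  // Freezes the registered refinements into a reusable validator.
  Done(): Refined<T> {
    const checks = this.checks.slice() // snapshot: later Check() calls don't affect this result
    const errors = (value: T): string[] =>
      checks.filter(({ predicate }) => !predicate(value)).map(({ message }) => message)
    return { errors, check: (value: T) => errors(value).length === 0 }
  }
}

function refine<T>(): RefineBuilder<T> {
  return new RefineBuilder<T>()
}

const byteValue = refine<number>()
  .Check(value => value >= 0, 'Value must be greater than or equal to 0')
  .Check(value => value < 255, 'Value must be less than 255')
  .Done()

const negative = byteValue.errors(-1) // only the first refinement fails
const notANumber = byteValue.errors(NaN) // both comparisons are false, so both messages appear
```

Because every refinement is evaluated rather than stopping at the first failure, this shape avoids nested wrapper calls and still yields multiple messages for a single value.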

sinclairzx81 / typebox

Applying refinements + general advice on my implementation #816