gcanti / io-ts

Runtime type system for IO decoding/encoding
https://gcanti.github.io/io-ts/
MIT License
6.7k stars 328 forks source link

[RFC] New error model #255

Open gcanti opened 5 years ago

gcanti commented 5 years ago

I started to think about a new error model.

This is what I get so far

export type DecodeError =
  | Leaf
  | LabeledProduct // suitable for product types indexed by labels
  | IndexedProduct // suitable for product types indexed by integers
  | And // suitable for intersection types
  | Or // suitable for union types

export interface Leaf {
  type: 'Leaf'
  actual: unknown
  expected: string
  message: string | undefined
}

export interface LabeledProduct {
  type: 'LabeledProduct'
  actual: unknown
  expected: string
  errors: Record<string, DecodeError>
  message: string | undefined
}

export interface IndexedProduct {
  type: 'IndexedProduct'
  actual: unknown
  expected: string
  errors: Array<[number, DecodeError]>
  message: string | undefined
}

export interface And {
  type: 'And'
  actual: unknown
  expected: string
  errors: Array<DecodeError>
  message: string | undefined
}

export interface Or {
  type: 'Or'
  actual: unknown
  expected: string
  errors: Array<DecodeError>
  message: string | undefined
}
mmkal commented 5 years ago

Could LabeledProduct and IndexedProduct be combined into one somehow? It'd make it easier to write custom error processors. Maybe something like this:

export interface IndexedProduct {
  type: 'IndexedProduct'
  actual: unknown
  expected: string
  errors: Array<[string | number, DecodeError]>
}

then errors could either be like [[0, error1], [1, error2]] or [['foo', error1], ['bar', error2]] but either way you can just use the keys from the first item of the tuple to index the root object.

sledorze commented 5 years ago

@mmkal that's information lost and you're one short function far from deriving what you're proposing from the current spec. So less value.

mmkal commented 5 years ago

What information is lost?

sledorze commented 5 years ago

@mmkal the Type of the Product

sledorze commented 5 years ago

@mmkal also consider that the proposed signature can lead to invalid representations as the array can be filled with heterogeneous tuples. [[0, error1], ['bar', error2]]

mmkal commented 5 years ago

the Type of the Product

I guess I'm a little unclear on what exactly Product means in this context.

the proposed signature can lead to invalid representations as the array can be filled with heterogeneous tuples

Good point.

A solution to both those issues while keeping the errors simple to process could be updating LabeledProduct to use the same style, without sharing a type with IndexedProduct:

export interface LabeledProduct {
  type: 'LabeledProduct'
  actual: unknown
  expected: string
  errors: Array<[string, DecodeError]>
}
mmkal commented 5 years ago

Semi-related: would it be possible/worth using generics rather than unknown? e.g.

export interface LabeledProduct<T> {
  type: 'LabeledProduct'
  actual: T
  expected: string
  errors: Array<[keyof T, DecodeError]>
}
sledorze commented 5 years ago

@mmkal The problem with making LabeledProduct using an Array is that it would rely on an implicit order from objects keys to array. At worst, you're relying on the default Js engine to perform that; but it can change from run to run; non deterministic, no added value, hard for tests. At best, you need to provide an Order for the object keys; more work to do, no added value.

For the second proposal; being a decoding error means the value is still unknown..

gcanti commented 5 years ago

I kind of like a sum type that retains as much meta info as possible. I guess that having such specialized members should make easier to write smarter reporters.

gcanti commented 5 years ago

btw I wrote some POCs in the v2 branch that can be backported once the new error model has beed settled.

For example here's a POC of a toDecodeError helper which trasforms a current Errors into a DecoderError and an experimental (non standard) TreeReporter that outputs something like

import * as t from '../src'
import { TreeReporter } from '../src/TreeReporter'

const Person = t.type({
  name: t.string,
  age: t.number
})

console.log(TreeReporter.decode(Person, {}))
/*
Expected { name: string, age: number }, but was {}
├─ Invalid key "name"
│  └─ Expected string, but was undefined
└─ Invalid key "age"
   └─ Expected number, but was undefined
*/
mmkal commented 5 years ago

@sledorze being a decoding error doesn't mean it's of type unknown. It means it's of type I for a runtime type t.Type<A, O, I>. It'd be really useful to know that for things like piped types where you can't just assume I = unknown

mmkal commented 5 years ago

In fact, it could go further than that. Instead of this signature

class Type<A, O, I> {
  decode(i: I): Either<DecodeError, A>
}

Where there's not so much we can do with a Left. The signature could be

class Type<A, O, I> {
  decode<Input extends I>(i: Input): Either<DecodeError<Input>, A>
}

This would give a meaningful type to actual even if the input type is unknown.

An example that comes to mind is a parser for an incoming express request. The whole request could be passed to decode, and then on validation error, the compiler would know that actual is of type express.Request, greatly increasing the ability to safely do something useful with the error.

sledorze commented 5 years ago

@mmkal you're right! it is true that the value was (and is) of type Input.

Introducing the Input type as type parameter means now you can get into a situation where two decoding functions may not share the same Error type.

We have to think about the consequences (or test that approach on real world code bases); I'm thinking about the fact that those function would previously share the same Monadic / Applicative signature and now would diverge due to that new type param and then would no more compose (but with a union of the Error types).

What you want to achieve is one mapLeft far however (even on actual implementation); question is; are you using that pattern today and what is the need?

mmkal commented 5 years ago

It's true that different decodes could in theory have different types. I'm not convinced that's a bad thing though, because typescript will treat it like the interface with actual: unknown in that case anyway:

const Person = t.interface({ name: t.string })
const wrong1 = Person.decode({ firstName: 'Alice' }) // Left<DecodeError<{ firstName: string }>>
const wrong2 = Person.decode({ fullName: 'Bob' }) // Left<DecodeError<{ fullName: string }>>

[wrong1, wrong2].forEach(validation => {
  validation.mapLeft(left => left.actual) // left.actual is of type `{ firstName: string } | { fullName: string }` - no worse than `unknown`
}

Plus, it's likely not a good sign if you're decoding many different input types into a single target type. The type system will just accurately reflect the complexity (and possible bugginess) of that model.

The real world scenario that I've come across that this would be helpful for is when using a Type as a decoder as opposed to a simple validator. Let's say the input type is an http response. The type validates that 200<=statusCode<300, and decodes the body by json parsing and validating with t.interface. The 'target' type is whatever the body should look like (this will be the 'right' value). If there's an error (status code or unexpected body format), it'd be useful to have actual: HttpResponse rather than actual: unknown, so we don't have to keep the original response in scope forever, or do an unsafe cast to get it back.

gcanti commented 5 years ago

@mmkal if I understand correctly you are proposing to make DecodeError polymorphic (DecodeError<I>), how would you rewrite the model in my first comment?

About the HttpResponse use case, I'm not sure is io-ts's resposability to keep track of the sorrounding type, IMO that's a job for the Either monad

import { Either, right, left } from 'fp-ts/lib/Either'

interface HttpResponse<A> {
  status: number
  body: A
}

const validateStatus = <A>(res: HttpResponse<A>): Either<string, HttpResponse<A>> => {
  if (200 <= res.status && res.status < 300) {
    return right(res)
  } else {
    return left(`Invalid status code ${res.status}`)
  }
}

const decodeHttpResponse = <A>(
  res: HttpResponse<unknown>,
  codec: t.Decoder<unknown, A>
): Either<string, HttpResponse<A>> => {
  return validateStatus(res).chain(res =>
    codec.decode(res.body).bimap(
      () => `Invalid body`,
      body => ({
        ...res,
        body
      })
    )
  )
}

const Body = t.type({
  foo: t.number
})

console.log(decodeHttpResponse({ status: 500, body: null }, Body)) // left("Invalid status code 500")
console.log(decodeHttpResponse({ status: 200, body: null }, Body)) // left("Invalid body")
console.log(decodeHttpResponse({ status: 200, body: { foo: 1 } }, Body)) // right({ "status": 200, "body": { "foo": 1 } })
mmkal commented 5 years ago

@gcanti this is roughly how I'd initially envision making DecodeError polymorphic:

export type DecodeError<I> =
  | Leaf<I>
  | I extends Array<infer X>
    ? IndexedProduct<X[]> // suitable for product types indexed by integers
    : LabeledProduct<I> // suitable for product types indexed by labels
  | And<I> // suitable for intersection types
  | Or<I> // suitable for union types

export interface Leaf<I> {
  type: 'Leaf'
  actual: I
  expected: string
  message: string | undefined
}

export interface LabeledProduct<I> {
  type: 'LabeledProduct'
  actual: I
  expected: string
  errors: {[K in keyof I]: DecodeError<I[K]>}
  message: string | undefined
}

export interface IndexedProduct<I extends any[]> {
  type: 'IndexedProduct'
  actual: I
  expected: string
  errors: Array<[number, DecodeError<I[number]>]>
  message: string | undefined
}

export interface And<I> {
  type: 'And'
  actual: I
  expected: string
  errors: Array<DecodeError<I>>
  message: string | undefined
}

export interface Or<I> {
  type: 'Or'
  actual: I
  expected: string
  errors: Array<DecodeError<I>>
  message: string | undefined
}

re the HttpResponse case - I meant the case where you want to use the response object on failure - but either way your sample would still work since res is in scope after calling validateStatus. However if I'm writing a library function, it's possible that someone using it will want to recover from errors using the original response object (or log headers to an error reporter, or whatever). So that person using it needs a reference to res. This can be done by adding a .mapLeft(error => ({ error, res }). But this shouldn't be necessary, and is potentially confusing under error model v2, where res and error.actual are the same value, but with error.actual is untyped.

The example you gave can reduce to just a type definition:

import * as t from 'io-ts'
import { Either, right, left } from 'fp-ts/lib/Either'

interface HttpResponse<A> {
  status: number
  body: A
}

const GoodResponse = <RT extends t.Type<any>>(bodyType: RT) => t.interface({
  status: t.refinement(t.Integer, s => 200 <= s && s < 300),
  body: bodyType,
})

const Body = t.type({
  foo: t.number
})

const FooResponse = GoodResponse(Body)

console.log(FooResponse.decode({ status: 500, body: null })) // left(DecodeError<{ status: 500, body: null }>) with info about status being invalid, and type-safe access to full http response
console.log(FooResponse.decode({ status: 200, body: null })) // left(DecodeError<{ status: 200, body: null }>) with info about body being invalid type-safe access to full http response
console.log(FooResponse.decode({ status: 200, body: { foo: 1 } })) // right({ "status": 200, "body": { "foo": 1 } })

Given that io-ts is keeping track of the value of actual, it seems beneficial to honestly and accurately report the type, if at all possible. The HttpResponse scenario is just an example. There are many situations where being sure of actual's type can either prevent complex code where more references need to be kept in scope than would otherwise be necessary, additional type-checking, or worst of all, casting to any.

gabejohnson commented 5 years ago

@gcanti I'm very much in favor of a new, more granular error model. The current makes traversal difficult.

gcanti commented 5 years ago

@mmkal this conditional type looks problematic

export type DecodeError<I> =
  | Leaf<I>
  | I extends Array<infer X>
    ? IndexedProduct<X[]> // suitable for product types indexed by integers
    : LabeledProduct<I> // suitable for product types indexed by labels
  | And<I> // suitable for intersection types
  | Or<I> // suitable for union types

Example

type Test = DecodeError<unknown>

expands to

type Test = Leaf<unknown> | LabeledProduct<unknown> | And<unknown> | Or<unknown>

which is false for a t.array

declare function decode<A, O, I, V extends I>(codec: t.Type<A, O, I>, v: V): Either<DecodeError<V>, A>

const input: unknown = [1, '2', 3]

decode(t.array(t.number), input).mapLeft(e => e.type) // "Leaf" | "LabeledProduct" | "And" | "Or"

Indeed the error shape is related to the codec (specifically to its internal implementation) rather than the input value.

mmkal commented 5 years ago

Good point. Maybe if you want to avoid that, DecodeError would need to be defined as

export type DecodeError<I> =
  | Leaf<I>
  | IndexedProduct<I extends any[] ? I : unknown[]> // suitable for product types indexed by integers
  | LabeledProduct<I> // suitable for product types indexed by labels
  | And<I> // suitable for intersection types
  | Or<I> // suitable for union types

instead? The only other thing I can think of would be to write some special-case hack involving bringing back mixed:

type mixed = { [key: string]: any } | object | number | string | boolean | symbol | undefined | null | void
type IsUnknown<T> = T extends mixed ? false : true

those all evaluate to false. But you'd probably need another special case for any, and there may be other edge cases too.

qzmfranklin commented 5 years ago

I have been using this repo with fp-ts for about a year by now. Really liked it.

I intend to keep up to date with the upgrades.

Could anyone help me understand what I should do and prepare to migrate to v2?

Thanks!

sledorze commented 5 years ago

@qzmfranklin you may find relevant information at the end of that thread, with consolidate recent migration trials feedbacks. https://github.com/gcanti/fp-ts/issues/823

qzmfranklin commented 5 years ago

@sledorze Thanks!

ggoodman commented 5 years ago

@gcanti is this part of the project still on your radar? I think it would be a significant value-add for users looking to use io-ts as a general-purpose validator that can be used to produce human-understandable error messages. Right now, I struggle to produce something meaningful from complex codecs.

chrischen commented 4 years ago

Should the structure of errors simply return an error code type and then let the user decide what to do with it?

In particular the pattern espoused here of returning error codes and then letting the user determine what message to display for what error code is difficult to implement with the v 2.2+ decoder API since the built in error reporter seems to have error message strings (in English) hard-coded. https://www.slideshare.net/ScottWlaschin/railway-oriented-programming (slide 135)

gcanti commented 4 years ago

@ggoodman the new experimental APIs produce human-understandable error messages by default

import * as E from 'fp-ts/lib/Either'
import * as D from 'io-ts/lib/Decoder'
import { draw } from 'io-ts/lib/Tree'

interface NonEmptyStringBrand {
  readonly NonEmptyString: unique symbol
}

type NonEmptyString = string & NonEmptyStringBrand

const NonEmptyString = D.refinement(D.string, (s): s is NonEmptyString => s.length > 0, 'NonEmptyString')

const Person = D.type({
  name: NonEmptyString,
  email: NonEmptyString
})

console.log(E.fold(draw, String)(Person.decode({ name: null, email: '' })))
/*
required property "name"
└─ cannot decode null, should be string
required property "email"
└─ cannot refine "", should be NonEmptyString
*/
ljani commented 3 years ago

The new error model looks great!

I kind of like a sum type that retains as much meta info as possible. I guess that having such specialized members should make easier to write smarter reporters.

With 3.0.0, how do change the error message (similar to withMessage) with sum types? I'm trying to do form validation and I'd like to return Field required or something, if the tag is empty. For reference, here's the current error message: https://github.com/gcanti/io-ts/blob/f13a10ae9d405f3fb8564cf3b7917d31aa7c0e27/src/Decoder.ts#L338

Here's my current hack:

function deMapKeyErrors<E>(
    map: (
        errors: FS.FreeSemigroup<ITDE.DecodeError<E>>,
        key: string,
        kind: ITDE.Kind,
    ) => FS.FreeSemigroup<ITDE.DecodeError<E>>,
): (d: ITDE.DecodeError<E>) => ITDE.DecodeError<E> {
    return (e: ITDE.DecodeError<E>) => {
        return e._tag === "Key"
            ? ITDE.key(e.key, e.kind, map(e.errors, e.key, e.kind))
            : e;
    };
}

function deMapLeafError<E>(
    map: (key: E, actual: unknown) => E,
): (d: ITDE.DecodeError<E>) => ITDE.DecodeError<E> {
    return (e: ITDE.DecodeError<E>) => {
        return e._tag === "Leaf"
            ? ITDE.leaf(e.actual, map(e.error, e.actual))
            : e;
    };
}

function fsMapOf<A>(
    map: (key: A) => A,
): (d: FS.FreeSemigroup<A>) => FS.FreeSemigroup<A> {
    return (e: FS.FreeSemigroup<A>) => {
        return e._tag === "Of" ? FS.of(map(e.value)) : e;
    };
}

function mapTagError<E>(
    makeError: (actual: unknown) => string,
    key: string,
): (
    i: unknown,
    d: FS.FreeSemigroup<ITDE.DecodeError<string>>,
) => FS.FreeSemigroup<ITDE.DecodeError<string>> {
    return (i: unknown, d: FS.FreeSemigroup<ITDE.DecodeError<string>>) => {
        return pipe(
            d,
            fsMapOf(
                deMapKeyErrors((e, k) =>
                    k === key
                        ? pipe(e, fsMapOf(deMapLeafError(makeError)))
                        : e,
                ),
            ),
        );
    };
}

const Test = ITD.mapLeftWithInput(mapTagError(() => "my message", "u"))(
    ITD.sum("u")({
        x: ITD.type({
            u: ITD.literal("x"),
            name: ITD.string,
        }),
        y: ITD.type({
            u: ITD.literal("y"),
            name: ITD.string,
        }),
    }),
);

Maybe something like D.sum(tag: T, makeError: (actual: unknown) => string)?

ljani commented 3 years ago

Silly me, I can just mark my proper withMessage errors with Wrap of DE.fold and let the marked messages take priority over fallback messages generated in Leaf.