colinhacks / zod

TypeScript-first schema validation with static type inference
https://zod.dev
MIT License
33.93k stars 1.18k forks

Type inference is too loose in combination with generic functions #3309

Open STREBER24 opened 8 months ago

STREBER24 commented 8 months ago

Motivation

I am trying to use Zod to validate and transform data received from an external API. Its responses nest a lot of information in objects with a single key called data. For example, instead of

["a", "b", "c"]

we might receive

[{data: "a"}, {data: "b"}, {data: "c"}]

Transforming {data: "a"} to "a" works as expected via the schema

z.object({ data: z.string() }).transform((a) => a.data)

Problem

Since this pattern is so common, I want to extract this into a function:

function dataObj<T extends z.ZodTypeAny>(schema: T) {
  return z.object({ data: schema }).transform((a) => a.data);
}

According to the documentation, I understand this to be the correct usage of generic parameters.

Unfortunately, TypeScript's type inference seems to fall short here: the resulting type (as shown when hovering over it in VS Code) always contains an unwanted undefined:

const test = dataObj(z.string());
type Test = z.infer<typeof test>; // string | undefined, should be string

Workaround

I can work around the problem by adding a type assertion inside transform. I would expect it to work without the assertion, though, since it does when the schema is written inline.

function otherDataObj<T extends z.ZodTypeAny>(schema: T) {
  return z.object({ data: schema }).transform((a) => a.data as z.infer<T>);
}

const test = otherDataObj(z.string());
type Test = z.infer<typeof test>; // string, ok

Additional information

I am using TypeScript 5.4.2 and Zod 3.22.4.

I have prepared the example on codesandbox.io.

I hope you can help me with this problem. Thanks for taking the time.

colinhacks commented 8 months ago

This happens because, at the point where z.object() is called, schema could be any subtype of ZodTypeAny. Because undefined extends any, Zod adds a question mark to the inferred type of the data key.

  type requiredKeys<T extends object> = {
    [k in keyof T]: undefined extends T[k] ? never : k;
  }[keyof T];

  export type addQuestionMarks<
    T extends object,
    R extends keyof T = requiredKeys<T>
  > = Pick<Required<T>, R> & Partial<T>;

There's not much I can do about this in Zod as far as I can tell, aside from carving out an exception in addQuestionMarks for ZodTypeAny, which is pretty hacky. I'll ponder this.
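To see the effect in isolation, the two helper types quoted above can be applied to plain object types, without Zod. The following is a minimal sketch, assuming strictNullChecks is enabled; the names WithString, WithAny, requiredOk and optionalOk are made up for illustration and do not exist in Zod.

```typescript
// Copied from the helper types quoted above (not imported from Zod).
type requiredKeys<T extends object> = {
  [k in keyof T]: undefined extends T[k] ? never : k;
}[keyof T];

type addQuestionMarks<
  T extends object,
  R extends keyof T = requiredKeys<T>
> = Pick<Required<T>, R> & Partial<T>;

// Hypothetical aliases for illustration.
// With strictNullChecks, undefined does not extend string, so data stays required.
type WithString = addQuestionMarks<{ data: string }>;
// undefined extends any, so data is treated as optional.
type WithAny = addQuestionMarks<{ data: any }>;

const requiredOk: WithString = { data: "a" };
// Compiles only because the data key became optional.
const optionalOk: WithAny = {};

console.log(requiredOk.data, "data" in optionalOk);
```

Hovering over WithString and WithAny in an editor shows the difference in the data key directly.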

STREBER24 commented 8 months ago

Thanks for your response.

We thought that when the function is called with e.g. z.string(), TypeScript should be able to work out that z.string() is not optional, but this seems to be a TypeScript problem and not a Zod problem.

Another possible fix would be a Zod type that can be anything but undefined, so that we could define our function on this type.

I also found that, for some reason, the following definition of my function works just fine:

function singleKeyObj<T extends z.ZodTypeAny, K extends string>(
  key: K,
  schema: T
) {
  return z.object({ [key]: schema }).transform((a) => a[key]);
}

function dataObj<T extends z.ZodTypeAny>(schema: T) {
  return singleKeyObj("data", schema);
}

const test = dataObj(z.string());
type Test = z.infer<typeof test>; // string, ok

colinhacks commented 8 months ago

Inference is certainly full of mysteries. Perhaps @Andarist or another kindly expert on TypeScript internals has a better understanding here?

> anything but undefined

Unfortunately this isn't a type that can be easily represented in TypeScript without using a giant union of subclasses, and that would likely cause downstream performance problems in the TypeScript compiler.

Andarist commented 8 months ago

Anything but undefined is {} | null
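
As a small illustration of that type, here is a sketch that assumes strictNullChecks is enabled. NotUndefined and accept are hypothetical names, not part of Zod or TypeScript.

```typescript
// Hypothetical alias: under strictNullChecks, every value except undefined
// is assignable to {} | null.
type NotUndefined = {} | null;

// Hypothetical helper that only accepts values that cannot be undefined.
function accept<T extends NotUndefined>(value: T): T {
  return value;
}

const accepted = [
  accept("a"),
  accept(0),
  accept(false),
  accept(null),
  accept({ data: "a" }),
];

// accept(undefined); // rejected by the compiler under strictNullChecks

console.log(accepted.length);
```

Whether constraining a schema's output type this way would change the inference inside dataObj is not something this thread establishes.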

Andarist commented 8 months ago

Ugh, there is a lot to unpack here when it comes to those 2 examples 😅 If I don't forget I'll take another (deeper) look at this over the weekend

Andarist commented 8 months ago

The variant with an extra generic K makes the return type of .transform's callback deferred. So when you finally instantiate it with some concrete zod type, it's able to use that deferred type, instantiate it and resolve to string.

The variant with a concrete prop doesn't defer. When you access .data on it in your .transform, TS assumes that it's pulling a property from an object that might have that property optional. That's because you are accessing it on a mapped type like this:

type Input = { [k_1 in keyof z.objectUtil.addQuestionMarks<z.baseObjectOutputType<{
    data: T;
}>, undefined extends T["_output"] ? never : "data">]: z.objectUtil.addQuestionMarks<z.baseObjectOutputType<{
    data: T;
}>, undefined extends T["_output"] ? never : "data">[k_1]; }

We can see that this object has all of its properties conditionally optional. Since it doesn't defer access on this mapped type, it reads from it, substitutes things, and ends up creating a union with | undefined. Note that the inner "value" type stays deferred:

type Output = z.objectUtil.addQuestionMarks<z.baseObjectOutputType<{
    data: T;
}>, undefined extends T["_output"] ? never : "data">["data"] | undefined

This outer access could also be deferred. Deferring is - at times - not practical. So it's hard to tell on the spot what really really should happen here. I could see this being classified as a design limitation. It's not like it's incorrect now - it's just not precise enough. At the generic level itself, this .data access is indeed potentially accessing an optional property.

You can see the above here. I recommend inspecting the added twoslash queries and the emitted declarations for both working and broken.

I was also able to create a minimal~ repro case for this here. You could use this to raise an issue/question about this behavior in the TS repo.
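
The generic-level behavior described in this comment can also be explored without Zod. The sketch below is not the linked repro. It is a simplified stand-in that reuses the addQuestionMarks shape quoted earlier in the thread, and flatten, Wrapped and unwrap are hypothetical names. It assumes strictNullChecks, and the type the compiler reports for the property access may differ between TypeScript versions.

```typescript
// Helper types in the shape quoted earlier in the thread (not imported from Zod).
type requiredKeys<T extends object> = {
  [k in keyof T]: undefined extends T[k] ? never : k;
}[keyof T];

type addQuestionMarks<
  T extends object,
  R extends keyof T = requiredKeys<T>
> = Pick<Required<T>, R> & Partial<T>;

// Hypothetical helper: a mapped type that re-maps every key of T.
type flatten<T> = { [k in keyof T]: T[k] };

// Hypothetical stand-in for the object type that the transform callback receives.
type Wrapped<T> = flatten<addQuestionMarks<{ data: T }>>;

function unwrap<T>(obj: Wrapped<T>) {
  // While T is still generic, the data key is only conditionally required.
  // Hover over the inferred return type of unwrap to see how the compiler
  // resolves this access.
  return obj.data;
}

// With a concrete type argument, the parameter type resolves to a required data key.
const value = unwrap<string>({ data: "a" });
console.log(value);
```

Comparing the hovered return type of unwrap with the type of value shows whether the access was resolved at the generic level or deferred until instantiation.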