microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
https://www.typescriptlang.org
Apache License 2.0
101.08k stars 12.5k forks source link

Covariant quantifiers are specialized to widest possible type in conditional types, should be narrowest #42491

Open masaeedu opened 3 years ago

masaeedu commented 3 years ago

🔎 Search Terms

Generics, variance, conditional types, inference

🕗 Version & Regression Information

This is the behavior in every version I tried, and I reviewed the FAQ for entries about conditional types (specifically the note about falling through to never and "distributive conditional types", which is not what's happening in my example)

⏯ Playground Link

Playground link with relevant code

💻 Code

type Fail<msg extends string> = { error: msg, imaginary: never }
type MixtureOf<r> = r[keyof r]
type VariantOf<r> = MixtureOf<{ [k in keyof r]: { tag: k, value: r[k] } }>

// A type constructor with a *covariant* type parameter
type Thunk<o> = () => readonly o[]

// Some examples
const stringThunk: Thunk<string> = () => ["foo", "bar"]
const numberThunk: Thunk<number> = () => [1, 2]
const polyThunk: <a>() => readonly a[] = () => [] // This one is polymorphic!

type TerminalThunk = Thunk<unknown> // For any x, a Thunk<x> is assignable to TerminalThunk
const someThunks: readonly TerminalThunk[] = [polyThunk, stringThunk, numberThunk]

type InitialThunk = Thunk<never> // For any x, an InitialThunk is assignable to Thunk<x>
const initialThunk: InitialThunk = polyThunk
const theAllThunk: typeof polyThunk & typeof stringThunk & typeof numberThunk = initialThunk

// We can use a (distributive) conditional type to extract the type parameter of a Thunk
type OutputOf<thunk> = thunk extends Thunk<infer x> ? x : Fail<'whoops'>
// ... and to extract the type parameters of a number of Thunks
type OutputsOf<thunks> = { [k in keyof thunks]: OutputOf<thunks[k]> }

// We'd like to be able to work with the sum of the (covariant) parameters of a number of Thunk<_> types, some of which may be polymorphic.
type collect = <thunks extends Record<string, TerminalThunk>>(thunks: thunks) => Thunk<MixtureOf<OutputsOf<thunks>>>
const collect: collect = thunks => () =>
  // @ts-ignore
  Object.values(thunks).flatMap(v => v())

// Inferred as Thunk<MixtureOf<OutputsOf<...>>>
const test1 = collect({ stringThunk, numberThunk })
const test1_: readonly (string | number)[] = test1() // Good!

// Inferred as readonly unknown[]
const test2 = collect({ polyThunk, stringThunk, numberThunk })
const test2_: readonly (string | number)[] = test2() // Bad! Type 'readonly unknown[]' is not assignable to type 'readonly (string | number)[]'.

// The runtime results are identical
console.log(test1())
console.log(test2())

🙁 Actual behavior

The assignment failed with the type error:

Type 'readonly unknown[]' is not assignable to type 'readonly (string | number)[]'.
  Type 'unknown' is not assignable to type 'string | number'.
    Type 'unknown' is not assignable to type 'number'.(2322)

🙂 Expected behavior

I expect the assignment in test2_ to succeed without any further type annotation.

I'm not completely certain, but I suspect the problem arises from the fact that OutputOf<typeof polyThunk> = unknown, which is then summed with and absorbs string and number (due to MixtureOf).

As near as I can figure out, for covariantly quantified generic functions, a (distributive) conditional type seems to simply specialize all the quantifiers to their respective upper bounds before unifying with the RHS of the extends operator (the default upper bound being unknown). So for example:

type Test0 = OutputOf<<x extends unknown>() => readonly x[]> // Test0 = unknown
type Test1 = OutputOf<<x extends string>() => readonly x[]>  // Test1 = string
type Test2 = OutputOf<<x extends number>() => readonly x[]>  // Test2 = number

But even this rule doesn't seem to hold in general, :shrug: :

type Test3<ub> = OutputOf<<x extends ub>() => readonly x[]>  // Test3<ub> = unknown ???

Regardless, at least in the special case of unbounded covariant quantifiers, does it make any sense to instead produce never? I haven't thought about it super hard, but since all of TypeScript's inference problems are in "positive position" (i.e. you never try to infer the types of parameters), it seems like specializing any generic function to the narrowest possible type would do something good.

Cue the counterexamples of where this is nonsense...

RyanCavanaugh commented 3 years ago

I'm not sure what angle to approach this from, so I'll be a bit verbose here and list a bunch of things I think are facts to see where we're disconnected.

If we write collect({ polyThunk }), this expression should produce () => unknown[], same as if you had written this:

// p: Thunk<unknown>
const p = polyThunk();

wherein an inference with zero candidates is correctly defined to produce unknown.

A covariant mixing of readonly string[] and readonly unknown[] should produce readonly unknown[], since an element in readonly unknown[] might be inhabited by anything.

The inference of collect({ polyThunk, stringThunk, numberThunk }) is correctly () => readonly unknown[] because we have no evidence of what polyThunk's return type should be (*).

The assignment of test2_() to test2 is clearly unsound given the existing inferences, so is correctly an error.

Given the initializer to polyThunk, I think its correct definition is this, which gives the desired behavior.

const polyThunk: () => readonly never[] = () => []

* Perhaps critically, not a footnote - in general our inference algorithm is not philosophically compatible with functions like polyThunk, since they claim to be able to manufacture the correct type out of thin air, but our inference algorithm approaches generics as if they relate input values to output values via plausible runtime means. When no input positions exist to inform possibly-correct inferences, we default to the soundest choice unknown.

Choosing never over unknown in the absence of any candidates is incorrect; if you wrote a function like this

function fn<T>(arg?: T) {
  return arg;
}
const m = fn();

It would be a true unsoundness to let never inhabit m in a program that's still running.

masaeedu commented 3 years ago

@RyanCavanaugh Thanks, that's a good way to approach this. I think I already disagree with you when it comes to:

If we write collect({ polyThunk }), this expression should produce () => unknown[], same as if you had written this:

I think it should produce () => readonly never[]. Could we focus in on that one and talk about why unknown[] vs never[]? I'm still thinking about the rest of your comment but it's probably easier to talk about it once we're on the same page wrt the first thing.

masaeedu commented 3 years ago

I mean of course inferring unknown[] is sound, but in general inferring the type of any expression to be unknown is sound (at least in "positive positions); it's just not very useful. Unsoundness follows indirectly because the user now has to do unsafe casts to do anything useful with the result, and the more unsafe casting you do the more mistakes you can make, lying to yourself and to the typechecker.

Now that I've thought a bit more about your function fn example (sorry, i'm jumping around a bit) I think I disagree with that as well. We need to explicitly write the return type of fn to see why:

function fn<T>(arg?: T): T | undefined {
  return arg;
}

const m = fn();

Note the _ | undefined, which is material. The _[] in my original example is analogous. Today we infer unknown | undefined = unknown, but I believe this is wrong/suboptimal as well. I believe the most precise thing to infer here is instead never | undefined = undefined.

masaeedu commented 3 years ago

I think there's a less imprecise way for me to motivate why inferring never makes sense. When we apply fn(), the most straightforward thing to infer is neither unknown | undefined nor never | undefined: it's just <T> (T | undefined), in an imaginary type syntax for #17574. But absent the ability to actually model <T> ...anything but a function..., we should pick the closest equivalent.

So for some covariant type constructor X<_>, and the type <T> X<T> which we cannot (currently) represent in TypeScript, what should we pick as its closest equivalent: X<unknown> or X<never>?

My understanding is that if X is covariant, <T> X<T> is equivalent to X<never>. I don't know how to define "covariant" precisely in a type theory context, but I can try to "define" it by many examples:

I can also define it roundaboutly by saying that if it's possible to implement: <a, b>(f: (a: a) => b) => (fa: F<a>) => F<b>, then F<_> is covariant. Can you think of some examples where F<_> is covariant but F<never> is not semantically equivalent to <T> F<T>?

We can look to other type systems for inspiration too. In e.g. Haskell there is the type Void which has no inhabitants (similar to never). It is also possible in Haskell to construct the "generic" type forall x. F x, where F is some type constructor besides a function. The relevant fact is that in Haskell the types forall x. F x and F Void are isomorphic whenever F is covariant. I understand that Haskell lacks subtyping, and it's not a one to one analogy to our current discussion, so we should take all this with a grain of salt. But I do think this confirms at a semantic level the soundness of inferring never in the situations we were discussing above.

RyanCavanaugh commented 3 years ago

Let me try to TL;DR: When T<U> appears in a function's return type when we're inferring U, the variance of T should inform the zero-candidate inference of U:

Is that right?

Edit: More precisely, we should measure the each parameter (inverted) and return type relative to each zero-candidate type parameter in order to determine the resulting inference.

masaeedu commented 3 years ago

@RyanCavanaugh Yes, that's exactly right, at least when there's no extends clauses.

stephanemagnenat commented 3 years ago

I found a bug that looks like similar to this one, but I am unsure, so I post it here. Consider the following code:

type StringMap = { [key: string]: string };

const a: StringMap = {
    'a': 'aData',
    'b': 'bData'
};

function f<T extends StringMap>(m: T) {
    for (const k in m) {
        const keyOfM: keyof T = k;
        const v1: string = m[keyOfM];
        const v2: string = m[k];
    }
}

f(a);

Surprisingly, the line assigning v1 compiles, but the line assigning v2 does not, although the first has a wider type for the indexing variable than the second.

stephanemagnenat commented 3 years ago

As an interesting note, the following works (assuming same a):

type S<SMK extends string> = {
    [K in SMK]: string;
};

function f<SMK extends string>(m: S<SMK>) {
    for (const k in m) {
        const v2: string = m[k];
    }
}

f(a);
TylorS commented 3 years ago

I've been simulating this with a type-level map, and I know of one other library that's actively using a similar technique to provide variances to higher order kinds.

I would love to see support for Covariance/Contravariance either implicitly or explicitly defined.