microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
https://www.typescriptlang.org
Apache License 2.0
100.28k stars 12.39k forks source link

Suggestion: existential unions and intersections #55257

Open angryzor opened 1 year ago

angryzor commented 1 year ago

Suggestion

πŸ” Search Terms

overload existential union intersection set operation correspondence problem switch return problem extends oneof generic indexed access type dependent function generic narrowing Generic derived value type conditional type inference enum enumerated narrowing branching generic parameter type guard generic bounds narrow generics correlated union record types This expression is not callable none of those signatures are compatible with each other

βœ… Viability Checklist

My suggestion meets these guidelines:

⭐ Suggestion

This issue suggests a potential solution to many different issues and cases of unsoundness in TypeScript that have remained unsolved for years (or have been addressed with a suboptimal workaround). These are some relevant issues I found but I'm quite sure this is not exhaustive:

All of these issues ultimately stem from the same problem: TypeScript currently does not have an accurate type representing an index into (or, rather, access of) a union or intersection of types. The type returned by { x: string } | { x: number }['x'] is string | number but this is not really correct. It has the semantics that any of string or number can be written to it, which would be an invalid assumption in this case. If you want to write to this x you need to use a type that is either string & number, or you need to somehow know that the type you hold is the same as the one in the union. TypeScript currently works around this by reusing the "write types" used by setters for unions' and intersections' properties to set the write type to string & number, but this is a suboptimal solution which results in most of the problems above.

This suggestion introduces 2 new types representing these "accesses of a union/intersection" that are not only a logical consequent of the structure and behavior of unions/intersections, but also naturally display the real desired behavior and are able to track which types are really inside those unions, allowing them to fan out over other unions/intersections like overloads and fixing the oneof problems as well.

I have been working on a PoC fork that implements the concepts introduced in this issue for a few months, and it is still a work in progress. However, the semantics have by now cleared up significantly (with a huge shoutout and thanks to the amazing @mkantor whose feedback and input have been invaluable!) to the point where I believe it is useful to create this suggestion issue to get some feedback from the maintainers and the TS community in general.

My apologies for the length of this text. The contents of this description are the result of a few months of R&D work and as the ideas are related to some of the core mechanics of TypeScript it is difficult to condense it all into a very short description.

πŸ“ƒ Motivating Example

Currently unions and intersections form a single unit in the type tree. The core idea of this suggestion is that these can be pulled apart into 2 interconnected elements: a varying type and an iteration point. I.e. the following union:

string | number

Can also be written as such:

allof (string || number)

Where allof is a type operator that forms the aforementioned "iteration point", the location in the type tree where the possible variants are "collected", a form of scope for the || under it, and string || number describes the various types that the union can assume. There is an equivalent operator && for intersections.

For a simple union like above this seems useless, however the power of this idea shows when the varying type is more deeply nested. In the rest of this issue's text, assume that equals is an imaginary constraint keyword that tests for type equality. The following equalities would hold:

// A union of objects
{ x: string } | { x: number } equals allof { x: string || number }

// A cartesion product
[string, string] | [string, number] | [number, string] | [number, number] equals allof [string || number, string || number]

// A union of homogeneous arrays
string[] | number[] equals allof ((string || number)[])

// A union of objects with the same type for multiple properties
{ a: string, b: string } | { a: number, b: number } equals allof Record<'a' | 'b', string || number>

// An overloaded function
((a: string) => void) & ((a: number) => void) equals allof ((a: string || number) => void)

You can see that in that last example inside the function scope the parameter a has as its type a naked string || number. It is this naked form of || and && that solves the issues and unsoundnesses mentioned in the introduction. I refer to it as an "existential union/intersection", and to standard unions/intersections as "universal union/intersection". Whether this is entirely correct terminology I am not sure as I am not a type theoretic, but I will refer to them like this regularly in the rest of this issue.

My proposal is to implement the following type operators:

Note that this allof + || or && notation is simply an alternate notation (with its own limits) for the more general idea that these unions and intersections are type level loops, or maps rather (not a new idea, already used in distributive conditionals in a more rigid way):

// this
allof (string || number)[]
allof (string && number)[]

// could also be written as the following (pseudocode):
union where X ∈ {string, number} in X[]
intersection where X ∈ {string, number} in X[]

// or the following, which is really attractive because it uses the
// same syntax that mapped types use:
X in string | number as X[]
X in string & number as X[]

// though it would have to be extended to allow selection of the desired
// set operation if you want to remove ambiguity with single items:
X in union string & number as X[] // => (string & number)[]
X in intersection string | number as X[] // => (string | number)[]

I have chosen this specific allof notation in this issue because it merges the concept of the X in the above code and a standalone ||, which makes it seem less like the || and && types come from nowhere and helps with demonstrating the ideas presented here. However it also has some drawbacks, which I will address later.

The meaning of a standalone || or &&

The motivating problem that started my research is the following common issue encountered by users, which has been dubbed the "correspondence problem" by users in the TS community discord (not to be confused with the undecidable "post correspondence problem"):

type Obj =
    | { obj: { foo: string }, key: 'foo' }
    | { obj: { bar: string }, key: 'bar' }

declare const o: Obj

o.obj[o.key]

Users expect to be able to do this index operation, but it is not allowed. What happens is that TS sees o.obj as having type { foo: string } | { bar: string } and o.key as having type 'foo' | 'bar'. When the index occurs TypeScript does not realize that the types of both properties correspond to the same option in the union Obj, and tries to assign o.key to its contravariant complement: an intersection of the union's constituents. This fails.

The information we lost here is the location of the iteration point of Obj. TypeScript should never have taken o.key to be of type 'foo' | 'bar'. Instead its correct type is the type 'foo' || 'bar'.

So what does the type A || B mean as a standalone type? It is an existential type that corresponds to "either A or B", but you don't know which it is. It is a type that holds a choice that was made between A and B, but where you have forgotten which of the two was chosen. This is different from the union A | B. While A | B can be assigned with either A or B, A || B can either only be assigned an A, or only a B, it's just that you don't know which of the 2 options is correct. It signifies a choice that was made between the 2 somewhere in a higher scope (at the level of the corresponding allof, or the global scope if there is none), and in your current scope you don't know which of the 2 options was chosen so you have to consider both of them. It is almost like a type of multiple types, with the semantics that it is really only one of them. In that way, it has 2 different perspectives of looking at it:

It is important to understand that to local code this really is a unary type. It is either A or it is B, the fact that there are multiple types that it could be does not change the fact that it is only one of them. This also means that A | B cannot be assigned to A || B.

It is actually similar to a type variable, with the difference that a type variable is by definition unbounded on the subtype side, whereas these types are fully bounded. A || B is either A or B, not any of its subtypes. Whereas in type variables you make a choice from an infinite (constrained) set, in these types you make a choice from a finite set, and the choice between || or && adds some extra semantics to the way they can be handled.

There is an equivalent type for intersections: &&. A && B is a type that is both A and B, but in the local scope we are only looking at one of them. It is like a type representing a set of multiple types with intersection semantics. This one is harder to wrap your head around at first but it solves multiple soundness issues.

Contravariant positions swap between && and || just like with regular unions and intersections.

Converting between universal and existential at the type level

Existential -> Universal: allof

allof is a type operator that serves to create an iteration point at which descendant existential types are iterated, e.g.:

type Foo = allof {
    x: A || B
    y: C || D
}

// would be equivalent to
type Foo = { x: A, y: C } | { x: A, y: D } | { x: B, y: C } | { x: B, y: D }

type Bar = allof {
    x: A || B
    y: C && D
}

// would be equivalent to
type Bar = { x: A, y: C } & { x: A, y: D } | { x: B, y: C } & { x: B, y: D }

Existential type declarations are lexically scoped to their surrounding allof operator:

type Foo = allof {
    x: A || B
    y: allof {
        z: C || D
        w: E || F
    }
}

// is equivalent to
type Foo =
    | { x: A, y: { z: C, w: E } | { z: C, w: F } | { z: D, w: E } | { z: D, w: F } }
    | { x: B, y: { z: C, w: E } | { z: C, w: F } | { z: D, w: E } | { z: D, w: F } }

To retain soundness and referential transparency, it is important that inner allofs do not iterate over existentials that were declared outside their scope. Scoping is lexical:

type B<X> = allof { x: X, y: C || D }
type C = B<A || B>
//   ^? = { x: A || B, y: C } | { x: A || B, y: D }

This allof type operator is a useful construct to talk about the oneof operator and the behavior of the existential set operations in a symbolic way because it is a direct representation of the "iteration point". It could be implemented as described here, but it is not strictly required to implement the existential unions/intersections. It is simply an alternative way to generate distributive unions/intersections, and maybe not the most intuitive.

Universal -> Existential: oneof

oneof "picks" one of the types of a universal set operation, it creates a new instance of the existential variant of the input union or intersection:

type T = A | B
type A = oneof T
type B = oneof T

A and B here are 2 distinct, incompatible A || B types.

oneof is sort of the inverse operation of allof, but not entirely. You an think of it as lopping off a top level allof:

type A = oneof allof (A || B)
//   ^? = A || B

But with the caveat that A here is a new instance of A || B, it is not somehow reaching in to the inner A || B itself.

Typechecking existential unions and intersections

Existential type identity

An important property of these existential types is that each of them has its own identity, and they cannot be assigned to each other even if they have the same structure, just like type parameters with the same constraint can't be assigned to each other. If you have:

const a: A || B
const b: A || B

a = b // error!

The assignment will not work because there is no way to be sure that the choice made in typeof a is the same as the choice made in typeof b. However, if both variables have exactly the same type the assignment is allowed, as now we can be sure that the same choice was made:

const a: A || B
const b: typeof a

a = b // success!

"infectiousness" and tracking of choices

Another interesting property of these types is that, because they have lost their iteration point, they show "infectious" behavior. Types that include them in their type tree become themselves existential unions/intersections:

{ a: A || B } equals { a: A } || { a: B }

With one caveat: their choices are linked together. The above only works if in the latter type { a: A } is chosen when in the former one A is chosen, and analogously for B.

This shows a core difference between | and ||: while ({ a: A } | { a: B })['a'] should not return A | B, with the existential type we do have this equivalence: ({ a: A } || { a: B })['a'] is A || B, but with their choices linked together. This can be trivially shown: { a: A } || { a: B } is equivalent to { a: A || B }, and { a: A || B }['a'] trivially has the type A || B. Once we pass through the allof barrier we are in a context where all these types are equivalent, as long as their choices are linked.

This infectious behavior is the reason why allof works at any depth, like a form of scope. You can always rewrite the type below it in a way that puts the existentials at the top level.

It also has as a result that during things like typechecking, overload resolution etc. these types need to be iterated over on the top level (in the order of their corresponding allof scope nesting). This is especially important because if the same existential type occurs in multiple places it must pick the same type for every location. This is different from universally quantified unions and intersections, which in the type checker are iterated over when their location in the type tree is reached (because they have an iteration point), and thus don't have any interdependence between them.

Finding correspondence between types through this choice tracking

If we go back to the problem above, we can see that this inherent tracking of choices solves our problem. If we use || instead of |:

type Obj =
    || { obj: { foo: string }, key: 'foo' }
    || { obj: { bar: string }, key: 'bar' }

declare const o: Obj

o.obj[o.key]

Since the type of o.obj is { foo: string } || { bar: string } with its choices linked to the choice made in Obj and the type of o.key is 'foo' || 'bar' with its choices linked to the same choice, it can be worked out that this operation will always succeed! (I refer to this sometimes as "proving correspondence".)

Contravariant positions

Existential set operations in contravariant positions swap to their contravariant complement just like normal unions and intersections:

type Fn = allof ((a: string || number) => void)

// is equal to
type Fn = allof (
    && ((a: string) => void)
    && ((b: number) => void)
)

// is equal to
type Fn =
    & ((a: string) => void)
    & ((a: number) => void)

// is equal to
type Fn = {
    (a: string): void
    (a: number): void
}

Generalized subtyping behavior on naked existentials and how it naturally leads to mutation soundness

Comparing the subtyping behavior of universal and existential set operations, we find some interesting properties. First, as a reminder:

The subtyping behavior of |: A | B extends C iff A extends C AND B extends C C extends A | B iff C extends A OR C extends B so it's every extends some

The subtyping behavior of &: A & B extends C iff A extends C OR B extends C C extends A & B iff C extends A AND C extends B so it's some extends every

Now, for the existential variants, the thing here is that when we have a naked existential, we don't know which type was actually chosen. This limits what we can do with it if the choices between the 2 types being checked are not interlinked:

The subtyping behavior of ||: A || B extends C iff A extends C AND B extends C -> same as | C extends A || B iff C extends A AND C extends B (!) -> Since we don't know which one was chosen, we can only say it's a subtype if it's a subtype of both so it's every extends every

Note how this naturally produces the mutation behavior that we are currently using write types for, but in a more sound way and with more functionality: if the type of ({ a: A } | { a: B })['a'] is A || B, then there are only 2 options for us to assign to this type:

The subtyping behavior of &&: A && B extends C iff A extends C OR B extends C -> same as & C extends A && B iff C extends A AND C extends B (!) -> Since we don't know which one was chosen, we can only say it's a subtype if it's a subtype of both so it's some extends every

Again this naturally produces the desired mutation behavior: if the type of ({ a: A } & { a: B })['a'] is A && B, then there are only 2 options for us to assign to this type:

The difference between & and && is more subtle but note that

  1. We are allowed to assign only one of the 2 constituents to && if we can prove correspondence, which you can't do with universal intersections.
  2. Contravariant positions inside the &&'d type create existential unions, which resolve many unsound behaviors in the current implementation.

Some even more high level subtyping laws that hold:

A & B extends A && B extends A extends A | B
A & B extends A || B extends A | B

Whether A && B can really be assigned to A is not 100% sure yet and has to be sanity checked, but is probably true.

Automatic allof scoping for functions

One could wonder what a function definition with existentials in its signature would mean:

function fn(x: A || B) {
    // ...impl
}

Such a function signature is rather useless. It can only be called with either typeof x or A & B, but there is no way to create a value of type typeof x, so the signature could as well just be A & B.

However, existentials in the function signature are actually really useful. In fact they solve one of the original problem statements. They can be used to define overloaded functions that require the same choice in multiple places in the signature:

// the imaginary definition
allof function fn(x: A || B): typeof x {

}

// would have the type
interface Fn {
    (x: A): A
    (x: B): B
}

Because I'd prefer avoiding value level syntax though and because raw existential type parameters are useless anyway, all function types are automatically wrapped in an allof scope:

function fn(x: A || B): typeof x {

}

// would have the type
interface Fn {
    (x: A): A
    (x: B): B
}

Interactions in value context

Existential types interact with their universal counterparts in various ways whenever an allof border is crossed, which causes a kind of fan-out and fan-in behavior alongside linked choice matching. I have currently identified 4 critical junction points:

Passing an existential union/intersection to a parameter

Since the existential unions/intersections represent a unary type, they can be passed to overloaded functions if their existential semantics are met:

declare const fn1: {
    (a: string, b: string): void
    (a: number, b: number): void
}

function fn2(x: string || number) {
    fn1(x, x); // works
}

Iteration occurs at the top level, even before overload selection. Since both potential instantiations have an acceptable overload available, the call succeeds. Existential intersections would work the same, except they only require one of their instantiations to have a valid overload.

If the function returns different values per overload, these values are collected again into a linked existential (it may be linked to multiple other existentials if there were multiple existential inputs) with the same semantics as the input parameters:

declare const fn1: {
    (a: string, b: string): boolean
    (a: number, b: number): symbol
}

function fn2(x: string || number) {
    return fn1(x, x); // works
    //     ^? = boolean || symbol
}
// ^? = {
//          (a: string): boolean
//          (a: number): symbol
//      }

Existential intersections can (probably) be passed to parameters of a union of functions, as they only require one of their options to match (to be sanity checked):

declare const fn1: (
    | ((a: string, b: string): void)
    | ((a: number, b: number): void)
)

function fn2(x: string && number) {
    fn1(x, x); // works
}

Accesses

When looking at the examples above with the indexes on unions it may seem like an index operation generates a new existential instance. However this is actually not the case. It is the accessing of the union that generates the instance:

declare let a: { foo: A } | { foo: B }
const b = a     // <- a is read out here. this is an "access".
//    ^? = { foo: A } || { foo: B } which is equivalent to `{ foo: A || B }`

const c = b.foo // here we are just doing { foo: A || B }['foo'] which is trivially `A || B`,
                // with choice linked to `typeof b`

Think of it like this: if you have a value that was in a, then the actual real type of that value a has to be either { foo: A } or { foo: B }, it can't be both. Of course, if the variable was reassigned then the "real" type might change, because that variable itself is type |, so without any additional measures every access would have to generate a new existential instance. However, there is a way we can track the existential instance in the compiler: flow types.

Currently I am using flow types to set the initial flow type of any identifier with type A | B to an initial existential instance A || B. So the real type of any non-assignment reference to this identifier is now A || B. The control flow analysis tracks these types and since an assignment overrides the initial type with a new type, we can actually track the history of these types (within function boundaries of course):

type A = string | number | symbol

let a!: A 
let b!: A

const c = a;                 // we read out the value of a. the type of `c` is now
                             // `string || number || symbol`
let d: typeof b = b;         // we read out the value of b, and set d explicitly to the flow type
                             // of b, being a different instance of `string || number || symbol`
                             // this is because I also added a type widening rule that changes
                             // `string || number || symbol` back into `string | number | symbol`
                             // when it's assigned to a mutable variable, because otherwise it becomes
                             // confusing and unusable

if (typeof c === 'string') {
    c      // type of c here is `string`
} else {
    c      // type of c here is `number || symbol`,
           // where the choices are linked to the choices of `typeof c`
    d = c; // error! this assignment fails. typeof c is not the same type as typeof b,
           // and the choices are not linked.
}

As mentioned in the comments, keeping the flow type as the inferred type is fine for const variables, since || is a subtype of | and the only difference is in assignments, but for let declarations with an inferred existential type I widen it back to the universal version, to prevent confusion and to retain backwards compatibility in e.g. this simple case:

type A = string | number | symbol

let a!: A
let b = a
b = 5 // doesn't work

Tracking these flow types bridges | and ||, e.g. the following now works:

declare const fn1: {
    (a: string, b: string): void
    (a: number, b: number): void
}

function fn2(x: string | number) {
    fn1(x, x); // works

    const y = x
    x = 5
    fn1(x, y); // error
}

The following also works:

type Obj =
    | { obj: { foo: string }, key: 'foo' }
    | { obj: { bar: string }, key: 'bar' }

declare const o: Obj

o.obj[o.key]

Calling an existential union/intersection

This is rather simple to work out:

type Fn =
    && ((a: string) => void)
    && ((a: number) => void)

// is equivalent to
type Fn = (a: string || number) => void

// and can thus only be called with a fitting linked type or `string & number`

Unions:

type Fn =
    || ((a: string) => void)
    || ((a: number) => void)

// is equivalent to
type Fn = (a: string && number) => void

// and can thus only be called with a fitting linked type or `string & number`

Indexing with an existential union/intersection

This is a complex operation that needs to group the choices for the object type and the choices of the index type and run the check for each of these groups, then output a linked result type. It is essentially the same operation as with the function calls above.

type Obj =
    | { obj: { foo: string }, key: 'foo' }
    | { obj: { bar: string }, key: 'bar' }

declare const o: Obj

o.obj[o.key]

πŸ’» Use Cases

This suggestions fixes many issues that are all variants of the same underlying problem:

The correspondence problem

type Obj =
    | { obj: { foo: string }, key: 'foo' }
    | { obj: { bar: string }, key: 'bar' }

declare const o: Obj

o.obj[o.key]

With this suggestion, correspondence can be proven between o.obj and o.key and this operation no longer fails. Other manifestations of the same issues that are more common:

type Obj =
    | { obj: { foo: string }, key: 'foo' }
    | { obj: { bar: string }, key: 'bar' }

declare function fn2<K extends string, O extends Record<K, unknown>>(key: K, obj: O): void

function fn1(o: Obj) {
    fn2(o.key, o.obj)
}
type AProps = { a: string, b: string }
type BProps = { x: number, b: string }
type SwitchProps =
    | { Component: (props: AProps) => ReactNode, props: AProps }
    | { Component: (props: BProps) => ReactNode, props: BProps }

function SwitchComponent(Props: SwitchProps) {
    return <Props.Component {...Props.props}/>
}
const components = [
    { Component: Component1, props: { x: 5 } },
    { Component: Component2, props: { x: 'wegew' } },
]

function App() {
    return components.map(({ Component, props }) =>
        <Component {...props}/>
    )
}

Switch return problem

This problem stems from the inability to define a function with an overloaded signature without specifying an implementation signature. Often people try to do this:

type Circle = { type: 'circle', radius: number }
type Rectangle = { type: 'rectangle', width: number, height: number }
type Shape = Circle | Rectangle

function createFoo<S extends Shape>(type: S['type']): S {
    switch (type) {
        case 'circle': return { type: 'circle', radius: 0 }
        case 'rectangle': return { type: 'rectangle', width: 0, height: 0 }
    }
}

This fails because S is unbounded on the subtype side, you can't assign a concrete type to it. The problem here is that developers try to use a polymorphic variable because they want to select which type to return:

// creating this interface
type CreateFooFn = <S extends Shape>(type: S['type']): S

// while what they actually _want_ is this interface:
type CreateFooFn = {
    (type: 'circle'): Circle
    (type: 'rectangle'): Rectangle
}

With this suggestion this is possible though still rather clunky:

function createFoo(type: S['type']): Extract<Shape, { type: typeof type }> {
    switch (type) {
        case 'circle': return { type: 'circle', radius: 0 }
        case 'rectangle': return { type: 'rectangle', width: 0, height: 0 }
    }
}

If there were a hypothetical equals constraint this clunkiness goes away:

function createFoo<S equals Shape>(type: S['type']): S {
    switch (type) {
        case 'circle': return { type: 'circle', radius: 0 }
        case 'rectangle': return { type: 'rectangle', width: 0, height: 0 }
    }
}

However this is a different feature entirely.

Issues with mutations

Intersections of arrays

The following is unsound but currently allowed:

const x: { a: number }[] & { b: number }[] = []
x.push({ a: 4 })
x[0].b.toExponential() // boom

What happens here is the different push methods get combined into an overloaded function. With our suggestion this doesn't happen:

// push is instead of the type
type PushFn = ((...items: { a: number }[]) => number) && ((...items: { b: number }[]) => number)

// which is equivalent to
type PushFn = ((...items: { a: number }[] || { b: number }[]) => number)

// or
type PushFn = ((...items: ({ a: number } || { b: number })[]) => number)

So the push function must be passed a { a: number } || { b: number }, which is only possible to be satisfied with a type with fitting linked choices or with { a: number, b: number }.

Failures when through type-level operations the read type is used as a write type:

interface ColorTypes {
  BLUE: {
    source: boolean
    result: string
  }
  RED: {
    source: number
    result: boolean
  }
}

function doSomething<COLOR extends keyof ColorTypes>(
  color: COLOR,
  source: ColorTypes[COLOR]['source']
): ColorTypes[COLOR]['result'] {
  if (color === 'RED') {
    return 'not a boolean' // not as expected - should raise an error as string is not assignable to boolean
  }
}

const a = doSomething('BLUE', 2) // as expected - error since second parameter should be a boolean, not a number
const b: string = doSomething('RED', 2) // as expected - error since return type is a boolean, not a string

This occurs because ColorTypes[COLOR]['result'] returns the read type string | boolean instead of the write type string & boolean. With this suggestion the type would instead be string || boolean, which can both track the type choices and accept the more correct string & boolean.

A solution to #44373 without concessions

This problem is now worked around in the beta version of the compiler by silently coercing to the supertype (Fizz | Buzz)[].

interface Fizz {
    id: number;
    fizz: string;
}

interface Buzz {
    id: number;
    buzz: string;
}

([] as Fizz[] | Buzz[]).filter(item => item.id < 5);

This can cause unexpected errors:

interface A {
    foo: string[] | number[]
}

declare const a: A
declare const b: A

b.foo = a.foo.filter(x => x === 'etg') // error

With our suggestion this coercion is not needed:

// the new type of .filter (leaving out the thisarg parameter for brevity):
type FilterFn =
    || ((predicate: (value: string, index: number, array: string[]) => boolean) => string[])
    || ((predicate: (value: number, index: number, array: number[]) => boolean) => number[])

// equal to (the choices here are linked)
type FilterFn = (predicate: (
    && (value: string, index: number, array: string[]) => boolean
    && (value: number, index: number, array: number[]) => boolean
)) => (string || number)[]

// equal to
type FilterFn = (predicate: (value: string || number, index: number, array: (string || number)[]) => boolean) => (string || number)[]

Another way to derive this:

type Arr = string[] | number[]
type Arr = allof (string || number)[]

type FilterFn = (predicate: (value: string || number, index: number, array: (string || number)[]) => boolean) => (string || number)[]

Note that we are also kind of coercing, because an access creates an existential instance, but to the subtype side, so the resulting type can still be used as usual.

Typing constants like environment variables

One could, as an example, type something like this:

declare namespace NodeJS {
    interface ProcessEnv {
        NODE_ENV: 'dev' || 'test' || 'prod'
    }
}

const createDb = (dbName: `${typeof process.env.NODE_ENV}-${string}`) => {}

Now createDb can only be called with a database name that matches the environment passed in through NODE_ENV, whereas the type ${'dev' | 'test' | 'prod'}-${string} would allow it to be called with any valid environment.

RyanCavanaugh commented 1 year ago

I'm really not understanding how the site-specific identity is supposed to work. Looking at this function

type Fn =
    || ((a: string) => void)
    || ((a: number) => void)

// is equivalent to
type Fn = (a: string && number) => void

How do you get a value suitable to call an Fn ? How do you name that type? What's the .d.ts output of this function?

function fn(arg: () => string || () => number) {
  return arg();
}
// declaration emit is...
function fn(arg: () => string || () => number): ????? <- ?
angryzor commented 1 year ago

Looking at declaration emit is indeed something that is still a TODO on my list. I think the only fully backwards compatible method would be to only allow exports of types that are fully closed over. As soon as a || or && leaks to a declaration file it would cause breaking issues for previous compiler versions.

In general I cannot immediately think of a use case for a naked || or && type in an exported symbol in the first place.

RyanCavanaugh commented 1 year ago

Compat isn't a big deal IMO; if you use newer language features then those features may appear in the corresponding .d.ts. This has always been the case.

angryzor commented 1 year ago

Addressing your question on how to call that function: In essence, you can only call that function with a type that originates from the same source, e.g.:

type A = 
    || { x: string, f: (a: string) => void }
    || { x: number, f: (a: number) => void }

declare const a: A

a.f(a.x)
// `a.f` here has the type `(a: string && number) => void`,
// but the choice here is linked to the same source as the choice of `a.x`, namely A
// So the checker iterates at top level, choosing linked options together:
//  > pass `string` to `(a: string) => void`. can string be assigned to string ? yes
//  > pass `number` to `(a: number) => void`. can number be assigned to number ? yes

However, you can also call this function with a parameter that is valid for both options, namely string & number or more specific.

How to refer to these types is by using their full reference, which for simple references could be either calculated or kept in the existential analog of the origin type on unions. e.g. the type of a.f above is unique and can only be referred to as typeof a.f. However that is of course the simple case. How to refer to types that are the result of more complex operations such as the result of a function call as you suggest I still have to research more and will not be trivial.

But in your specific example case it is not complex:

function fn(arg: () => string || () => number) {
  return arg();
}
// declaration emit is...
function fn(arg: () => string || () => number): ReturnType<typeof arg>

The idea is exactly that || and && types are never dereferenced, because this is what causes the correspondence problem. They can be dereferenced under the hood because their choice links are tracked, but the verbalization of the type is always deferred as much as possible, showing the same behavior as type parameters currently do. Taking the correspondence problem example:

type Obj =
    | { obj: { foo: string }, key: 'foo' }
    | { obj: { bar: string }, key: 'bar' }

declare const o: Obj

const x = o.obj[o.key]

The type of x here is (typeof o.obj)[typeof o.key].

In fact in my first attempt at implementing this I just built up a chain of lazy IndexedAccessTypes in the compiler, but eventually I found that the concept fit better in the architecture of the compiler if I actually created unique instances of real types with their own unique resolvedTypes and tracked the links between them (to be checked what the memory impact is and whether it can be optimized).

fatcerberus commented 1 year ago

In essence, you can only call that function with a type that originates from the same source

Per the code example(s) I've seen in this thread, this all looks an awful lot like https://github.com/microsoft/TypeScript/issues/30581, fwiw.

angryzor commented 1 year ago

In essence, you can only call that function with a type that originates from the same source

Per the code example(s) I've seen in this thread, this all looks an awful lot like #30581, fwiw.

Interesting. This is indeed the correspondence problem or very similar at least. It looks like the PR that closed that ticket ( #47109 ) solved the correspondence problem for a specific encoding (outlined in the PR description) that explicitly stores the correspondence in a record type (essentially explicitly storing the "links between choices" that I am tracking).

I guess my suggestion is attempting to find a more general solution by having the type system manage these links, while also addressing other inconsistencies and other requested features, like being able to call an overloaded function with a union.

kyledevans commented 1 year ago

This sounds like it would be extremely useful for React developers (not sure about other frameworks) when trying to implement polymorphic components. I recently spent some time trying to work out how to accomplish this, and the best I could come up with was ugly and only produced a component type signature that worked 1 level deep. So any complex types (such as a parameter list in a function) would be converted to "any".

If it's at all helpful, here's some relevant links: