instanceof narrowing should preserve generic types from super to child type

Nathan-Fenner commented 5 years ago

Search Terms

instanceof generic
narrow instanceof

Suggestion

The following code has a rather unexpected behavior today:

class Parent<T> {
    x: T;
}
class Child<S> extends Parent<S> {
    y: S;
}

function example(obj: Parent<number>) {
    if (obj instanceof Child) {
        const child = obj;
    }
}

The narrowed type that child gets is Child<any>.

I'm requesting that the type instead gets narrowed to Child<number>, which is what I originally assumed would happen.

My proposal is, in general, to infer the type arguments to the narrowed-to class type whenever they can be determined from the original type.

Precise Behavior

Consider the following code today:

const x: P = ...;
if (x instanceof C) {
    x;
}

Today, if C is a generic class, x gets the narrowed type C<any, any, any, ...>; otherwise it just gets the type C.

My proposal is to consider the following piece of code, and use it to inform

Imagine that C has a no-argument constructor. Then the following piece of code is valid today:

function onlyP(c: P) { ... }

onlyC(new C());

in order to make it valid, the compiler infers type arguments for C that make it into a subtype of P. My proposal is to use the same strategy to infer the type arguments for C in an instanceof narrowing.

Consider the following examples today, and their corresponding instanceof narrowings:

function ex1(x: Parent<string>) { }
ex1(new Child()); // inferred arguments: <string>

function ex2(x: Parent<number | string>) { }
ex2(new Child()); // inferred arguments: <number | string>

function ex3(x: Parent<number> | string) { }
ex3(new Child()); // inferred arguments: <number>

function ex4(x: Parent<number> | Parent<string>) { }
ex4(new Child()); // inferred arguments: Child<number | string>
// Note: the above errors, because it Child<number|string> actually fails to be a subtype of
// Parent<number> | Parent<string>.
// We can either choose to infer Child<any> in this case, or use the (incorrect, but more-precise)
// inference Child<number | string>.

Examples

The original use-case I had in mind was roughly the following:

abstract class Obtainer<T> {
  __phantom: T = null as T;
}

abstract class Fetcher<T> extends Obtainer<T> {
  public abstract fetch(): Promise<T>;
}

abstract class Dependency<T, D> extends Obtainer<T> {
  public abstract dependencies(): Obtainer<D>;
  public abstract compute(got: D): T;
}

async function obtain<T>(obtainer: Obtainer<T>): Promise<T> {
  if (obtainer instanceof Fetcher) {
    // obtainer: Fetcher<T>
    // currently, it's a Fetcher<any>
    return await obtainer.fetch();
  }
  if (obtainer instanceof Dependency) {
    // obtainer: Dependency<T, any>
    // currently, it's a Dependency<any, any>
    const dependencies = obtainer.dependencies();
    return obtainer.compute(await obtain(dependencies));
  }
  throw new Error("not implemented");
}

(note: there's still one extraneous any in the above, since the D parameter cannot be inferred)

Checklist

My suggestion meets these guidelines:

[ ] This wouldn't be a breaking change in existing TypeScript/JavaScript code
- This is a breaking change. However, I think it could only really break code that was already broken. If really necessary, it could be put behind a new --strictGenericNarrowing flag.
[x] This wouldn't change the runtime behavior of existing JavaScript code
- No change to runtime; just inferred generic types.
[x] This could be implemented without emitting different JS based on the types of the expressions
[ x This isn't a runtime feature (e.g. library functionality, non-ECMAScript syntax with JavaScript output, etc.)
[x] This feature would agree with the rest of TypeScript's Design Goals.
- I think this is an easy win for a more sound type system without any negative impact

Also, I am interested in contributing this change if it's approved!

weswigham commented 5 years ago

In general, you can't just float a generic parameter from a supertype to a subtype arbitrarily (not based on type argument position, anyway), however we might be able to reuse inference to do better here, by inferring from the supertype to the subtype and using that as the narrowed value. This could still leave some arguments unfulfilled, as in your example, which would still need to fall back to any/their constraints.

Nathan-Fenner commented 5 years ago

In general, you can't just float a generic parameter from a supertype to a subtype arbitrarily (not based on type argument position, anyway), however we might be able to reuse inference to do better here, by inferring from the supertype to the subtype and using that as the narrowed value.

Yes, this is specifically what I want to do, especially since it reuses (and therefore could potentially benefit from future improvements to) the regular subtype inference implementation.

alexgorbatchev commented 5 years ago

I think this should be fixed as proposed and put behind a flag until 4.x

thw0rted commented 5 years ago

Why would it make sense to assume that the supertype and subtype take the same generic arguments? It might be a relatively common pattern but it's no more required than for a class to have the same constructor arguments as its parent.

Nathan-Fenner commented 5 years ago

@thw0rted please read the existing conversation (and see the provided implementation)

When x: T, in order to narrow

if (x instanceof SomeClass) {
    // ...
}

when SomeClass is generic, with parameters SomeClass<A, B, C>, the goal is to infer types A, B, and C as wide as possible such that SomeClass<A, B, C> remains a subtype of T.

The fact that they're often the same as the parent is a common case, but all others are also handled by the above.

thw0rted commented 5 years ago

It took me a few tries but I think I get it now. It's not a matter of blinding carrying parameters over, it's a matter of seeing that the child's generic type parameters are used as the generic type values for the parent class. I just hadn't run into this specific issue previously, though it's probably related to #17473, which is how I got here. I wonder if the proposed fix can do anything to help my issue, where generic parameters should be constrained by the declaration's extends keyword?

millsp commented 4 years ago

I got a working functional implementation of this https://tsplay.dev/KwXKJW

update

Nathan-Fenner commented 4 years ago

@millsp see #30161 for an overview of why the existing type system is not quite expressive enough to actually solve this. In particular, the inferred types that the linked instanceOf helper produces are not precise enough to be useful. Here are a few test cases demonstrating its limitations:

class Parent<T> {
  t!: T; 
}

class Child<T> extends Parent<T> {
  a!: T;
  b!: T;
}

function example(p: Parent<number>) {
  if (p instanceof Child) {
    // p is now known to be a Parent<number>, and also a Child<T> for some T.
    // Which T could be a valid substitution? We know that T <: number,
    // but nothing else. So the "correct" type to ascribe to p is now

    // p: exists<T extends number> Child<T>

    // which can also be thought of as a union over all possible T extending number, which is equivalent
    // in this case to Child<number>.

    p; // should be Child<number>
  }
}

Similarly, in the other way, if we have

class Parent<A, B> {
  a!: A;
  b!: B;
}

class Child<T> extends Parent<T, T> {
  t!: T;
}

function example1(p: Parent<number, number>) {
  if (p instanceof Child) {
    p; // should be Parent<number, number> & Child<number>
  }
}

function example2(p: Parent<number, string>) {
  if (p instanceof Child) {
    p; // should be Parent<number, string> & Child<string | number>
  }
}

Extra precision is obtained from the constraint that the type parameters for the subclass type must be chosen such that the instance remains an inhabitant of the static supertype (e.g. in example1, it could not possibly be that p.t === false, since that would require that it was a Child<boolean | Something>, and that would mean that it's also a Parent<boolean | Something, boolean | Something>. But we know from example1's argument signature that it's not, and therefore the child cannot be either.

Note that in all three examples, the linked instanceOf function assigns the wrong type to the narrowed p, using unknown where a more-precise (but still sound) parameter should be used instead.

This probably can probably only be resolved neatly with the introduction of an existential type operator exists<T extends C> S that implicitly sums over all possible T satisfying the constraint.

At the very least, handling instanceof narrowing in a general, precise, and sound manner would effectively require implementing such a type, even if it was "lowered" down to TypeScript's slightly-less-expressive type syntax before being exposed to the rest of inference/user types.

thw0rted commented 4 years ago

I'm just trying to follow along from the cheap seats, but shouldn't the type of p inside the guard in example2 actually be never? If the generic type values passed to Parent in a Child must be the same, and the formal parameter for the function has them as different (non-overlapping), that's enough information for a "perfect" type checker to determine that p can never be an instance of Child.

Nathan-Fenner commented 4 years ago

@thw0rted I agree that it's very counterintuitive, but it turns out that the types do overlap in this case. The reason is that there appears to be a chain of legal subtype casts that make this happen.

Here's an example of an instance of Child, stripping away the prototype, constructor, methods, etc., which could affect variance but we'll pretend everything is largely covariant here: { t: 'hi', a: 'bye', b: 5 }. This clearly "is" both a Child<string | number> and also a Parent<string, number> and also a Parent<string | number, string | number>, because it has all of the properties that each of those types require, which is all we really mean when we say "x has type T".

It also turns out in this case that the most general sound type we can assign says exactly that: it is both a Child<string|number> and also a Parent<string, number>, so it is a Child<string|number> & Parent<string, number>. The type annotations tell us what we're allowed to do with an object; so there's not always some "best" type (e.g. is a {x: 4, y: "hi"} an {x:number,y:string} or is it a {x:4,y:"hi"} or is a {x:string|number,y:string|number}? The answer is: it's essentially all of them at once. The annotation tells you what you can do with it and what you can expect from it, but not what it is.

The benefits of a fully general existential type approach is that you'd be able to largely side-step these complexities. You'd immediately be able to narrow it down to something like Parent<string, number> & exists<T where Child<T> extends Parent<string, number>> Child<T> which is crude and complicated looking, but definitely correct.

Then you'd have the task of trying to simplify away that scary-looking type, which hopefully could be built on some nice theory instead of ad-hoc patches to inference.

millsp commented 4 years ago

This is also called excess properties in the docs.

thw0rted commented 4 years ago

Thanks Nathan, I hadn't considered the case of a union type for T but in hindsight it's obvious.

microsoft / TypeScript