microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
https://www.typescriptlang.org
Apache License 2.0
100.29k stars 12.39k forks source link

Types `number` and explicitly constrained `T extends unknown` shouldn't be comparable #32814

Open DanielRosenwasser opened 5 years ago

DanielRosenwasser commented 5 years ago
const num = 1;
function check<T extends unknown>(x: T) {
  return x === num;
}
check(num);

Expected: Error: This condition will always return 'false' since the types 'T' and '1' have no overlap. Actual: No error.

Contrast this with the following example from #32768.

const num = 1;
function check<T>(x: T) {
  return x === num;
}
check(num);
AnyhowStep commented 5 years ago

I would expect both the explicit and implicit extends unknown to be comparable to 1.

It's sound that if x === num, then you can narrow x to 1. Is there a reason to not want to allow that behaviour?


If it's disallowed, does that mean the code would then have to be,

const num = 1;
function check<T extends unknown>(x: T) {
  return typeof x === "number" && x === num;
}
check(num);

to work?

ilogico commented 5 years ago

I suppose the following shouldn't be allowed either:

function check<T extends string | number>(x: T) {
  return x === 1;
}

The problem is not specific to extends unknown.

davidje13 commented 5 years ago

This seems backwards to me; it's clear that "This condition will always return 'false'" is a lie. In the example, it will return true!

The current <T extends unknown> behaviour is the correct one. The <T> behaviour is wrong.

jack-williams commented 5 years ago

@davidje13 The problem with that approach is that any two unrelated type parameters become comparable, which is a bug in many cases.

I'll probably wait on the design meeting before looking more into this --- it's not quite clear what the new behaviour should be. The code-path I linked on the other issue is the one influencing the unconstrained case, which means that the implicit constraint of T in <T> is the empty object type.

There may be more options I don't know about.

DanielRosenwasser commented 5 years ago

For what it's worth, there was a suggestion that "any two type parameters should be strictly rejected when comparing them", but the fact that that wouldn't cover keyof T being comparable to T is part of why that's not sufficient (because it's still surprising).

davidje13 commented 5 years ago

any two unrelated type parameters become comparable, which is a bug in many cases

It feels like you're trying to shoehorn code quality checks into a type system. If you make it behave this way, you will be breaking the correctness of the type checking and you'll be chasing weird edge cases throughout the system (not to mention forcing the use of any upon anybody who actually needs to perform these kinds of comparison). While this change may find the specific bad style of comparing 2 different generic variables, it is actually wrong from a types perspective.

To be clear, it is obvious that function isSame<T, U>(t: T, u: U) { return t === u; } can return true or false. The comparison is legitimate, and has potential value. From the type system's perspective, this is perfectly valid code (although, apparently, not right now unless you explicitly add extends unknown).

Of course, TypeScript should catch function foo<T, U>(t: T, u: U): U { return t; }. Because although T's unknown does overlap with U's unknown, neither extends the other. Comparison will always be possible between them, but assignment is not possible in either direction without further type narrowing. Note that when using extends unknown, the type system is already able to achieve this correctly.

jack-williams commented 5 years ago

It feels like you're trying to shoehorn code quality checks into a type system. If you make it behave this way, you will be breaking the correctness of the type checking and you'll be chasing weird edge cases throughout the system (not to mention forcing the use of any upon anybody who actually needs to perform these kinds of comparison).

I don't know what breaking the correctness of type checking means in this context, but it would not force users to use any. All values are comparable at type unknown.

function compare(x: unknown, y: unknown) {
    return x === y;
}

function test<T,U>(t: T, u: U) {
    if (compare(t,u)) {

    }
}

While this change may find the specific bad style of comparing 2 different generic variables, it is actually wrong from a types perspective.

There are plenty of languages that do not let you compare arbitrary generic types, so I'm not sure what you mean by 'wrong'.

davidje13 commented 5 years ago

@jack-williams see the original issue #32768 for my actual use-case which kicked this off.

There's clearly some miscommunication here because both of us think our view is the only possible logical interpretation. So to maybe clarify my perspective:

function compare<T, U>(t: T, u: U) {
  return t === u;
}

What we know before the comparison:

What we additionally know if the comparison is true:

What we additionally know if the comparison is false:

Clearly the false case is possible, and I don't think that's disputed.

The true case has the following combined type assertions:

Which resolves to saying that ∃ S : S ⊆ (TU), where S ≠ ∅ (above, S would be typeof t or typeof u). Or more simply, (TU) ≠ ∅

At this point, we do not have enough information to formally decide this one way or the other; (TU) might be the empty set, or it might not. Therefore, this branch might be possible, or might not. The undecidability means that we cannot possibly claim it "will always return 'false'", though that in itself could be fixed by just changing the error wording.

What I mean by "breaking the correctness of the type checking" is exactly this; if we choose to assume undecidable states cannot happen, we are no-longer modelling how the program can actually operate (i.e. the real behaviour it may exhibit). The only way to continue modelling how the program can actually operate is by assuming that undecidable states can happen, until we have enough information to prove that they cannot happen (at which point they cease being undecidable).

To hammer the point home, consider the closely related code:

function compare(t: string | number, u: string | number) {
  return t === u;
}

Once again, typeof t and typeof u might have no overlap (i.e. one may be a string and the other a number), but clearly we want to allow the comparison (and currently do allow it). The only difference here is the level of indirection; in this example the uncertainty is with (typeof ttypeof u) ≟ ∅, whereas before it was a level higher. Going a level deeper again, we start looking at literal types and find ourselves defining the === operator itself as (typeof ttypeof u) ≠ ∅ (i.e. it is defined as a runtime test for the region of undecidability). Doing differently at higher levels of indirection introduces an asymmetry in the type system.

RyanCavanaugh commented 5 years ago

Useful, consistent, correct: Pick any two 😅

RyanCavanaugh commented 5 years ago

To be clear, it is obvious that function isSame<T, U>(t: T, u: U) { return t === u; } can return true or false. The comparison is legitimate, and has potential value.

An additional desirable symmetry is that a generic function should be type checked such that an error is issued if any concrete version of the function would have errored. In this case, we can easily construct an erroring instatiation isSame<string, number>:

function isSame(t: string, u: number) { return t === u; }

Moreover, you have many options when writing isSame to decide how you want both the caller and the implementation to be checked. If you want to take any two values and compare them freely in the body, unknown, unknown is the proper signature. Choosing to write this function with type parameters should imply some other behavior because providing different behavior is why different syntax exists.

AnyhowStep commented 5 years ago

error is issued if any concrete version of the function would have errored.

That's actually a really good argument. And now that I'm thinking about it that way, I'm having trouble coming up with an example where T extends unknown should allow arbitrary comparisons.

The original issue's intent would have been better written as,

const isNum = (v: unknown) => {
  if (typeof v !== 'number') throw new Error('Not a number');
  return v;
};
const isString = (v: unknown) => {
  if (typeof v !== 'string') throw new Error('Not a string');
  return v;
};

function makeExampleValue<T extends number|string>(validator: (v: unknown) => T): T {
  if (validator === isNum) return 42 as T;
  if (validator === isString) return 'something' as T;
  //Not really "unknown type".
  //More like, "unknown validator", since we're doing function comparisons
  throw new Error('unknown type');
}

Playground

AnyhowStep commented 5 years ago

Is the following behaviour intentional?


function check<T extends string|number> (t : T) {
    //This is okay, even though the instantiation of `T`
    //with `string` should make it non-comparable.
    //But becaues we are comparing a generic to a non-generic,
    //we seem to use the constraint `string|number`
    //and check if it is comparable to the non-generic `number`
    return t === 1;
}
function check2<T extends string|number, U extends number> (t : T, u : U) {
    //Doesn't work, though.
    //Cannot compare generic type `T` and `U`
    //because `T` could be `string`
    //and `U` could be `number`
    return t === u;
}

Playground

[Edit] Changed U extends string|number to U extends number


[Edit] The error message could be more descriptive,

This condition will always return 'false' since the types 'T' and 'U' have no overlap.

Should be more like,

This condition may always return 'false' since the types 'T' and 'U' may have no overlap.

Would be even better if it could say,

This condition may always return 'false' since the types 'T' and 'U' may have no overlap.

T may be string and U may be number, which do not overlap.

fatcerberus commented 5 years ago

From a purely theoretical standpoint:

function foo<T>(x: T): boolean
{
    return x === 1;  // should this be an error?
}

foo is universally quantified over T. i.e., for all types T, there is a function foo which accepts a value of type T and returns a boolean.

However, inside the function, T is existentially quantified: the implementation can be written assuming it represents a single concrete--but unknown--type. Therefore it's tempting to also assume the comparibility relation (the question of whether x === 1 is kosher) should satisfy the logical statement:

∃T, Comparible(T, number)

But this is not necessarily sound. Because the function as a whole is universally quantified, x === 1 can't be assumed to be sound until you've satisfied the stronger constraint of:

∀T, Comparible(T, number)

Which seems to be what @RyanCavanaugh is getting at above (minus the nerdiness 🤓).

davidje13 commented 5 years ago

@RyanCavanaugh I understand your viewpoint but I disagree. isSame<string, number> is a specific use-case of isSame where, yes, the branch would be useless. But that doesn't mean there are no occurrences of isSame<string, string> elsewhere in the code where the branch is useful.

@fatcerberus that is true for assignments or casts, etc. but not for comparison. The equality operator is, after all, able to compare anything with anything else, so even 1 === 'hello' is ok at runtime. That's why this becomes a question of reachability rather than type comparability (with comparability used as a stand-in for determining reachability), hence the observed (incorrect) error message talking about "always false".

fatcerberus commented 5 years ago

@davidje13 Just for the record: image

So by your argument, the above illustrated error needs to be removed too.

davidje13 commented 5 years ago

@fatcerberus no that's not what I'm saying (I updated my comment to add "at runtime" just as you posted that to clarify that exact point)

My point is that type-wise, the check is fine, but in that case the error is 100% correct about the reachability. It can be proven at compile time that the check will, guaranteed, be false every time it runs. That makes the check pointless, and TypeScript is correct to produce an error.

My comment about 1 === 'hello' being ok at runtime was just to point out that the operator itself has no qualms about the types of its inputs (i.e. it won't throw if given incomparable types).

The discussion here is about what to do when it cannot be determined at compile time whether it will always be true, or always be false

jack-williams commented 5 years ago

@davidje13

FWIW I hadn't really expressed my view on the matter, I was just trying to lay out the consequences of the alternatives.

What I mean by "breaking the correctness of the type checking" is exactly this; if we choose to assume undecidable states cannot happen, we are no-longer modelling how the program can actually operate (i.e. the real behaviour it may exhibit). The only way to continue modelling how the program can actually operate is by assuming that undecidable states can happen, until we have enough information to prove that they cannot happen (at which point they cease being undecidable).

If I understand your description correctly you are describing completeness of type-checking, but this is not a property that most type-checkers ever try to satisfy---so I disagree that this is breaking any correctness criteria that TypeScript should ever realistically aspire to.

fatcerberus commented 5 years ago

I guess what I most don't understand is why these functions, as written, are generic. Maybe it's just the danger of simplified repros, but since T isn't actually used for anything other than the type of the argument, I don't see why the functions in question can't just be written:

function foo(x: unknown): boolean
{
    return x === 1;  // no error
}

function isSame(x: unknown, y: unknown): boolean
{
    return x === y;  // also no error
}

Since you only care about equality here and not exactly what the types involved are, there's no reason to take a type parameter at all. You can use the constraint directly, whatever it happens to be. Generics are mainly for when you need to use the type parameter to more precisely type something else (such as the return value), which doesn't seem to be happening here.

Of course it's probably just that the real case is more complex and really does need to be generic, in which case this post would just be noise. If so I apologize--just want to cover all the bases.

AnyhowStep commented 5 years ago

@davidje13

The problem here is an issue of "caller" vs "implementer" and what you know vs what you don't know.


To the caller, <string, number> is a perfectly valid instantiation because the function implementation is treated as a black box. It doesn't know the the implementation is trying to compare the types.

If it knew, it would give you errors, too.


To the implementer, <string, number> is an invalid instantiation because if it is always instantiated with <string, number>, then the types are not comparable because there is no overlap.

It doesn't know that <string, number> will never be instantiated. It just knows that it is possible, and that if it is instantiated, there's something wrong.


That isSame would have been better as,

function isSame<T, U extends T>(t: T, u: U) {
  return t === u;
}

This way, it doesn't matter what T and U are, there will always be overlap (because U is a subtype of T)

davidje13 commented 5 years ago

I think this will be my last comment on this issue, because it doesn't affect me too strongly (I can always use as any to get around any errors introduced), but to give a more concrete (but still contrived) example, since I think a lot of people are getting hung-up on the simplicity of isSame:

function mapReversed<T, U>(source: T[], target: U[], map: (t: T) => U): void {
  const count = source.length;
  if (target.length !== count) {
    throw new Error('target length mismatch');
  }
  if (source === target) {
    // cannot reverse in-place, so perform extra steps
    // (uses more memory, builds up garbage for the gc to handle, and is slower, but guarantees correctness)
    const mapped = source.map(map);
    for (let i = 0; i < count; i++) {
      target[count - i - 1] = mapped[i];
    }
  } else {
    // reversing in-place is fine; use optimised code (no garbage generated for gc)
    for (let i = 0; i < count; i++) {
      target[count - i - 1] = map(source[i]);
    }
  }
}
fatcerberus commented 5 years ago

Ah right, it’s an optimization in a larger generic function. Yeah, optimizations like that are generally one case where you have to go over the type system’s head at some point—speaking from experience.

Probably a case for // @ts-ignore (which I would prefer over an any cast here).

AnyhowStep commented 5 years ago

~And how would you call mapReversed<string, number>(myStringArray, myStringArray, toNumber)?~

[Edit] Disregard this, there's no sane reason to ever want to call the above. Whatever @fatcerberus said.

RyanCavanaugh commented 5 years ago

Everyone's raised some good points and I really appreciate the discussion here. 👍

MicahZoltu commented 4 years ago

I think this is the same problem (more specifically, the same as #32768 which is a duplicate of this):

function isEmpty<T>(thing: T, key: keyof T) {
    // This condition will always return 'false' since the types 'T[keyof T]' and 'string' have no overlap.
    return thing[key] === ''
}

Unfortunately, the workaround (which it sounds like is a bug) of T extends unknown doesn't work here.