microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
https://www.typescriptlang.org
Apache License 2.0

[Proposal] Type assertion statement (type cast) at block-scope level #10421

Open yahiko00 opened 8 years ago

yahiko00 commented 8 years ago

This is a proposal to simplify how we deal with type guards in TypeScript when we want to enforce type inference.

The use case is the following. Let us assume we have dozens (and dozens) of interfaces like the following:

Code

interface AARect {
    x: number; // top left corner
    y: number; // top left corner
    width: number;
    height: number;
}

interface AABox {
    x: number; // center
    y: number; // center
    halfDimX: number;
    halfDimY: number;
}

interface Circle {
    x: number; // center
    y: number; // center
    radius: number;
}

// And much more...

And we have a union type like this one:

type Geometry = AARect | AABox | Circle; // ... And much more

It is quite easy to discriminate a type from another with hasOwnProperty or the in keyword:

function processGeometry(obj: Geometry): void {
    if ("width" in obj) {
        let width = (obj as AARect).width;
        // ...
    }
    if ("halfDimX" in obj) {
        let halfDimX = (obj as AABox).halfDimX;
        // ...
    }
    else if ("radius" in obj) {
        let radius = (obj as Circle).radius;
        // ...
    }
    // And much more...
}

But, as we can see, this is quite burdensome when we need to manipulate obj inside each if block, since we need to type cast each time we use obj.

A first way to mitigate this issue would be to create a helper variable like this:

    if ("width" in obj) {
        let helpObj = obj as AARect;
        let width = helpObj.width;
        // ...
    }

But this is not really satisfying since it creates an artefact we will find in the emitted JavaScript file, which is here just for the sake of the type inference.

So another solution could be to use user-defined type guard functions:

function isAARect(obj: Geometry): obj is AARect {
    return "width" in obj;
}

function isAABox(obj: Geometry): obj is AABox {
    return "halfDimX" in obj;
}

function isCircle(obj: Geometry): obj is Circle {
    return "radius" in obj;
}

// And much more...

function processGeometry(obj: Geometry): void {
    if (isAARect(obj)) {
        let width = obj.width;
        // ...
    }
    if (isAABox(obj)) {
        let halfDimX = obj.halfDimX;
        // ...
    }
    else if (isCircle(obj)) {
        let radius = obj.radius;
        // ...
    }
    // And much more...
}

But again, I find this solution not really satisfying, since it still creates persistent helper functions just for the sake of type inference, and it can be overkill in situations where we do not often need to perform type guards.

So, my proposal is to introduce a new syntax in order to force the type of an identifier at a block-scope level.

function processGeometry(obj: Geometry): void {
    if ("width" in obj) {
        assume obj is AARect;
        let width = obj.width;
        // ...
    }
    if ("halfDimX" in obj) {
        assume obj is AABox;
        let halfDimX = obj.halfDimX;
        // ...
    }
    else if ("radius" in obj) {
        assume obj is Circle;
        let radius = obj.radius;
        // ...
    }
    // And much more...
}

Above, the syntax assume <identifier> is <type> tells the type checker that, inside the block and after this annotation, <identifier> has to be considered as <type>. No need to type cast any more. This approach has the advantage over the previous techniques of not generating any code in the emitted JavaScript. And in my opinion, it is less tedious than creating dedicated helper functions.

This syntax can be simplified or changed. For instance, we could just have <identifier> is <type> without a new keyword assume, but I am unsure this would be compliant with the current grammar and design goals of the TypeScript team. Nevertheless, whatever refinements are made, I think the general idea is relevant for making TypeScript clearer, less verbose in the source code and in the final JavaScript, and less tedious to write when we have to deal with union types.

DanielRosenwasser commented 8 years ago

Technically we could just consider in as a form of type guards.

But I can still imagine something that says "assume the type of this entity is so-and-so for this block".
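
A minimal sketch of what treating in as a type guard looks like in practice. TypeScript 2.7 and later narrow unions on in checks; the area function and its formulas are illustrative assumptions, not code from this thread.

```typescript
interface AARect { x: number; y: number; width: number; height: number; }
interface AABox { x: number; y: number; halfDimX: number; halfDimY: number; }
interface Circle { x: number; y: number; radius: number; }
type Geometry = AARect | AABox | Circle;

// Hypothetical helper: computes an area without any cast.
function area(obj: Geometry): number {
    if ("width" in obj) {
        // Only AARect declares `width`, so obj is narrowed to AARect here.
        return obj.width * obj.height;
    }
    if ("halfDimX" in obj) {
        // Only AABox declares `halfDimX`, so obj is narrowed to AABox here.
        return 4 * obj.halfDimX * obj.halfDimY;
    }
    // The only remaining member of the union is Circle.
    return Math.PI * obj.radius * obj.radius;
}
```

Nothing extra is emitted for the narrowing itself: the in checks are the same ones the hand-written version already performs at runtime.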

yahiko00 commented 8 years ago

It would be nice if the in keyword could be a form of type guard. However, I am unsure it would be easy to handle a case like the following with type inference only:

interface A {
    x: number;
    z: number;
}

interface B {
    y: number;
    z: number;
}

interface C {
    x: number;
    y: number;
}

type Union = A | B | C;

function check(obj: Union): void {
    if ("x" in obj && "z" in obj) {
        // obj is now treated as an instance of type A
    }
    else if ("y" in obj && "z" in obj) {
        // obj is now treated as an instance of type B
    }
    else if ("x" in obj && "y" in obj) {
        // obj is now treated as an instance of type C
    }
}

If such a case with combination of in cannot be handled in a short term, the single discriminating in as a form of type guard would be nice though for many use cases. If combination of in can be handled anytime soon, the syntax assume <identifier> is <type> would still be relevant in general cases since it is a factored version of inline type cast.
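
For illustration, a sketch of the combined checks above written against a compiler that narrows on in (TypeScript 2.7 and later). The string return values and the final throw are assumptions added so the behaviour can be observed. Note that the narrowing trusts the declared union: a value typed as A that happens to carry an extra y property at runtime is still treated as A.

```typescript
interface A { x: number; z: number; }
interface B { y: number; z: number; }
interface C { x: number; y: number; }
type Union = A | B | C;

function check(obj: Union): string {
    if ("x" in obj && "z" in obj) {
        // Only A declares both x and z.
        return "A:" + (obj.x + obj.z);
    } else if ("y" in obj && "z" in obj) {
        // Only B declares both y and z.
        return "B:" + (obj.y + obj.z);
    } else if ("x" in obj && "y" in obj) {
        // Only C declares both x and y.
        return "C:" + (obj.x + obj.y);
    }
    // Not reachable for well-typed input; kept as a runtime safety net.
    throw new Error("value is not a member of Union");
}
```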

yortus commented 8 years ago

This looks like the same concept as #9946, but with different syntax.

yahiko00 commented 8 years ago

@yortus Thanks for the reference. Yes, I am glad I am not the only one who felt the need for such a concept, even though the previous proposal insisted on type narrowing. What I propose is more than type narrowing: it is type overcasting. For instance, let us assume a variable obj has a type A, whatever this type is at a given point of the code. We would be able to force its type to B with the new syntax (or another, I don't really care):

let obj: A; // Type of obj is A
// ...
assume obj is B; // Type of obj is B
// ...

This would not only apply to type narrowing, but also to type casting. As said before, we could consider this proposal as a way to type cast an identifier at a block-scope level. Thinking about it, I wonder whether it would be better to use as instead of is in order to underline the connection between this proposal and inline type cast: assume obj as B;

yortus commented 8 years ago

So what you are really proposing is a type assertion statement that is otherwise exactly like the existing type assertion expression. For example:

function fn(arg: P) {...}

// Type assertion as an expression (existing syntax)
let x: Q;
fn(x as P); // 'x as P' is a type assertion expression

// Type assertion as a statement (proposed new syntax)
let y: Q;
assume y is P; // this is a type assertion statement
fn(y); // y has type P here

As with other statement/expression equivalences, this proposed new form of type assertion is used for its side-effects rather than its result, although obviously these are compile-time side-effects, since type assertions are elided at runtime. And as you mention you want the side-effects limited to the block in which the assertion appears.
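
The closest thing to such a statement in released TypeScript is probably an assertion signature (asserts value is T, available from TypeScript 3.7). A rough sketch, assuming a hypothetical no-op helper named assume and made-up shapes for P and Q; unlike the proposal, the call is still emitted in the JavaScript output.

```typescript
interface Q { id: number; }
interface P extends Q { label: string; }

// Hypothetical helper: performs no runtime check. It only tells the compiler
// to treat `value` as T for the rest of the enclosing control flow.
function assume<T>(value: unknown): asserts value is T {}

function fn(arg: P): string {
    return arg.id + ":" + arg.label;
}

function demo(y: Q): string {
    assume<P>(y); // statement form: y has type P from here on
    return fn(y);
}
```

Like a type assertion expression, this is unchecked: if y is not really a P at runtime, fn will still be called with it.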

yahiko00 commented 8 years ago

@yortus I like your distinction between statement and expression. So yes, type assertion statement is probably the best way to name this proposal.

Syntactic possibilities:

1. As suggested in #9946: declare <identifier>: <type>. The use of keyword declare is interesting since it already exists, and emphasizes the fact this is an annotation that will not be emitted in the JavaScript. However, it makes me think more of ambient declarations and could suggest we do not have an instantiation of <identifier> before, which is misleading. Also, I am quite afraid of conflicts with existing use cases of declare.

2. My primary proposal: assume <identifier> is <type>. It adds a new keyword to the existing ones, which can be seen as undesirable. However, I think we need a new keyword to express this new concept of type assertion statement. I first suggested the keyword is to introduce the type, since I had type guards in mind at the beginning. But in fact, it is less related to type guards than to type assertion (type cast).

3. Type assertion statement syntax: assume <identifier> as <type>. This is my favorite syntax up to now. The as keyword, instead of is, underlines the connection with its inline counterpart, the type assertion expression.

4. Short syntax A: <identifier> is <type>. This syntax was suggested in my original proposal, but it looks more like an expression than a statement. And such a syntax could be more relevant for future proposals where type guards are directly involved.

5. Short syntax B: <identifier> as <type>. This syntax would lead to a conflict with type assertion expressions. I do not think the current grammar is able to distinguish a type assertion expression from a type assertion statement written this way, since it is today perfectly valid to write a type assertion expression like the following code:

let obj: any;
obj as number; // Type assertion expression

So my preference goes for syntax (3): assume <identifier> as <type>

Extra notes

let obj: A; // Type of obj is A
...
assume obj as B; // Type of obj is B
...
assume obj as C; // Type of obj is C
...

yortus commented 8 years ago

For reference, the spec on type assertion expressions (section 4.16) is here.

The rules and use-cases described there for type assertions would presumably apply to this proposal in exactly the same way.

yortus commented 8 years ago

BTW I see you're only proposing this to work on simple identifiers, whereas type assertion expressions work on arbitrary expressions, like foo.bar as string. Type guards also work with properties, like if (isString(foo.bar)) {foo.bar.length}. It might be useful to consider at least some expressions as valid in type assertion statements, for example assert foo.bar as string;

yahiko00 commented 8 years ago

In my mind, <identifier> included both simple variable identifiers and properties. But after checking the grammar specifications, I should have been more precise. ;) So yes, both identifiers and properties should be allowed in type assertion statements.

aluanhaddad commented 8 years ago

But again, I find this solution not really satisfying since it still creates persistent helper functions just for the sake of the type inference and can be overkill for situations when we do not often need to perform type guards.

@yahiko00 User-defined type guard functions actually are an important part of the emitted code. They may determine control flow at runtime. Also, this is not type inference, it is type assertion (or type assumption :stuck_out_tongue: ). If you want shorthand type guards you can write

// `value` is typed as `any` so the guard accepts any input; T is asserted, not checked.
function is<T>(value: any, condition: boolean): value is T {
    return condition;
}

then you can write the following

function processGeometry(obj: Geometry): void {
    if (is<AARect>(obj, "width" in obj)) {
        let width = obj.width;
        // ...
    }
    if (is<AABox>(obj, "halfDimX" in obj)) {
        let halfDimX = obj.halfDimX;
        // ...
    }
    else if (is<Circle>(obj, "radius" in obj)) {
        let radius = obj.radius;
        // ...
    }
}

yahiko00 commented 8 years ago

I do not deny the usefulness of type guard functions. But there are cases where I do not want type guard functions. First, as expressed before, because they add verbosity where it is not needed. Your workaround above is interesting but adds a layer of abstraction, which is less clear than a plain test if("width" in obj), in my opinion. You still need a persistent helper function to perform type inference. Put differently, I want the language to help me, not get in my way. Also, even if I know performance is not the priority of the TypeScript language, I feel better when I can minimize function calls in interpreted languages, especially in situations where speed is important, like in video games.

aluanhaddad commented 8 years ago

Fair enough. if("width" in obj) could perhaps be a type guard in a future version of the language but the inferred type could not really be AARect. Rather it would have to be something like { width: any }.

yortus commented 8 years ago

@aluanhaddad if("width" in obj) could narrow obj to AARect if the declared type of obj is a union like AARect | AABox | Circle.

aluanhaddad commented 8 years ago

@yortus that's an excellent point

basarat commented 8 years ago

Consider using a discriminated union with a type literal https://basarat.gitbooks.io/typescript/content/docs/types/discriminated-unions.html 🌹
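
A minimal sketch of that suggestion applied to the shapes from this thread. The kind property and its literal values are assumptions made for illustration; the original interfaces have no such tag.

```typescript
interface AARect { kind: "rect"; x: number; y: number; width: number; height: number; }
interface AABox { kind: "box"; x: number; y: number; halfDimX: number; halfDimY: number; }
interface Circle { kind: "circle"; x: number; y: number; radius: number; }
type Geometry = AARect | AABox | Circle;

// Hypothetical helper: switching on the literal tag narrows obj in each case.
function label(obj: Geometry): string {
    switch (obj.kind) {
        case "rect":
            return "rect " + obj.width + "x" + obj.height;
        case "box":
            return "box " + obj.halfDimX + "," + obj.halfDimY;
        case "circle":
            return "circle r=" + obj.radius;
        default: {
            // If a new member is added to Geometry, this assignment stops compiling.
            const unreachable: never = obj;
            throw new Error("unknown geometry: " + JSON.stringify(unreachable));
        }
    }
}
```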

yahiko00 commented 8 years ago

That is true. But I see a drawback to this approach, since it adds a helper property which could be bigger in size (bytes) than the sum of the others. If I have to manipulate thousands or tens of thousands of objects, this is a luxury I cannot afford in some situations. Also, it is not always possible to use a helper property if the structures behind the interfaces do not implement such a property and we are not able to add it (lack of rights or access to the source code).

basarat commented 8 years ago

FWIW I asked for this a long time ago but once we got let/const I've just done the following (before discriminated unions):

interface AARect {
    x: number; // top left corner
    y: number; // top left corner
    width: number;
    height: number;
}

interface AABox {
    x: number; // center
    y: number; // center
    halfDimX: number;
    halfDimY: number;
}

interface Circle {
    x: number; // center
    y: number; // center
    radius: number;
}
type Geometry = AARect | AABox | Circle; // ... And much more

function processGeometry(obj: Geometry): void {
    if ("width" in obj) {
        let objT = obj as AARect;
        let width = objT.width;
        // ...
    }
    if ("halfDimX" in obj) {
        let objT = obj as AABox;
        let halfDimX = objT.halfDimX;
        // ...
    }
    else if ("radius" in obj) {
        let objT = obj as Circle;
        let radius = objT.radius;
        // ...
    }
    // And much more...
}

:rose:

RyanCavanaugh commented 8 years ago

Logged #10485 because we always prefer to just have narrowing happen automatically.

We've wanted a syntactic space for this for a while but haven't found anything that isn't ugly or looks like an expression with side effects.

mindplay-dk commented 6 years ago

All the current suggestions seem to propose a means of altering the type within the current scope, e.g.:

if (a instanceof A) {
    b is B;
    c is C;
    do_something(a, b, c);
}

How about making the scope of the type-cast explicitly block-scoped? Nope! see comments below.

if (a instanceof A) {
    using (b as B, c as C) {
        do_something(a, b, c);
    }
}

If I had to pick from the options listed by @yahiko00, my favorite is (1):

The use of keyword declare is interesting since it already exists, and emphasizes the fact this is an annotation that will not be emitted in the JavaScript.

👍

However, it makes me think more of ambient declarations and could suggest we do not have an instantiation of <identifier> before, which is misleading.

That was actually the thing that gave me the same idea - in fact, I was intuitively just trying it when I ran into this problem, hoping it would just work.

I like the fact that this resembles ambient declarations, because it is ambient (doesn't emit any code) and it is a declaration - even if it isn't quite an "ambient declaration" in the TypeScript sense, it seems pretty intuitive.

Also, I am quite afraid of conflicts with existing use cases of declare.

Unlikely to happen, I think - since ambient declarations are always top-level declarations?

Regarding (3), I find that introducing a keyword assume is somewhat inconsistent with other type-casts, which don't require a keyword.

Even if (5) is visually ambiguous with other type-casts, I like that option as well, as it's the closest relative to other type-casts in TypeScript. It's actually not inconsistent with JS itself, where, for example, stand-alone expressions like a && b;, even if they're completely inert, are syntactically valid.

aluanhaddad commented 6 years ago

I don't think it would be wise for there to be a surrounding block since that would alter the meaning of let and const declarations.

It would be safer to scope the effect of the assertion to the enclosing block.

a && b; sounds good at first but it could have side effects so it's not inert like, say, a pragma.

Consider:

const checkedFsKey = Symbol();
F[checkedFsKey] = new Set<F>;
F[Symbol.hasInstance] = function (x: object) {
  const hasInstance = // whatever
  if (hasInstance) {
    F[checkedFsKey].add(x);
  }
  return hasInstance;
};
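
A self-contained variant of the same idea, as a rough sketch: the class body, the isF marker property, and the checked array are assumptions standing in for the elided parts above. It shows a bare expression statement that looks inert but mutates state.

```typescript
// Hypothetical class whose `instanceof` check has an observable side effect.
class F {
    static checked: unknown[] = [];

    static [Symbol.hasInstance](x: unknown): boolean {
        const hasInstance = typeof x === "object" && x !== null && "isF" in x;
        if (hasInstance) {
            F.checked.push(x); // runs every time `x instanceof F` is evaluated
        }
        return hasInstance;
    }
}

const candidate = { isF: true };
candidate instanceof F; // a stand-alone expression, yet it records `candidate`
```
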

mindplay-dk commented 6 years ago

@aluanhaddad good point, yes - introducing a block doesn't make sense.

hsir commented 6 years ago

The syntax can be something like


function incoming(thing: array | number, another: boolean | string): void {
  // thing is array or number
  if (checkSomething()) {
    guard thing as array, another as string {
      // thing is array, another is string
      console.log(thing.length, another.toLowerCase())
    }
  } else {
    guard thing as number {
      // thing is number
      console.log(thing.toFixed())
    }
  }
}

rzvc commented 6 years ago

I like the original proposal's syntax, because if you need to narrow the scope of the type assertion, you can just add a block and make the assertion in there.

Regarding the alteration of the type of already declared variables in the current scope, with let and const, the compiler can just throw an error complaining about multiple declarations, which makes total sense.

In my view, this type of type assertion should be nothing more than a type declaration, with the condition that the variable already exists in a higher scope.

Also, I don't agree with the use of declare keyword here. It should be something new or something unrelated, so their meanings and use don't conflict with each other.

dead-claudia commented 6 years ago

@rzvc I personally would prefer both of these forms:

saschanaz commented 6 years ago

haven't found anything that isn't ugly or looks like an expression with side effects.

declare typeof foo: Foo? Not introducing a new keyword and literally (re)declaring the type of foo as Foo. And as TypeScript users we all know declare won't affect emitted JavaScript.

lemoinem commented 5 years ago

@saschanaz If we're going down this road, I'd prefer declare typeof foo = Foo; because : usually means "the thing on the right is the type of the thing on the left". But that wouldn't be the case here (Foo is the type of foo, not the type of typeof foo).

saschanaz commented 5 years ago

If we're going down this road, I'd prefer declare typeof foo = Foo;

That would also be good, but I'm slightly afraid that some people would confuse with typeof foo === "string".

lemoinem commented 5 years ago

Fair enough... Then declare foo as Foo? Unambiguous, no new keyword, no confusion possible with as-expressions, and declare means no impact on the JS code...

dead-claudia commented 5 years ago

@lemoinem That's basically my proposal, just using declare (which doesn't make as much sense - you're not declaring anything) instead of a contextual assume.

yahiko00 commented 5 years ago

Indeed, declare could be misleading.

saschanaz commented 5 years ago

which doesn't make as much sense - you're not declaring anything

It does declare that this variable has a different type in this block, no?

yahiko00 commented 5 years ago

In most programming languages, and in TypeScript as well, the usual semantics of a declaration is to create a new symbol in the lexical scope.

In this discussion, if I am correct, the point is not to create a new symbol in the lexical scope (which needs to be done before), but to narrow the type of an already declared variable.

saschanaz commented 5 years ago

Well, yes... that's why I said "(re)declaration". Anyway, C specification says:

A "declaration" specifies the interpretation and attributes of a set of identifiers

I think we can still say that the suggested type narrowing specifies the new interpretation of the given identifier for the current block.

dead-claudia commented 5 years ago

@saschanaz But in TS, like JS, you generally can't declare something twice (declare var and declare function are exceptions to this rule). You can't declare a block-scoped variable twice, for example:

// This is valid
declare const foo: number
declare const bar: number

// This is not
declare const foo: number
declare const foo: number

Weirdly, I stumbled into a very odd type checking bug in the process of verifying this.

saschanaz commented 5 years ago

Maybe duplicated type redeclarations should also throw:

if (x.nodeType === 3) {
  declare typeof x: Text;
  declare typeof x: Element; // !!
}

BTW, my example looks weird to myself because declare has never been allowed in a block.

dead-claudia commented 5 years ago

@saschanaz

Maybe duplicated type redeclarations should also throw:

I'm not sure it should:

declare typeof x: Node;
if (x.nodeType !== Node.ELEMENT_NODE) return;
declare typeof x: Element;

BTW, my example looks weird to myself because declare has never been allowed in a block.

Yeah, I don't really like that syntax too well myself. Compare that to this:

// @lemoinem's suggestion
declare typeof x: Node;
if (x.nodeType !== Node.ELEMENT_NODE) return;
declare typeof x: Element;

// Mine
assume x is Node;
if (x.nodeType !== Node.ELEMENT_NODE) return;
assume x is Element;

saschanaz commented 5 years ago

The problem with assume is that the TS team has resisted introducing a new keyword unless there is an excellent and perfect reason (as in the case of keyof). 😢

dead-claudia commented 5 years ago

@saschanaz I personally feel assume hits that high bar, though, about as much as keyof did.

But of course, I'm not the arbiter of this, so I'll dive back into the sidelines now.

yahiko00 commented 5 years ago

Although I also suggested assume in my original proposal, I am wondering if a keyword like narrow could be even more explicit, in case of course a new keyword would be introduced.

saschanaz commented 5 years ago

I think the team said no for assume X is Y, per Ryan's comment:

We've wanted a syntactic space for this for a while but haven't found anything that isn't ugly or looks like an expression with side effects.

hax commented 5 years ago

What about these alternatives?

assert X is Y
use X as Y
refine X as Y
narrow X as Y

We could also only allow narrowing in if/else block:

if (...): X is Y {
   ...
} else: X is Z {
   ...
}

dead-claudia commented 5 years ago

@hax I don't believe any of those would be popular among the TS design team.

hax commented 5 years ago

@isiahmeadows Have no idea why all of these alternatives are "ugly or looks like an expression with side effects". 😫

dead-claudia commented 5 years ago

I think "expression or statement" was implied in "expression with side effects".

rozzzly commented 5 years ago

let foo: bar | baz;

// typeof foo is bar | baz
if (something) { 
    foo is bar;
    // typeof foo is bar
}
// typeof foo is bar | baz

block level assertions pls. Just reuse is from currently implemented type guards, no new syntax, not tied to conditional statements like suggestions above.

// @ts foo is bar would work too if the side effects thing is too much for everybody but wouldn't be as easy for the language server to provide completions/highlighting/etc

Personally, I find the gymnastics I sometimes have to go through more mentally taxing than realizing that foo is bar is not valid ECMAScript but rather TypeScript syntax for a compile-time type assertion.

pm-nsimic commented 5 years ago

how about adding the typeguard after the if condition itself by re-using the existing typeguard syntax:

if ("width" in obj): obj is AARect {
    // compiler assumes the obj is `AARect` within this scope
}

dead-claudia commented 5 years ago

@pm-nsimic I like that. It'd also make most existing type narrowing situations just special case sugar:

// Equality
if (a === b) { ... }
if (a === b): a is typeof b { ... } // Desugared
if (a === b): b is typeof a { ... } // Desugared

// Inequality
if (a !== b) { ... }
if (a !== b): a is Exclude<typeof a, typeof b> { ... } // Desugared
if (a !== b): b is Exclude<typeof b, typeof a> { ... } // Desugared

// `== null`
if (a == null) { ... }
if (a == null): a is null | undefined { ... } // Desugared

// `!= null`
if (a != null) { ... }
if (a != null): a is Exclude<typeof a, null | undefined> { ... } // Desugared

// Falsy
if (!a) { ... }
if (!a): a is "" | 0 | false | null | undefined { ... } // Desugared

// Truthy
if (a) { ... }
if (a): a is Exclude<typeof a, "" | 0 | false | null | undefined> { ... } // Desugared

// `typeof` guard
if (typeof a === "string") { ... }
if (typeof a === "string"): a is string { ... } // Desugared

// `instanceof` guard
if (a instanceof A) { ... }
if (a instanceof A): a is A { ... } // Desugared if `A` has type-level binding

shicks commented 5 years ago

Redirecting here from #8655, I'm not fond of the idea of requiring an if block. I understand that there are architectural constraints that require a syntactic signal for whether to perform narrowing, but requiring an if does not handle the case of assertion functions.

Closure Library and Compiler have had very good success with debug-only assertions, such as

assert(a instanceof A);

The compiler understands this as a type assertion and narrows the type of a through the rest of the control flow. TS can handle this right now with

if (!(a instanceof A)) throw new Error('Assertion failed');

but there's no good way for an optimizer to remove this later. If we had some way to annotate a function call as a type-narrowing assertion that could show up in the AST to resolve the performance issues, then the function call could be retained and possibly removed in post-processing. Note that I'm not asking for TypeScript to get into the optimization business, but the current design is a particular impediment to working well with third-party optimizers, which seems somewhat in line with goal 3 ("Impose no runtime overhead on emitted programs" - but the current solution does) and in line with the counterpoint to non-goal 2 ("Instead, emit idiomatic JavaScript code that plays well with the performance characteristics of runtime platforms").

Unfortunately, I'm not thinking of any particularly good syntax to annotate this. It does remind me a bit of the expression-level syntax for non-null type assertions, but I don't see a good way to extend that at all.
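
For reference, TypeScript 3.7 and later accept assertion signatures (asserts value is T), which cover this kind of call-based narrowing. A minimal sketch, assuming a hypothetical assertInstanceOf helper and a placeholder class A:

```typescript
class A {
    constructor(public name: string) {}
}

// Hypothetical assertion helper. The `asserts value is T` signature tells the
// compiler that `value` is a T whenever the call returns normally.
function assertInstanceOf<T>(
    value: unknown,
    ctor: new (...args: any[]) => T,
): asserts value is T {
    if (!(value instanceof ctor)) {
        throw new Error("Assertion failed");
    }
}

function nameOf(a: unknown): string {
    assertInstanceOf(a, A); // narrows `a` to A for the rest of the function
    return a.name;
}
```

The call stays in the emitted JavaScript as an ordinary function call, so whether it is kept or stripped from production builds is left to whatever post-processing tool is in use.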

dead-claudia commented 5 years ago

@shicks Frustratingly, I don't see any way of allowing that without making it statement-like. Really, the only way you could get a generic block scope type assertion is via a statement-like construction or a let alias = value as Type and using alias instead of value, which is still a statement.

dead-claudia commented 5 years ago

Note that with the let alias = value as Type, UglifyJS and Terser oddly do not remove the alias, despite it almost always being safe to. Edit: Terser does when it's in the same scope, but when it's in a child function scope, it doesn't.