microsoft / TypeScript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
https://www.typescriptlang.org
Apache License 2.0

Proposal for generators design #2873

Closed JsonFreeman closed 9 years ago

JsonFreeman commented 9 years ago

A generator is a syntactic way to declare a function that can yield. Yielding will give a value to the caller of the next() method of the generator, and will suspend execution at the yield point. A generator also supports yield * which means that it will delegate to another generator and yield the results that the inner generator yields. yield and yield * are also bi-directional. A value can flow in as well as out.

Like an iterator, the thing returned by the next method has a done property and a value property. Yielding sets done to false, and returning sets done to true.

A generator is also iterable. You can iterate over the yielded values of the generator, using for-of, spread or array destructuring. However, only yielded values come out when you use a generator in this way. Returned values are never exposed. As a result, this proposal only considers the value type of next() when the done property is false, since those are the ones that will normally be observed.
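A minimal sketch of the behaviour described above, assuming a TypeScript version whose standard library declares a three-parameter Generator type (3.6 or later). The name countThenFinish is made up for illustration. Calling next() by hand exposes both yielded and returned values, while consuming the generator as an iterable only sees the yielded ones.

```typescript
// Hypothetical generator: yields two numbers, then returns a string.
function* countThenFinish(): Generator<number, string, undefined> {
    yield 1;
    yield 2;
    return "finished";
}

// Driving the generator by hand exposes every result, including the return value.
const manual = countThenFinish();
const step1 = manual.next(); // yielded, so done is false
const step2 = manual.next(); // yielded, so done is false
const step3 = manual.next(); // returned, so done is true

// Consuming it as an iterable only collects the yielded values.
const collected = Array.from(countThenFinish());

console.log(step1, step2, step3, collected);
```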

Basic support for generators

Type annotation on a generator

A generator function can have a return type annotation, just like a function. The annotation represents the type of the generator returned by the function. Here is an example:

function *g(): Iterable<string> {
    for (var i = 0; i < 100; i++) {
        yield ""; // string is assignable to string
    }
    yield * otherStringGenerator(); // otherStringGenerator must be iterable and element type assignable to string
}

Here are the rules:

A generator function with no type annotation can have the type annotation inferred. So in the following case, the type will be inferred from the yield statements:

function *g() {
    for (var i = 0; i < 100; i++) {
        yield ""; // infer string
    }
    yield * otherStringGenerator(); // infer element type of otherStringGenerator
}

Since the Iterable type will be used a lot, it is a good opportunity to add a syntactic form for iterable types. We will use T* to mean Iterable<T>, much the same as T[] is Array<T>. It does not do anything special, it's just a shorthand. It will have the same grammatical precedence as [].

Question: Should it be an error to use the * type if you are compiling below ES6?

The good thing about this design is that it is super easy to create an iterable by declaring a generator function. And it is super easy to consume it like you would any other type of iterable.

function *g(limit) {
    for (var i = 0; i < limit; i++) {
        yield i;
    }
}

for (let i of g(100)) {
    console.log(i);
}
var array = [...g(50)];
var [first, second, ...rest] = g(100);

Drawbacks of this basic design

  1. The type returned by a call to next is not always correct if the generator has a return expression.
function *g() {
    yield 0;
    return "";
}
var instance = g();
var x = instance.next().value; // x is number, correct
var x2 = instance.next().value; // x2 is given type number, but it's actually a string!

This implies that maybe we should give an error when return expressions are not assignable to the element type. Though if we do, there is no way out.

  2. The types of yield and yield * expressions are just any. Many users will not care about these, but the type of the yield expression is useful if for example, you are implementing await on top of yield.
  3. If you type your generator with the * type, it does not allow someone to call next directly on the generator. Instead they must cast the generator or get the iterator from the generator.
function *g(): number* {
    yield 0;
}
var gen = g();
gen.next(); // Error, but allowed in ES6 (preferred in fact)
(<IterableIterator<number>>gen).next(); // works, but really ugly
gen[Symbol.iterator]().next(); // works, but pretty ugly as well

To clarify, issue 3 is not an issue for for-of, spread, and destructuring. It is only an issue for direct calls to next. The good thing is that you can get around this by either leaving off the type annotation from the generator, or by typing it as an IterableIterator.
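A small sketch of the workaround, written with the long-form Iterable<number> rather than the proposed number* shorthand. The function names are made up for illustration.

```typescript
// Annotated as IterableIterator, so callers can iterate it and also
// call next() directly.
function* viaIterableIterator(): IterableIterator<number> {
    yield 0;
}

// Annotated as a plain Iterable, so next() is not visible on the result.
function* viaIterable(): Iterable<number> {
    yield 0;
}

const direct = viaIterableIterator().next();

const iterable = viaIterable();
// iterable.next(); // compile error: 'next' does not exist on Iterable<number>
const indirect = iterable[Symbol.iterator]().next();

console.log(direct.value, indirect.value);
```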

Advanced additions to proposal

To help alleviate issue 2, we can introduce a nominal Generator type (already in es6.d.ts today). It is an interface, but the compiler would have a special understanding of its type arguments. It would look something like this:

interface Generator<TYield, TReturn, TNext> extends IterableIterator<TYield /*| TReturn*/> {
    next(n: TNext): IteratorResult<TYield /*|TReturn*/>;
    // throw and return methods elided
}

Notice that TReturn is not used in the type, but it will have special meaning if you are using something that is nominally a Generator. Use of the Generator type annotation is purely optional. The reason that we need to omit TReturn in the next method is so that Generator can be assignable to IterableIterator<TYield>. Note that this means issue 1 still remains.

function *g(): Generator<number, any, string> {
   var x = yield 0; // x has type string
}
function *g() {
    yield 0;
    return ""; // Error or infer TReturn as string
}
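For comparison, here is a sketch of how TNext behaves, assuming a TypeScript version whose standard library declares Generator<T, TReturn, TNext> (3.6 or later). The generator echoLengths is hypothetical. The value passed to next() becomes the value of the suspended yield expression.

```typescript
// Yields numbers, returns nothing, and expects strings to be passed to next().
function* echoLengths(): Generator<number, void, string> {
    let received = yield 0; // received has type string
    while (received !== "stop") {
        received = yield received.length;
    }
}

const lengths = echoLengths();
const start = lengths.next();             // runs to the first yield
const afterHello = lengths.next("hello"); // resumes with "hello", yields its length
const afterStop = lengths.next("stop");   // the loop ends and the generator completes

console.log(start, afterHello, afterStop);
```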

Once we have TReturn in place, the following rules are added:

function *g1(): Generator<any, any, string> {
    var t = yield * g2(); // Error that string is not assignable to number
}
function *g2(): Generator<any, any, number> {
    var s = yield 0;
}

Ok, now for issue 1, the incorrectness of next. There is no great way to do this. But one idea, courtesy of @CyrusNajmabadi, is to use TReturn in the body of the Generator interface, so that it looks like this:

interface Generator<TYield, TReturn, TNext> extends IterableIterator<TYield> {
    next(n: TNext): IteratorResult<TYield | TReturn>;
    // throw and return methods elided
}

As it is, Generator will not be assignable to IterableIterator<TYield>. To make it assignable, we would change assignability so that every time we assign Generator<TYield, TReturn, TNext> to something, assignability changes this to Generator<TYield, any, TNext> for the purposes of the assignment. This is very easy to do in the compiler.

When we do this, we get the following result:

function *g() {
    yield 0;
    return "";
}
var g1 = g();
var x1 = g1.next().value; // number | string (was number with old typing)
var x2 = g1.next().value; // number | string (was number with old typing, and should be string)

var g2: Iterator<number> = g(); // Assignment is allowed by special rule!
var x3 = g2.next().value; // number, correct
var x4 = g2.next().value; // number, should be string

So you lose the correctness of next when you subsume the generator into an iterable/iterator. But you at least get general correctness when you are using it raw, as a generator.

Additionally, operators like for-of, spread, and destructuring would just get TYield, and would be unaffected by this addition, including if they are done on a Generator.

Thank you to everyone who helped come up with these ideas.

JsonFreeman commented 9 years ago

I've updated the proposal with the results of further discussion. There have only been a few minor changes:

DanielRosenwasser commented 9 years ago

For the sake of completeness, I think it would be extremely helpful to actually state the current declarations of the types named here:

interface IteratorResult<T> {
    done: boolean;
    value?: T;
}

interface Iterator<T> {
    next(value?: any): IteratorResult<T>;
    return?(value?: any): IteratorResult<T>;
    throw?(e?: any): IteratorResult<T>;
}

interface Iterable<T> {
    [Symbol.iterator](): Iterator<T>;
}

interface IterableIterator<T> extends Iterator<T> {
    [Symbol.iterator](): IterableIterator<T>;
}

interface GeneratorFunction extends Function {
}

interface GeneratorFunctionConstructor {
    /**
      * Creates a new Generator function.
      * @param args A list of arguments the function accepts.
      */
    new (...args: string[]): GeneratorFunction;
    (...args: string[]): GeneratorFunction;
    prototype: GeneratorFunction;
}
declare var GeneratorFunction: GeneratorFunctionConstructor;

interface Generator<T> extends IterableIterator<T> {
    next(value?: any): IteratorResult<T>;
    throw(exception: any): IteratorResult<T>;
    return(value: T): IteratorResult<T>;
    [Symbol.iterator](): Generator<T>;
    [Symbol.toStringTag]: string;
}

CyrusNajmabadi commented 9 years ago

Looks good!

Griffork commented 9 years ago

Got a little lost in the first post, but I'm going to write what I understood, and you guys can correct me if I'm wrong:

function *g () {
    var result: TNext = yield <TYield>mything()
}

Is there anything important that I missed here?


Request: A nicer way of defining generator types e.g. for a generator,

function* g(value: number) {
    while (true) {
        value+= yield value;
    }
}

something like:

var ginst: GeneratorInstance<number, number>

and

var gtype: *g(start: number)=>GeneratorInstance<number, number>;

For the following code:

ginst = g(0);
ginst.next(2);
gtype = g;

:+1: for generators

edit: fixed putting *'s in all the wrong places.

Griffork commented 9 years ago

... Also the lack of a return statement annoys me, I think it should be forced to have the same type as yield, and if it's a different type (and yield is being implicitly typed) the return type should force a change to the implicitly derived type for yield. To summarise: in a generator function, return is treated identically to yield.

This way I can have my generators actually end on a value that's not forced to be undefined (by Typescript).

DanielRosenwasser commented 9 years ago

@Griffork from what I understand, you can have return statements, just not return expressions - specifically, you can't return a value, but you can bail out from within the generator at any point.

This probably doesn't help your frustration with the return type being ignored; however, it would certainly help to get some realistic use cases for what exactly you'd like to return when a generator has terminated.

Griffork commented 9 years ago

@DanielRosenwasser not sure I understand. I guess what you're calling a return expression is: return true;? If that is the case, then how is a return statement different to a return expression?


Here's an example of the type of generator I was thinking of when I voiced my discomfort:

function* g (command) {
    while(true){
        switch(command) {
            case "dowork1":
                //do stuff
                command = yield "OPERATIONAL - OK";
                break;
            case "dowork2":
                //do stuff
                command = yield "OPERATIONAL - OK";
                break;
            case "shutdown":
                //do stuff
                return "COMPLETE";
        }
    }
}

Where it may execute an arbitrary number of times, but at some point it's "completed" and it notifies its caller that it's done.

My concern (which I have not yet researched) is that without the return statement, there might be garbage-collection problems on some systems (particularly since the whole function-state has to be suspended and resumed on a yield), which is bad if you're spawning a lot of similarly-structured generators/iterators.

It also makes the function read a lot more clearly in my opinion.

DanielRosenwasser commented 9 years ago

I guess what you're calling a return expression is: return true;?

That is a return statement, for which the return expression is true.

In other words, a return expression is the expression being returned in a return statement.

Where it may execute an arbitrary number of times, but at some point it's "completed" and it notifies its caller that it's done.

From what I understand of your example, you return "COMPLETE" to indicate that the generator is done, which I don't see as any more useful than the done property on the iterator result. We need some more compelling examples.

DanielRosenwasser commented 9 years ago

Though, now that I think about it, if there are multiple ways to terminate (i.e. shutdown or failure), that's when the returned value in a state-machine-style generator would be useful.
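A sketch of that state-machine case, assuming a TypeScript version with the three-parameter Generator type. The Command and Outcome types and the worker function are hypothetical. The returned value tells the caller which way the generator terminated, which done alone cannot.

```typescript
type Command = "dowork" | "shutdown" | "fail";
type Outcome = "COMPLETE" | "FAILED";

// Hypothetical worker: yields a status after each unit of work and
// returns an Outcome that says how it terminated.
function* worker(first: Command): Generator<string, Outcome, Command> {
    let command = first;
    while (true) {
        switch (command) {
            case "dowork":
                command = yield "OPERATIONAL - OK";
                break;
            case "shutdown":
                return "COMPLETE";
            case "fail":
                return "FAILED";
        }
    }
}

const cleanRun = worker("dowork");
const status = cleanRun.next();             // first unit of work, yields a status
const cleanEnd = cleanRun.next("shutdown"); // terminates with "COMPLETE"

const failedRun = worker("dowork");
failedRun.next();
const failedEnd = failedRun.next("fail");   // terminates with "FAILED"

console.log(status, cleanEnd, failedEnd);
```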

Griffork commented 9 years ago

@DanielRosenwasser got it, thanks for the clarification :).

yortus commented 9 years ago

I'd argue that a correct implementation would allow return expressions, and type them distinctly from yield expressions.

Generators are commonly used in asynchronous task runners, such as co. Here is an example:

var co = require('co');
var Promise = require('bluebird');

// Return a promise that resolves to `result` after `delay` milliseconds
function asyncOp(delay, result) {
    return new Promise(function (resolve) {
        setTimeout(function () { resolve(result); }, delay);
    });
}

// Run a task asynchronously
co(function* () {
    var a = yield asyncOp(500, 'A');
    var ab = yield asyncOp(500, a + 'B');
    var abc = yield asyncOp(500, ab + 'C');
    return abc;
})
.then (console.log)
.catch (console.log);

The above program prints 'ABC' after a 1.5 second pause.

The yield expressions are all promises. The task runner awaits the result of each yielded promise and resumes the generator with the resolved value.

The return expression is used by the task runner to resolve the promise associated with the task itself.

In this use case, yield and return expressions are (a) equally essential, and (b) have unrelated types that ideally would be kept separate. In the example, TYield is Promise<string> and TReturn is string. There is no reason why they would be conflated into one type in a task runner.
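To make the separation concrete, here is a rough typed sketch of a task runner. It is not co's actual implementation: run, Task, and asyncOp are hypothetical names, error forwarding into the generator is left out, and it assumes a TypeScript version with the three-parameter Generator type.

```typescript
// Hypothetical stand-in for an asynchronous operation.
function asyncOp(delay: number, result: string): Promise<string> {
    return new Promise<string>((resolve) => {
        setTimeout(() => resolve(result), delay);
    });
}

// TYield is Promise<string>, TNext is string, TReturn is left to the task.
type Task<TReturn> = Generator<Promise<string>, TReturn, string>;

// Minimal runner: awaits each yielded promise, resumes the generator with
// the resolved value, and resolves its own promise with the return value.
function run<TReturn>(makeTask: () => Task<TReturn>): Promise<TReturn> {
    const task = makeTask();
    return new Promise<TReturn>((resolve, reject) => {
        function step(input?: string): void {
            try {
                const result = input === undefined ? task.next() : task.next(input);
                if (result.done) {
                    resolve(result.value); // value has type TReturn here
                } else {
                    result.value.then(step, reject); // value is a Promise<string> here
                }
            } catch (err) {
                reject(err);
            }
        }
        step();
    });
}

const finished = run(function* (): Task<string> {
    const a = yield asyncOp(10, "A");
    const ab = yield asyncOp(10, a + "B");
    const abc = yield asyncOp(10, ab + "C");
    return abc;
});

finished.then((value) => console.log(value));
```

In this sketch the runner only ever looks at TYield while done is false and at TReturn once done is true, which is why keeping the two types apart is convenient.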

Griffork commented 9 years ago

@yortus I'm not sure what you're asking for is at all possible, or if it makes any sense, I'll try to explain where I'm confused.

The only way to start or resume a generator is the generator's .next function. This function takes a single argument (which is supplied in place of the yield expression) and returns a single value (which is the value to the right of the yield expression).

The following Javascript:

function*g() {
    var a = yield "a";
    var b = yield a + "b";
    var c = yield b + "bc";
    return 0;
}
var ginst = g();

console.log(ginst.next().value + ginst.next("a").value + ginst.next("a").value);
return ginst.next("").value;

Is equivalent to

console.log(("a") + ("a" + "b") + ("a" + "bc"));
return 0;

But what happens if I try:

var done = false;
var value;
while (!done) {
    var result = ginst.next(value);
    done = result.done;
    value = result.value;
    console.log(value);
}

I get:

"a"
"ab"
"abbc"
0

The last one is a number, meaning if ginst.next is to be called in a loop, the return type must be string|number or it may be incorrect.


It's important to note here that the proposal that yield and return are treated identically will work for co's consumption, and for Promises. If it will help I can write some example implementations.

jbondc commented 9 years ago

Like that last suggestion:

interface Generator<TYield, TReturn, TNext> extends IterableIterator<TYield> {
    next(n: TNext): IteratorResult<TYield | TReturn>;
    // throw and return methods elided
}

Seems ok to lose the correctness of next when you subsume the generator into an iterable/iterator.

Does that solve drawback #3? Not sure if I like T*, this looks clearer:

function *g(): Generator<number, string, any>  {
    yield 0;
    return "";
}
var a: Iterable<number|string> = g();

// lose correctness
var b: Iterable<number> = g();

// consider using *T for better symmetry instead of T*
var c: *number = g();

Griffork commented 9 years ago

@jbondc *T has better symmetry, but it can be confusing: *T doesn't denote a generator here, it denotes an iterable, and while the two can be the same thing, they don't have to be.

jbondc commented 9 years ago

If you read * as 'many values' from thing, it works well for generators and iterators. Likely T* bothers me because it looks like a pointer if you write string*

yortus commented 9 years ago

Another example of using generators to support asynchronous control flow. This is working code, runnable in current io.js. There are some comments showing the runtime types of TYield and TReturn. When generators are used in this way, these types tend to be unrelated to each other. The most useful type to have inferred in this example is probably the TReturn type.

var co = require('co');
var Promise = require('bluebird');
var fs = Promise.promisifyAll(require('fs'));
var path = require('path');

// bulkStat: (dirpath: string) => Promise<{ [filepath: string]: fs.Stats; }>
var bulkStat = co.wrap(function* (dirpath) {

    // filenames: string[], TYield = Promise<string[]>
    var filenames = yield fs.readdirAsync(dirpath);
    var filepaths = filenames.map(function (filename) {
        return path.join(dirpath, filename);
    });

    // stats: Array<fs.Stats>, TYield = Array<Promise<fs.Stats>>
    var stats = yield filepaths.map(function (filepath) {
        return fs.statAsync(filepath);
    });

    // result: { [filepath: string]: fs.Stats; }
    var result = filepaths.reduce(function (result, filepath, i) {
        result[filepath] = stats[i];
        return result;
    }, {});

    // TReturn = { [filepath: string]: fs.Stats; }
    return result;
});

bulkStat(__dirname)
    .then(function (stats) {
        console.log(`This file is ${stats[__filename].size} bytes long.`);
    })
    .catch(console.log);

// console output:
// This file is 1097 bytes long.

The function bulkStat stats all the files in the specified directory and returns a promise of an object that maps file paths to their stats.

Note that the TReturn type is unrelated to either of the TYield types, and the two TYield types are unrelated to each other.

JsonFreeman commented 9 years ago

@Griffork

@yortus I agree that the primary case for passing in a value to next is async frameworks, since you want to pass the value the awaited Promise was resolved with. And I see your point about the return value being used to signify the fulfilled value of the Promise being created. I suppose the limitation of the basic proposal is that while it is great at typing generators as an implementation of an iterable, it does not give strong treatment to using generators as async state machines. Suppose we relaxed the restriction on return values, and the type system just ignored them. Would that be acceptable? We would allow everything that is required to write your async state machines, but there would be a lot of any types floating around. Presumably this is a pretty advanced use case.

Without dependent types, it becomes very hard to hold onto TReturn without having it pollute TYield. Ideally, we would have one type associated with done: false and another with done: true. But without that facility, there is really no good place to represent TYield and TReturn separately in the type structure.

@jbondc, I understand your syntactic concern with * looking like a pointer. But I have to agree with @Griffork that *T will be more confusing, because it seems to be intimately tied to generators. And in fact, this type needn't be used with generators. It is just sugar for an Iterable.

Griffork commented 9 years ago

Replying on my phone, bear with me...

@JsonFreeman oh, good point. I stopped monitoring the straw man before for...of was finalised. The use case that I currently have for return is the state machine example above when you consider that you can also return "ERROR". On another note, does done = true on error?

Yes, I plan to do some funcy promise-like stuff with a next-able state based generator. And I like your suggestion that generators that take a value should error in a for-of.


Being able to detect type depending on the value of done sounds good, but I'm not sure how possible that is, as it would be easy to break. The only way I can see @yortus's example working is if he explicitly passed typing information to co and co used that to type the return function. Either way I don't think it's possible for Typescript to provide what you're asking for, unless someone can give me a working example of how it would be implemented.

Griffork commented 9 years ago

@JsonFreeman would it be possible to opt in/out of returning a value?

I don't know where your facts about the typical usage of generators come from. An article like that would be useful to read; would I be able to get a link? From what you're saying, it sounds like most users are liable to use both yield and return to return values from their generator, but they don't want to know about the value returned by return. Or are you trying to say that most users don't use return? (I imagine if you're not using return in the generator, it's not going to pollute the yielded value.)

JsonFreeman commented 9 years ago

@Griffork

JsonFreeman commented 9 years ago

It would essentially involve hacking the assignability rules to make sure a generator that returns something is assignable to an iterable when you ignore that return value. Doable, but kind of a hack.

Griffork commented 9 years ago

Oh, ok. @JsonFreeman when I was first looking up generators, the number of threads/blogs/posts I found that wanted to use them in a promise fashion vs as an iterable was about 10:1. That's why I was asking you for your source. I don't think the idea that most users will want to use generators as iterables is valid. That use will still be very prevalent, but using generators for promises looks like it will be about equally prevalent, if not more.

I see what you mean about the problems with making generators sometimes not iterable. If it's going to be a hack, either don't do it or don't do it yet; leave it to the user, and if it's a big problem later you can re-evaluate the decision.

As for opting in/out of returning a value, yes. When I first wrote that I was thinking of something else, but that idea was bad and this one is better.

Again, I don't think you can separate the return type from the yield type due to the way generators are used (although I agree it would be useful, JavaScript's implementation does not make this doable).

yortus commented 9 years ago

@Griffork here is an in-depth article describing many uses and details of generators. TL;DR: the two main use cases so far are (1) implementing iterables and (2) blocking on asynchronous function calls.

@JsonFreeman having TReturn = any always would be a good start. Not allowing return expressions at all would rule out many valid uses of generators. You describe the async framework scenario as 'advanced'. Perhaps so, but in nodeland with its many async APIs, it's already a widespread idiom that works today and is growing in popularity. co has a lot of GitHub stars, a lot of dependents, and a lot of variants.

Interestingly, when crafting generators to pass to co, one cares more about the TReturn type and the types returned by yield expressions, whilst the TYield type is not so important.

Side note: the proposal for async functions (#1664) mentions using generators in the transform for representing async functions in ES6 targets. Return expressions are needed there, in fact the proposal shows one in its example code. It would be funny if tsc emitted generators with return expressions as its 'idiomatic ES6' for async functions, but rejected them as invalid on the input side.

yortus commented 9 years ago

@JsonFreeman #2936 mentions singleton types are getting the green light. At least for string literal types. If there was also a boolean literal type, then the next function could return something like { done: false; value?: TYield; } | { done: true; value?: TReturn; }. Then type guards could distinguish the two cases.

I'm just thinking out loud here, so not sure if that would make anything easier, even if it did exist.
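A sketch of how that could look, assuming a TypeScript version that supports boolean literal types. The type names and the describe function are hypothetical, and value is made required here to keep the example short.

```typescript
// Hypothetical result shapes, using boolean literal types for `done`.
type YieldResult<TYield> = { done: false; value: TYield };
type ReturnResult<TReturn> = { done: true; value: TReturn };
type StepResult<TYield, TReturn> = YieldResult<TYield> | ReturnResult<TReturn>;

// Checking `done` narrows `value` to the matching type.
function describe(step: StepResult<number, string>): string {
    if (step.done) {
        return "returned " + step.value.toUpperCase(); // value is a string here
    }
    return "yielded " + step.value.toFixed(1); // value is a number here
}

const yieldedText = describe({ done: false, value: 1 });
const returnedText = describe({ done: true, value: "ok" });

console.log(yieldedText, returnedText);
```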

JsonFreeman commented 9 years ago

@Griffork and @yortus, thank you for your points. It sounds like we are leaning towards the solution of the "next" parameter and the return values having type any, but allowing generator authors to return a value. The return type of next will take into account TYield but not TReturn. Would you agree that that solution is a good way to start?

@yortus, as for singleton types, let's see how it goes for strings, and then we can evaluate it for booleans. At that point it would be clearer whether it would help split up TYield and TReturn, but I imagine that it could be just what we need here.

Griffork commented 9 years ago

@JsonFreeman sure, I'd be happy with that. At least then there will be the opportunity to gather feedback from Typescript users instead of relying on speculation (particularly my own).

Thank you for listening, this has been one of the most enjoyable discussions I've had on a Typescript issue :-).

yortus commented 9 years ago

@JsonFreeman sounds good.

Another minor point:

interface Generator<T> extends IterableIterator<T> {
    next(value?: any): IteratorResult<T>;
    throw(exception: any): IteratorResult<T>;
    return(value: T): IteratorResult<T>;   // <--- value should not be constrained to T
    [Symbol.iterator](): Generator<T>;
    [Symbol.toStringTag]: string;
}

That's copied from above. Shouldn't the return method be return(value?: any): IteratorResult<T>;? Calling this method causes the generator to resume and immediately execute return value;. There is no link between the type of value and the T type which is the type of the yield expressions in the generator.

JsonFreeman commented 9 years ago

Great, thanks guys!

@yortus, I am actually not sure there is much value in defining the Generator type yet. I'd sooner remove it now, and add it back later if we want to leverage it to support the return value and next value.

But to your point about the return method, yeah I think you're right. I guess you could also define it as

return<U>(value: U): IteratorResult<T | U>;

Meaning it would return something of the yield type, or the thing you passed in. The yield type would only be returned in pathological cases like this:

function* g() {
    try {
        yield 0; // suspended here, and user calls return("hello");
    }
    finally {
        yield 1; // return gets intercepted by this yield expression
    }
}

But I realize that this is a ridiculous reason to include T in there.
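A runnable sketch of that pathological case, assuming a TypeScript version with the three-parameter Generator type. The final return statement and its string are added only so that TReturn can be string in the annotation.

```typescript
function* interceptor(): Generator<number, string, unknown> {
    try {
        yield 0; // suspended here when return("hello") is called
    } finally {
        yield 1; // the pending return is intercepted by this yield
    }
    return "ran to completion";
}

const gen = interceptor();
const first = gen.next();                // the yield inside try
const intercepted = gen.return("hello"); // runs finally, which yields again
const last = gen.next();                 // finally completes, the pending return resumes

console.log(first, intercepted, last);
```

Calling return("hello") therefore hands back a yielded number first, and the string only arrives on the following call to next().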

danquirk commented 9 years ago

@Griffork At least then there will be the opportunity to gather feedback from Typescript users instead of relying on speculation (particularly my own).

This (feedback driven changes) is definitely our preferred methodology but keep in mind the problem is we can relax a restriction later without breaking people but cannot do the reverse. So defaulting to typing something as any while permissive also means if we realize it's wrong later (whether based on our own exploration or feedback from others) then the change is much more painful than had we taken the more conservative approach. This is not to say we're just always defaulting to the most conservative option in the face of any uncertainty but it is definitely a large factor when considering which side to come down on when we want to give ourselves room to change/adapt in the future.

Griffork commented 9 years ago

@danquirk I understand your concern.

My standing comes from the fact that there are already libraries that require that the generator's return function works. And I am planning to design a system that requires that the generator's return function is available, if it is not available my planned library cannot work (not even with a yield replacement).

So yes - I understand that you guys don't want to commit to something that in the future you won't be able to work with, but you must understand that users of TypeScript will require this functionality, and that it may not be as small of a percentage as you may think.

It is almost tempting to return to vanilla Javascript just for the generator support, however the large project that I'm embarking on will suffer from it in the long run.

yortus commented 9 years ago

defaulting to typing something as any while permissive also means if we realize it's wrong later (whether based on our own exploration or feedback from others) then the change is much more painful than had we taken the more conservative approach.

@danquirk that's a valid point and should perhaps rule out the TReturn = any approach.

However if the current proposal to disallow return expressions stands, that will also be pretty painful for people thinking TypeScript supports ES6 generators and reaching for their favourite async control flow library. Perhaps in this case, the proposal/feature should be renamed on the 1.6 roadmap to better qualify it - something like 'Iterable Generators' or 'partial generator support'. As @Griffork points out, async control flow is a fairly major use-case of generators in current ES6 code out there.

As an alternative to TReturn = any, what would happen if the first stage proposal was to accurately infer both the TYield and TReturn types, and accept for now the inconveniences associated with next() returning { done: boolean; value?: TYield | TReturn }? (NB: I'm assuming this is feasible in the compiler, but don't know enough about it to judge.)

This has inconveniences for iterators, but at least it would be correct type-wise and therefore avoid the future-proofing problem @danquirk mentions. The inconveniences could be addressed with syntax or compiler sugar at a later point. But at least generators would be full ES6 generators.

JsonFreeman commented 9 years ago

To clarify one thing. I didn't actually mean that we would infer any from the return expressions. I just meant that we would ignore them instead of giving an error.

Inferring TYield | TReturn for the value means that neither of our two use cases (iteration and async) is pleasant or ideal. Until we can model the done property correctly as a literal type, I'd rather make one use case pleasant, and the other possible. This may sound short-sighted, but I think we can largely avoid breaks later, if we make the stronger type change opt-in. Does that make sense?

yortus commented 9 years ago

I think we can largely avoid breaks later, if we make the stronger type change opt-in.

@JsonFreeman would you mind clarifying what this means in practice?

JsonFreeman commented 9 years ago

Sure. My statement presupposes that we will at some point have boolean literal types. The idea is that right now, for a generator, we will infer the return type to be IterableIterator<TYield>. The return type of next will be

{
    done: boolean;
    value: TYield; // not TReturn
}

Then later, let's say we have the opportunity to switch to boolean literal types. At this point, we would not change our inference. We would continue to infer IterableIterator<TYield>. But the user would have the opportunity to change their return type annotation to a stronger type that does use boolean literals. So we might provide a type Generator<TYield, TReturn>, whose next method returns

{ done: false; value: TYield } | { done: true; value: TReturn }

The one caveat is that I'm not sure this would be assignable to IterableIterator<TYield>. But we don't even need it to be. Because if you are typing your generator this way, you are willing to give up its iterability. How does that sound?

Griffork commented 9 years ago

If you can make it work it sounds good. I did not think

{ done: false; value: TYield } | { done: true; value: TReturn }

was stricter/assignable to

{
    done: boolean;
    value: TYield; 
}

Or will the former be only with return and the latter be only without?

JsonFreeman commented 9 years ago

It is not assignable. The idea was that by default, we would infer the latter as the type (and ignore return values), but we'd allow you to specify the former.

However, there is a way to make it assignable if we alter the type of an Iterator so that its .next method returns

{ done: false; value: TYield } | { done: true; value: any }

Now the former type from your comment is assignable to this one. The consequence is that calling next directly on an Iterator<string> will give you any, but if you can establish that done is false, you will get string. And all the syntactic forms that consume iterators will assume done is false.
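The assignability argument can be checked with a small sketch, again assuming boolean literal types are available. LooseResult, LooseIterator, StrongResult, collect and letters are hypothetical names introduced only for illustration; collect models a syntactic consumer such as for-of, which stops once done is true and so only ever sees yielded values.

```typescript
// Hypothetical loosened iterator result: the value is `any` once done is true.
type LooseResult<T> =
    | { done: false; value: T }
    | { done: true; value: any };

// Hypothetical precise result, tracking the return type separately.
type StrongResult<TYield, TReturn> =
    | { done: false; value: TYield }
    | { done: true; value: TReturn };

interface LooseIterator<T> {
    next(): LooseResult<T>;
}

// Models for-of: consumes values until done is true.
function collect<T>(source: LooseIterator<T>): T[] {
    const out: T[] = [];
    while (true) {
        const r = source.next();
        if (r.done) {
            break;
        }
        out.push(r.value); // T, because done is known to be false here
    }
    return out;
}

// A producer with the precise typing: yields strings, returns a number.
function letters(): { next(): StrongResult<string, number> } {
    const items = ["a", "b"];
    let i = 0;
    return {
        next(): StrongResult<string, number> {
            return i < items.length
                ? { done: false, value: items[i++] }
                : { done: true, value: items.length };
        },
    };
}

// Accepted: each member of StrongResult<string, number> is assignable
// to a member of LooseResult<string>.
const loose: LooseIterator<string> = letters();
const collected = collect<string>(letters());
```

The type argument on collect is written explicitly so the sketch demonstrates assignability alone and does not rely on inference.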

Griffork commented 9 years ago

Oh, ok. I was more thinking that people who get used to TYield being the only return type from next() may be in for a nasty surprise when it changes.

JsonFreeman commented 9 years ago

Right, I would not want to change it on them unless they change their type annotation.

Griffork commented 9 years ago

So people who want to use return on a generator would need to type the generator to do so and wouldn't be able to get it implicitly?

JsonFreeman commented 9 years ago

If they want the type of their return expressions to be tracked, yes. Otherwise, the type system will just ignore it.

Griffork commented 9 years ago

Well, I suspect those who want to use promises aren't going to want to cast every generator they write (and neither am I for my library).

If that is the case I guess I'll be looking into other languages that support generators better.

I probably won't ever use the generator as an iterator.

JsonFreeman commented 9 years ago

You wouldn't have to cast it, you would just have to supply a return type annotation on your generator when you defined it.

Would it be better to accept a breaking change later, and change the inference behavior with respect to return expressions? Namely, have them be part of the inference, once we have boolean literal types?

Griffork commented 9 years ago

That's your call, not mine, but since I seem to be repeatedly misunderstanding you, can you provide an example of a generator with that return type, and example usage of it using next?

JsonFreeman commented 9 years ago

Ok, let's suppose that we did not change the inference behavior upon the introduction of boolean literal types. Then you could only reasonably use the following generator as an iterator:

function* g() {
    yield 0;
    return "completed";
}
for (let x of g()) {
    // here x has type number, as it should
}
var inst = g();
while (true) {
    let next = inst.next();
    if (next.done) {
        let vReturn = next.value; // number, but should be string
        break; // the generator has finished
    }
    else {
        let vYield = next.value; // number, as expected
    }
}

Now with the type annotation

function* g(): Generator<number, string, any> { // Note the new type annotation
    yield 0;
    return "completed";
}
for (let x of g()) {
    // x still has type number
}
var inst = g();
while (true) {
    let next = inst.next();
    if (next.done) {
        let vReturn = next.value; // string, now correct
        break; // the generator has finished
    }
    else {
        let vYield = next.value; // number, as expected
    }
}

So here are the options:

  1. Infer IterableIterator now, and don't change that inference later (when we have boolean literals). But at that time, allow the user to supply a type annotation that makes the typing more precise. Now, everything you'd want to do is possible, but may require a type annotation (not a cast) to work correctly.
  2. Infer IterableIterator now, and change inference to infer the more precise type later. This means that at that point, no effort will be required from the user to make generators work in the way that you are asking for. But it would break consumers who are using your generator as an iterator, and now suddenly can't.
  3. Do nothing now, and wait until we have boolean literal types to implement generators, so that we can have the correct typing right off the bat. This means we have no breaks, but we delay generators until literal types are done.
  4. Make generators that have return expressions not iterable by the yield type. Essentially this means that instead of using TYield, we would just union TYield and TReturn so that it would not be pleasant to iterate over a generator that has a return expression.
  5. Do what I suggested in my advanced additions to the proposal. This means that we do some hackery in the type system to make Generator<TYield, TReturn> assignable to Iterable<TYield> (they would be pleasant to iterate over), but the Generator type would be somewhat of an oddball in the type system, and the compiler would pay special attention to the type arguments of the Generator type. This is a hack in the type system, but it essentially produces all the semantics that we want up front without requiring boolean literal types. We could later replace this with boolean literal types if/when they come online.

JsonFreeman commented 9 years ago

Option 5 is something I'm certainly willing to try out if you are interested in seeing what this would look like.

yortus commented 9 years ago

How will option 5 help the async use case? Even if there is a Generator<TYield, TReturn> type, the TReturn type won't appear in any of its members (until boolean literal types come along). That is, next() will still return { done: boolean; value: TYield; } for the time being.

Has option 3 been given serious consideration? It seems the only way to expose the TReturn type to consumers of the Generator<TYield, TReturn> interface. Does anyone on the team know how hard it would be to get boolean literal types into the compiler, so generators could be implemented fully with no hackery and no picking winners (i.e. out of iteration and async)?

JsonFreeman commented 9 years ago

Sorry if I wasn't clear on option 5. I meant that next would actually return { done: boolean; value: TYield | TReturn }. So calling next on something of type Generator would give you the right thing, but we'd still have separate access to the two types if you are using the nominal type Generator.
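As a rough illustration of what a consumer of next would see under this shape, here is a sketch that models only the result type, not the compiler special-casing of Generator that option 5 relies on. MergedResult and label are hypothetical names, and the generator is written by hand.

```typescript
// Hypothetical merged result: `done` is a plain boolean, so it cannot
// select between the yield type and the return type.
interface MergedResult<TYield, TReturn> {
    done: boolean;
    value: TYield | TReturn;
}

// Hand-written equivalent of: function* g() { yield 0; return "completed"; }
function g(): { next(): MergedResult<number, string> } {
    let step = 0;
    return {
        next(): MergedResult<number, string> {
            return step++ === 0
                ? { done: false, value: 0 }
                : { done: true, value: "completed" };
        },
    };
}

// Checking `done` does not narrow `value`, so the consumer has to
// inspect the value itself (or cast).
function label(r: MergedResult<number, string>): string {
    return typeof r.value === "number"
        ? "yield " + r.value
        : "return " + r.value;
}

const inst = g();
const labels = [label(inst.next()), label(inst.next())];
```

Both types are reachable from next, but distinguishing them is left to the caller until done can be typed as a literal.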

Option 3 has not been seriously considered yet, but it may be worth more discussion. It is also possible to go with option 5 temporarily until boolean literal types come along, at which point we'd be able to remove the hack introduced by option 5.

One more option: We could introduce a minimal form of boolean literal types early, without exposing many of the features of literal types, but use them as a way to track done-ness of the generator. This would allow us to keep the types separate. But I hesitate to suggest this because I'm not sure what the implications of adding the full literal type feature will be, given certain assumptions we might make about this initial implementation.

yortus commented 9 years ago

From the proposal above:

The element type is the common supertype of all the yield operands and the element types of all the yield * operands. It is an error if there is no common supertype.

Isn't this also going to break the async use case? I gave a working example above where two yield expressions have types Promise<string[]> and Array<Promise<fs.Stats>>.

The current proposal would make this an error if I'm not mistaken. But it's perfectly valid and normal in the async use case for yield expressions to have no common supertype.

What if TYield was the union type of the yield expression types? Wouldn't that work out of the box for both use cases (iteration and async)? I suppose in the iteration case, it just wouldn't catch some programmer errors (i.e., if they yield two different types in the same generator).

JsonFreeman commented 9 years ago

Yes, you are correct. It would break. I can change that so it uses the union type. This was more just for parity with the return expressions in a normal function, but as you point out, there is a meaningful difference here.
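A small sketch of the union approach, using two simple operand types with no common supertype (string and number[]) as stand-ins for the promise types in the example above. YieldUnion, Step and label are hypothetical names, and the generator is written by hand rather than with function*.

```typescript
// Union of the yield operand types; neither is a supertype of the other.
type YieldUnion = string | number[];

type Step =
    | { done: false; value: YieldUnion }
    | { done: true; value: undefined };

// Hand-written equivalent of: function* g() { yield "names"; yield [1, 2, 3]; }
function g(): { next(): Step } {
    const operands: YieldUnion[] = ["names", [1, 2, 3]];
    let i = 0;
    return {
        next(): Step {
            return i < operands.length
                ? { done: false, value: operands[i++] }
                : { done: true, value: undefined };
        },
    };
}

// The consumer narrows the union to find out which operand it received.
function label(v: YieldUnion): string {
    return typeof v === "string" ? "string:" + v : "array:" + v.length;
}

const seen: string[] = [];
const iter = g();
while (true) {
    const r = iter.next();
    if (r.done) {
        break;
    }
    seen.push(label(r.value));
}
```

With a union, both yields type-check, at the cost that the consumer must narrow each value before using it.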

Griffork commented 9 years ago

@yortus for 5, next() would return { done: boolean; value: TYield|TReturn; } initially and would be updated when boolean literals become a thing, but for-of would only return TYield.

3, 4 or 5 sound good. Honestly, as much as I want to use generators now, I'd rather wait for proper support than rush them and seriously gimp them.

4 seems like something that would be good to do some research for, as it seems like it could be useful even if 3 or 5 are chosen. In fact, 3,4 and 5 are not mutually exclusive and should all be carefully considered (since if you wait for 3, 5 would still be good for having a more correct type for for-of).