ziglang / zig

General-purpose programming language and toolchain for maintaining robust, optimal, and reusable software.
https://ziglang.org
MIT License
33.62k stars 2.46k forks source link

Async function yield statements #5611

Open adrusi opened 4 years ago

adrusi commented 4 years ago

Here's my implementation of generators in Zig 0.6.0. As far as I can tell there's no elegant way in current Zig to eliminate the intermediate copy of all values yielded by the generator. There's also a less important issue where the ritual for yielding a value is a bit ugly:

item.* = VALUE;
suspend;

I'm going to propose something that should fix both problems, and which I think is the most minimal change to make generators a proper feature. The idea is to add a pointer parameter to the calling convention for (some?) async functions, one which gets written to by yield VALUE statements. Some pseudocode:

fn foo() void yield u8 {
    var n: u8 = 0;
    while (true) {
        yield n;
        n += 1;
    }
}

fn bar() {
    var item: u8;
    var frame = async(&item) foo();
    warn("{}\n", .{ item });
    resume frame;
    warn("{}\n", .{ item });
    var item2: u8;
    @setYieldLocation(frame, &item2);
    resume frame;
    warn("{}\n", .{ item2 });
}

In this illustrative syntax, the parameter to async sets the pointer that yield writes to. The yield statement writes to that pointer and suspends. @setYieldLocation allows the caller to change the pointer after the initial creation of the frame. Once #2765 is resolved, this should allow a userland wrapper to implement a next() function without requiring an intermediate copy.

This adds quite a bit of surface area to the language. A minimal alternative that requires no language changes besides #2765 is for the userland implementation of next() to communicate the yield pointer to the generator code explicitly. This would require changing the yield ritual from

item.* = VALUE;
suspend;

to

item.*.* = VALUE;
suspend;

I consider this to be a bit too cumbersome, and it breaks with the ideal of making the right way to code the easiest way as well, since dropping the double-pointer still works, as long as you tolerate the intermediate copy.

ikskuh commented 4 years ago

Just for the record: You can do an inversion of control and put the pointer to the caller instead of the generator: https://godbolt.org/z/MPZA7J

And now… 30 Minutes later i have this: https://godbolt.org/z/1QXhSL

Looks convenient to me:

fn generator(out_result: **u32, done: *bool) void
{
    var result : u32 = 0;
    out_result.* = &result;

    var i : usize = 0;
    while(i < 3) : (i += 1) {
        yield();
        result += 1;
    }

    done.* = true;
}

pub fn main() void
{
    var gen = Generator(generator).init();

    std.debug.warn("result[0] = {}\n", .{ gen.next() });
    std.debug.warn("result[1] = {}\n", .{ gen.next() });
    std.debug.warn("result[2] = {}\n", .{ gen.next() });
    std.debug.warn("result[3] = {}\n", .{ gen.next() });
    std.debug.warn("result[4] = {}\n", .{ gen.next() });
}
fengb commented 4 years ago

https://github.com/ziglang/zig/pull/4059 my best attempt ended in failure. We need more than just a "yield" statement if we want to make generators work with async functions.

adrusi commented 4 years ago

Async generators are admittedly more important to get working in Zig than in e.g. Rust, which also currently lacks that functionality. Unlike in Rust, in Zig it's possible to silently make your function async, so having generators that can't handle calls to async functions is a massive footgun.

I've been working on that problem. I think I have a solution that will work, although I'm currently trying to work around a zig compiler bug. Current incarnation of my solution

Assuming that some evolution of that ends up working, then yes, yield statements are not the only addition required to make generators as efficient and footgun-free as possible, but they are one of two main language changes that would be necessary.

The other is (again, assuming that my solution is on the right track) a way to await a frame without capturing its return value, and ideally also a more friendly way of setting the return value location for an async call.

saltzm commented 3 years ago

I'm not sure what the etiquette is around making counter-proposals for implementations, but I was playing around with implementing a generator this weekend with a friend and came up with this implementation which I thought was relatively simple and easy to use. A yield keyword in zig would definitely help make this more elegant though.

Here's a usage sample from the tests (full code in the link):

fn squares(generator: *Yielder(i64)) void {
    var n: i64 = 1;
    while (true) {
        generator.yield(n * n);
        n += 1;
    }
}

test "generate squares" {
    var generator = Generator(i64, squares){};
    var i: i64 = 1;
    while (i < 100000) : (i += 1) {
        expect(generator.next() == i * i);
    }
}

fn fibonacci(generator: *Yielder(i64)) void {
    var a: i64 = 0;
    var b: i64 = 1;
    while (true) {
        generator.yield(b);
        const next = a + b;
        a = b;
        b = next;
    }
}

test "generate fibonacci" {
    var generator = Generator(i64, fibonacci){};

    expect(generator.next() == @intCast(i64, 1));
    expect(generator.next() == @intCast(i64, 1));
    expect(generator.next() == @intCast(i64, 2));
    expect(generator.next() == @intCast(i64, 3));
    expect(generator.next() == @intCast(i64, 5));
    expect(generator.next() == @intCast(i64, 8));
    expect(generator.next() == @intCast(i64, 13));
}
saltzm commented 3 years ago

I think I made them work now, even with a suspend point that's supposed to be resumed by something other than the Generator, (e.g. the event loop): Link

I then rephrased it as an "UnbufferedChannel" but you could easily rename it to Generator and call its methods yield/next instead of send and receive, and I think it's even simpler: Link

EDIT: updated link for 0.9.0

fn fibonacci_w_channel(channel: *UnbufferedChannel(i64)) void {
    var a: i64 = 0;
    var b: i64 = 1;
    while (true) {
        channel.send(b);
        const next = a + b;
        a = b;
        b = next;
    }
}

fn test_fibonacci_w_channel(finished_test: *bool) void {
    var channel = UnbufferedChannel(i64){};
    var fibonacci_generator = async fibonacci_w_channel(&channel);
    expect(channel.recv() == @intCast(i64, 1));
    expect(channel.recv() == @intCast(i64, 1));
    expect(channel.recv() == @intCast(i64, 2));
    expect(channel.recv() == @intCast(i64, 3));
    expect(channel.recv() == @intCast(i64, 5));
    expect(channel.recv() == @intCast(i64, 8));
    expect(channel.recv() == @intCast(i64, 13));
    finished_test.* = true;
}