golang / go

The Go programming language
https://go.dev

proposal: chans: new package with chans.Seq #67205

Open earthboundkid opened 4 months ago

earthboundkid commented 4 months ago

Proposal Details

#61899 would add iteration-related functions to slices, and #61900 does the same for maps. There should also be a package with channel iteration helpers. I propose adding package chans with chans.Seq:

package chans

import "iter"

// Seq returns an iterator that yields the values of ch until it is closed.
func Seq[T any](ch <-chan T) iter.Seq[T] {
    return func(yield func(T) bool) {
        for v := range ch {
            if !yield(v) {
                return
            }
        }
    }
}

An example use would be s := slices.Collect(chans.Seq(ch)) as a convenient way to collect all values from a channel into a slice.

DeedleFake commented 4 months ago

I have a package that implements a bunch of iterator-related stuff so that I can play around with it. To supplement the simpler OfChan() function that it provides that does this, I also added a RecvContext() function that can use a context to cancel the iterator, as well as a SendContext() function that does the same in reverse, sending sequence values to a channel until a context is canceled. I'd like to add RecvContext(), under whatever name, for consideration, too.

earthboundkid commented 4 months ago

chans.Seq seems definitely necessary. I was thinking we might also want the reverse, which would be like SendContext minus the context part. Putting it together, maybe it should be these four functions to start with:

func Seq[T any](<-chan T) iter.Seq[T]
func SeqContext[T any](context.Context, <-chan T) iter.Seq[T]
func FromSeq[T any](iter.Seq[T]) <-chan T
func FromSeqContext[T any](context.Context, iter.Seq[T]) <-chan T 

There was also talk about having helpers like FanIn and FanOut back when generics were still experimental, but those can be a separate proposal.

DeedleFake commented 4 months ago

FromSeq() is wrong. It should be

func FromSeq[T any](iter.Seq[T], chan<- T)

rather than returning a channel.

earthboundkid commented 4 months ago

That makes sense. You can reuse an existing channel and you get better type inference.

earthboundkid commented 4 months ago

Other names for chans.FromSeq[Context] might be chans.Send or chans.Copy.

apparentlymart commented 4 months ago

For the two variants that take a context.Context:

I'm accustomed to seeing functions that take a context return an error because an operation being cancelled (or, equivalently, hitting a deadline) is usually modeled as a kind of failure. These functions don't do that, and of course they cannot because the current iterator design only supports infallible iterators.

I can imagine having the caller use ctx.Err() once the iterator stops yielding to see if the iteration might have been terminated early due to cancellation or deadline. It presumably can't know for sure, because the cancellation might have happened between the last call to the iterator and the ctx.Err() call. That edge case probably doesn't matter in practical programs.

s := slices.Collect(chans.SeqContext(ctx, ch))
if ctx.Err() != nil {
    // s may not be complete, then
}

Overall this reminds me of the bufio.Scanner design where users are expected to call Err on the scanner after Scan returns false, to distinguish between a successful full scan vs. early termination due to an error. I remember in one of the earlier iterator proposals some participants worrying that this design was easy to accidentally use incorrectly. That concern would presumably apply to this situation too.

Overall I don't feel super worried about either of these details myself. They certainly don't seem disqualifying, and I raise them largely just for the sake of discussion.

(The non-context versions seem uncontroversial to me; they are pretty obvious implications of the iterators design.)

earthboundkid commented 4 months ago

You could have func SeqContext(context.Context, <-chan T) iter.Seq2[T, error] return the ctx.Err() as the second iterator parameter.

DeedleFake commented 4 months ago

func FromSeqContext(context.Context, iter.Seq[T], chan<- T) error is possible, too.

ianlancetaylor commented 4 months ago

Just a note that the main reason I haven't proposed a chans package is exactly the question of how to handle contexts.

bjorndm commented 4 months ago

How about treating contexts like log/slog did? That would be most consistent and easy to teach.

AndrewHarrisSPU commented 4 months ago

> How about treating contexts like log/slog did? That would be most consistent and easy to teach.

slog evolved to omit contexts, but I'm not sure its lessons carry over well to iterators; slog's approach contrasts with other places where the standard library includes contexts.

slog explored a number of approaches with contexts and eventually decided to maximally push the complexity out. Contexts are in the Handler interface, but are never observed in slog's implementations - neither for control flow nor as value stores. The driving questions concerned contexts as value stores, rather than contexts as cancellation in control flows.

Elsewhere in the standard library, contexts tend to be used in domain-specific ways: net dialing, database/sql queries, http requests, os/exec commands, trace regions and tasks, etc. In each of these cases, there is a pretty useful notion of cancellation that arises in solutions for the domain.

There's undoubtedly a useful notion of cancellation here with chans; also I wonder how often we'd have a more domain-specific iterator, e.g. over the results of a database query, that doesn't already have context cancellation baked in.

earthboundkid commented 4 months ago

To get around having a dependency on context, there could be variants that take <-chan struct{} as a cancel channel. It could also be made generic against <-chan struct{} and <-chan time.Time, so it could accept a time.Timer.C.

earthboundkid commented 4 months ago

Playground

func Seq[T, Done any](ch <-chan T, done <-chan Done) iter.Seq[T] {
    return func(yield func(T) bool) {
        for {
            select {
            case v, ok := <-ch:
                if !ok {
                    return
                }
                if !yield(v) {
                    return
                }

            case <-done:
                return
            }
        }
    }
}

func Count() <-chan int { /*...*/ }

func main() {
    for n := range Seq(Count(), time.After(1*time.Microsecond)) {
        fmt.Println(n)
    }
}

Interestingly, this works in a real program but it times out on Playground, presumably because Playground time is an illusion.

dshearer commented 1 week ago

> You could have func SeqContext(context.Context, <-chan T) iter.Seq2[T, error] return the ctx.Err() as the second iterator parameter.

This is my favorite proposal. What are the arguments against it?