Open amikai opened 4 months ago
The part I find most troublesome to implement is that letting the user decide when to stop the loop means the developer has to implement a complex mechanism to close the channel on the receiver side.
I think we also need to consider how to use it with context. However, I have no idea how to do that at the moment.
Hello @amikai! I've been toying with ideas in this direction on my end as well, so I'm glad to see others think this is an exciting possibility. Thank you for writing up this proposal.
When users break the loop, the running goroutines should be canceled to avoid wasting resources.
This one is quite tricky since it requires conc to know how to cancel running goroutines. The standard way to do this is to have a cancellable context, but that means that conc must either be given a cancel() func, or it must own the context that child goroutines respect.
One way around it is just to leave cancellation to the caller. I personally like this because although it is a little harder to use, it's more predictable IMO and leaves control fully in the hands of the user.
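// fetchSquare does the per-element work and is expected to respect ctx.
func fetchSquare(ctx context.Context, id int) int {
	...
}

ids := slices.Values([]int{1, 2, 3, 4, 5, 6})
ctx, cancel := context.WithCancel(ctx)
for v := range iter.SeqMap(ids, func(id int) int { return fetchSquare(ctx, id) }) {
	if v == 25 {
		cancel() // the caller stops in-flight work before leaving the loop
		break
	}
	fmt.Println(v)
}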
letting the user decide when to stop the loop means the developer has to implement a complex mechanism to close the channel on the receiver side.
I think we can design this so that the consumer doesn't have to worry about it. Because of the design of the iter package, control will be yielded back to conc when the user breaks the loop (the yield func returns false), which gives us the opportunity to clean up goroutines and propagate panics.
This is a good idea, but I want to ask whether it might cause a goroutine leak if the user breaks out of the for loop without calling cancel.
The opposite situation is also possible: the context has already been canceled, but the for loop never breaks.
Perhaps we can pass the context by changing the method to iter.SeqMapCtx(ctx, xxx), or by creating a struct with WithContext like Pool.WithContext(ctx). This way, users have the choice to cancel themselves, but after a break, iter.SeqMap will definitely cancel (we can derive a cancellable context during implementation).
This will make it easier for users because they won’t need to worry about forgetting to cancel it. But I think this will cause problems for developers, including increased implementation difficulty and how to handle panic situations.
// function declaration
func SeqMapCtx[In, Out any](context.Context, iter.Seq[In], func(context.Context, In) Out) iter.Seq[Out] {...}

func fetchSquare(ctx context.Context, id int) int {
	...
}

ids := slices.Values([]int{1, 2, 3, 4, 5, 6})
ctx := context.Background()
// SeqMapCtx derives a cancellable context from ctx
for v := range iter.SeqMapCtx(ctx, ids, fetchSquare) {
	if v == 25 {
		break // when yield returns false, cancel the context and clean up goroutines
	}
	fmt.Println(v)
}
This is my opinion. I hope it helps.
Pool and ContextPool. If the user chooses to pass a context, then conc helps cancel the tasks and clean up goroutines after the break.
NOTE: I haven't built a PoC yet, so I'm not sure whether this can be implemented.
Proposal
I propose four functions:
SeqMap under conc/iter
SeqMap2 under conc/iter: similar to iter.SeqMap but returns two values, allowing it to be used for error handling.
SeqMap under conc/stream: similar to iter.SeqMap, but yields results in order.
SeqMap2 under conc/stream: similar to stream.SeqMap but returns two values, allowing it to be used for error handling.
Some thoughts
I believe the best aspects of a concurrent iterator are its ability to calculate lazily and its resource-saving capability. Functions like iter.Map and iter.ForEach will exhaust all elements, but with an iterator, when the output iterator stops, the input iterator should also cease retrieving elements and stop all goroutines.