Open-NET-Libraries / Open.ChannelExtensions

A set of extensions for optimizing/simplifying System.Threading.Channels usage.
https://open-net-libraries.github.io/Open.ChannelExtensions/api/Open.ChannelExtensions.Extensions.html#methods
MIT License

Why there is prefetching in ReadUntilCancelledAsync? #18

Closed azhmur closed 2 years ago

azhmur commented 2 years ago

The current implementation of ReadUntilCancelledAsync, which is widely used to implement other methods (PipeAsync etc.), has a prefetching mechanism: it acquires the next item before waiting for the current item to complete. In cases where items aren't similar in processing cost, this ends up blocking at the tail end of the channel. There are free processors, but they do nothing because one processor holds two items, one being processed and one just taken.

Current code:

        do
        {
            var next = new ValueTask();
            while (
                !cancellationToken.IsCancellationRequested
                && reader.TryRead(out T? item))
            {
                await next.ConfigureAwait(false);
                next = receiver(item, index++);
            }
            await next.ConfigureAwait(false);
        }
        while (
            !cancellationToken.IsCancellationRequested
            && await reader.WaitToReadAsync(cancellationToken).ConfigureAwait(false));

I had simplified it to:

        do
        {
            while (
                !cancellationToken.IsCancellationRequested
                && reader.TryRead(out T? item))
            {
                await receiver(item, index++).ConfigureAwait(false);
            }
        }
        while (
            !cancellationToken.IsCancellationRequested
            && await reader.WaitToReadAsync(cancellationToken).ConfigureAwait(false));

This solves the tail blocking, and all tests still pass. But I wonder: are there cases this prefetching was designed to solve? (It introduces intentional complexity into the code.)
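To make the tail blocking concrete, here is a hypothetical repro sketch (not part of the library; the names `Process` and `ReadAllPrefetch` and the delay values are invented for illustration). Two workers run the prefetching loop over a channel holding one slow item and one fast item; the first worker dequeues both before suspending, so the second worker finds the channel empty and exits:

```csharp
using System.Threading.Channels;

var channel = Channel.CreateUnbounded<int>();
channel.Writer.TryWrite(1000); // slow item: ~1s of simulated work
channel.Writer.TryWrite(10);   // fast item: ~10ms of simulated work
channel.Writer.Complete();

async ValueTask Process(int delayMs, int worker)
{
    Console.WriteLine($"worker {worker} took item ({delayMs}ms)");
    await Task.Delay(delayMs);
}

// Prefetching reader, mirroring the loop quoted above:
// it dequeues the next item before awaiting the current one.
async Task ReadAllPrefetch(int worker)
{
    ChannelReader<int> reader = channel.Reader;
    var next = new ValueTask();
    while (reader.TryRead(out int item))
    {
        await next.ConfigureAwait(false);
        next = Process(item, worker);
    }
    await next.ConfigureAwait(false);
}

// Worker 0 reads the slow item, starts processing it, then prefetches
// the fast item before suspending. Worker 1 then finds the channel
// empty and exits, so the two items run sequentially (~1010ms)
// instead of in parallel (~1000ms).
await Task.WhenAll(ReadAllPrefetch(0), ReadAllPrefetch(1));
```

With the simplified (non-prefetching) loop, worker 0 suspends on the slow item before touching the channel again, so worker 1 picks up the fast item concurrently.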

electricessence commented 2 years ago

I picked up this technique from chatting with Ben Adams about a project I published. It's an optimization he presented to me. It's not something you would always do, but in this case the "pre-fetch" seemed a logical step, as you are more likely to saturate threads this way.

If you don't think this is appropriate, I could introduce a parameter to allow disabling it. At the moment, I can't think of any reason you wouldn't want the optimization.

electricessence commented 2 years ago

ends up blocking at the tail end of the channel

Not sure what you're suggesting here. If you have a bounded channel and it reaches its configured limit of (let's say) 100, then you really have 101. In other cases, where the channel is either unbounded or not full, you're simply preemptively acquiring an item.
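The "really have 101" effect can be sketched like this (a hypothetical illustration with a capacity of 2 rather than 100): a prefetching reader dequeues an item before it is processed, which frees a buffer slot, so the number of items in flight can exceed the configured bound by one.

```csharp
using System.Threading.Channels;

var channel = Channel.CreateBounded<int>(2);
channel.Writer.TryWrite(1);
channel.Writer.TryWrite(2);

// The channel is at capacity, so a third write is rejected.
bool fullWrite = channel.Writer.TryWrite(3); // false

// A prefetching reader dequeues item 1 before actually processing it,
// opening a slot in the buffer...
channel.Reader.TryRead(out int prefetched);

// ...so now 3 items are "owned" at once: 2 buffered plus 1 held
// (unprocessed) by the reader. Effective in-flight count = capacity + 1.
bool refill = channel.Writer.TryWrite(3); // true
```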

electricessence commented 2 years ago

All that said, I can see how this micro-optimization may not apply to channels the same way it might in other situations. It's meant for cases where you expect some amount of potential async delay, and with .TryRead there really is none.

electricessence commented 2 years ago

I think you'll notice that I do this for writing to a channel as well.

electricessence commented 2 years ago

I'm quite willing to change this. Would you look at the write version of this and tell me what you think? https://github.com/Open-NET-Libraries/Open.ChannelExtensions/blob/fcba7caeacb8c902eb40aefd186d24093c395525/Open.ChannelExtensions/Extensions.Write.cs#L41-L51 I think in the case of writing from an enumerable, the enumerable can take longer to yield than it takes for the next opening in the channel to occur, so there is some value there. But I think you're right about the read operation; I can't foresee how it can benefit.
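For context, the write-side idea can be sketched as follows. This is a paraphrase built from the read loop quoted earlier, not the actual code at the link above; `WriteAllAsync` is an invented name. The point is that advancing the enumerator (which may be expensive, e.g. a generator) overlaps with the previous pending write:

```csharp
using System.Threading.Channels;

static async ValueTask WriteAllAsync<T>(
    ChannelWriter<T> writer,
    IEnumerable<T> source,
    CancellationToken cancellationToken = default)
{
    var next = new ValueTask();
    foreach (T item in source) // MoveNext() runs while the previous write may still be pending
    {
        await next.ConfigureAwait(false);              // complete the previous write
        next = writer.WriteAsync(item, cancellationToken); // start the next one
    }
    await next.ConfigureAwait(false);
}
```

Unlike the read side, where .TryRead returns synchronously, a write to a full bounded channel can genuinely suspend, so overlapping it with enumeration can pay off.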

electricessence commented 2 years ago

This has been corrected as per your suggestion. https://www.nuget.org/packages/Open.ChannelExtensions/6.0.1

https://github.com/Open-NET-Libraries/Open.ChannelExtensions/blob/master/Open.ChannelExtensions/Extensions.Read.cs#L100-L105

electricessence commented 2 years ago

@azhmur Thank you for pointing this out. :)

azhmur commented 2 years ago

Thank you for fast and reasonable response. Great library, I wish this would be part of System.IO.Pipelines.

electricessence commented 2 years ago

I wish this would be part of System.IO.Pipelines.

@azhmur Yeah, there are a large number of extensions I wish were simply built into the ecosystem. Check out my database extensions library: https://github.com/Open-NET-Libraries/Open.Database.Extensions (there's even a package for channels). In any case, a large number of basic operations could be streamlined. I couldn't just sit by and not have best practices built in.

electricessence commented 2 years ago

Thank you for fast and reasonable response.

@azhmur I love that someone took the time to scrutinize the code and post an issue. As I mentioned, I do think there is some reasonable optimization in that pattern for specific cases, but this was not one of them. I hadn't thought about it so deeply before. Thanks again.