dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

ReadAsync(Memory<char> buffer, CancellationToken cancellationToken = default(CancellationToken)) of StreamReader causes AsyncStateMachineBox allocations #99264

Open PatVax opened 8 months ago

PatVax commented 8 months ago

Description

When reading from a StreamReader asynchronously with the Memory<char> overload of ReadAsync, the state machines for ReadAsyncInternal and ReadBufferAsync appear to be boxed on every, or at least most, async calls, causing considerable allocation overhead. If the same ReadAsync overload is called on a FileStream instead, the only allocations are the FileStream itself and the required buffers. The allocations were found with the JetBrains dotMemory profiler. A quick test with the other ReadAsync overloads yielded the same results. For ReadLineAsync the problematic allocations were, of course, orders of magnitude smaller than the string allocations. In my example the boxing allocations result in 6-7x more allocated managed heap memory than with FileStream.

Reproduction Steps

To reproduce, read any text file, or use the following to create one for testing:

await using (StreamWriter writer = new("test.txt"))
    for (int i = 0; i < 10_000; i++)
        writer.WriteLine(string.Concat(Enumerable.Repeat(i.ToString()[0], 100)));

The following functions and members are declared:

async ValueTask WithStreamReader()
{
    GC.Collect();
    GC.WaitForPendingFinalizers();
    using (StreamReader stream = new("test.txt",
               options: new FileStreamOptions
               {
                   Access = FileAccess.Read,
                   Mode = FileMode.Open,
                   Share = FileShare.Read,
                   Options = FileOptions.Asynchronous | FileOptions.SequentialScan
               }))
    {
        var inputBuffer = ArrayPool<char>.Shared.Rent(100);
        while (true)
        {
            var mem = inputBuffer.AsMemory();
            int readCount;
            if ((readCount = await stream.ReadAsync(mem)) == 0) break;
            Console.WriteLine(inputBuffer, 0, readCount);
        }

        ArrayPool<char>.Shared.Return(inputBuffer);
    }

    GC.Collect();
    GC.WaitForPendingFinalizers();
}

async ValueTask WithFileStream()
{
    GC.Collect();
    GC.WaitForPendingFinalizers();
    await using (FileStream stream = File.Open("test.txt",
                     options: new FileStreamOptions
                     {
                         Access = FileAccess.Read,
                         Mode = FileMode.Open,
                         Share = FileShare.Read,
                         Options = FileOptions.Asynchronous | FileOptions.SequentialScan
                     }))
    {
        var inputBuffer = ArrayPool<byte>.Shared.Rent(100);
        var outputBuffer = ArrayPool<char>.Shared.Rent(100);
        while (true)
        {
            var mem = inputBuffer.AsMemory();
            int readCount;
            if ((readCount = await stream.ReadAsync(mem)) == 0) break;
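            // Decode only the bytes actually read, and print only the chars produced.
            int charCount = Helper.Decoder.GetChars(mem.Span[..readCount], outputBuffer.AsSpan(), false);
            Console.WriteLine(outputBuffer, 0, charCount);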
        }

        ArrayPool<byte>.Shared.Return(inputBuffer);
        ArrayPool<char>.Shared.Return(outputBuffer);
    }

    GC.Collect();
    GC.WaitForPendingFinalizers();
}

public static class Helper
{
    public static readonly Decoder Decoder = Encoding.UTF8.GetDecoder();
}

With the above in place, the following code runs the test:

await using (StreamWriter writer = new("test.txt"))
    for (int i = 0; i < 10_000; i++)
        writer.WriteLine(string.Concat(Enumerable.Repeat(i.ToString()[0], 100)));
for (int i = 0; i < 10; i++) await WithStreamReader();
for (int i = 0; i < 10; i++) await WithFileStream();
File.Delete("test.txt");
return;

Expected behavior

WithStreamReader() runs without internally boxing async state machines.

Actual behavior

WithStreamReader() allocates async state machine boxes internally in the ReadAsync method of StreamReader.

Regression?

No response

Known Workarounds

Use FileStream and handle various encodings manually.

Configuration

.NET 8.0.201, Windows 10 21H2 (OS Build 19044.4046), x64. I wasn't able to profile for x86.

Other information

No response

ghost commented 8 months ago

Tagging subscribers to this area: @dotnet/area-system-io See info in area-owners.md if you want to be subscribed.

Labels: `area-System.IO`, `untriaged`

stephentoub commented 8 months ago

StreamReader.ReadAsync does additional work on top of whatever FileStream.ReadAsync does. In particular, whereas the latter just reads bytes directly into the caller's buffer, the former reads bytes and then translates those bytes into chars to be written to the caller's buffer. That means StreamReader.ReadAsync needs to do work after it reads from the underlying stream, regardless of what the underlying API does, and that means it's using an async method.

PatVax commented 8 months ago

Does the boxing operation happen when GetAwaiter() is called on the inner state machine? If so why doesn't it happen when I use FileStream itself in my own code and use the results of it in my own async method?

stephentoub commented 8 months ago

> Does the boxing operation happen when GetAwaiter() is called on the inner state machine?

A state machine for an async method gets lifted to the heap the first time an async method awaits something that's not completed. Lots of details are here: https://devblogs.microsoft.com/dotnet/how-async-await-really-works/
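As a rough illustration of the point above, here is a minimal sketch (not StreamReader's actual code; `StateMachineDemo` and `AddOneAsync` are hypothetical names). The same async method completes synchronously when the awaited operation is already done, and must suspend when it is still pending. Only the pending path needs the method's state to survive the suspension, which is where the runtime moves the state machine to the heap.

```csharp
using System;
using System.Threading.Tasks;

public static class StateMachineDemo
{
    // Hypothetical helper: awaits a source, then does work afterwards, much as
    // StreamReader decodes bytes into chars after the underlying read returns.
    public static async ValueTask<int> AddOneAsync(Task<int> source)
    {
        int value = await source; // suspends only if 'source' has not completed yet
        return value + 1;
    }

    public static async Task Main()
    {
        // Already-completed source: the method runs to completion synchronously.
        ValueTask<int> fast = AddOneAsync(Task.FromResult(41));
        Console.WriteLine(fast.IsCompleted); // True

        // Pending source: the method suspends at the await, so its locals and
        // position must be kept alive on the heap until the source completes.
        var tcs = new TaskCompletionSource<int>();
        ValueTask<int> slow = AddOneAsync(tcs.Task);
        Console.WriteLine(slow.IsCompleted); // False

        tcs.SetResult(41);
        Console.WriteLine(await slow); // 42
    }
}
```

The sketch only shows when suspension happens; observing the allocation itself requires a profiler or allocation counters in a Release build.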

> If so why doesn't it happen when I use FileStream itself in my own code and use the results of it in my own async method?

It's not implemented using an async method. It's pooling a custom IValueTaskSource that's backing the ValueTask being returned from the ReadAsync method. That accrues to StreamReader as well, but as mentioned in addition to that operation StreamReader.ReadAsync is then doing additional work after that operation completes.
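To make the idea of a reusable IValueTaskSource concrete, here is a minimal sketch built on `ManualResetValueTaskSourceCore<T>`. It is not FileStream's implementation; `ReusableIntSource`, `StartOperation`, and `Complete` are hypothetical names, and the sketch assumes only one operation is in flight at a time and that each returned ValueTask is awaited exactly once.

```csharp
#nullable enable
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Sources;

// One heap object that can back many ValueTask<int> instances, one after another.
public sealed class ReusableIntSource : IValueTaskSource<int>
{
    // Mutable struct holding result, continuation and version; must not be readonly.
    private ManualResetValueTaskSourceCore<int> _core;

    // Begins a new operation, reusing this object instead of allocating a Task.
    public ValueTask<int> StartOperation()
    {
        _core.Reset(); // bumps the version so stale ValueTasks are rejected
        return new ValueTask<int>(this, _core.Version);
    }

    // Called by whoever finishes the operation (for example an I/O callback).
    public void Complete(int result) => _core.SetResult(result);

    public int GetResult(short token) => _core.GetResult(token);

    public ValueTaskSourceStatus GetStatus(short token) => _core.GetStatus(token);

    public void OnCompleted(Action<object?> continuation, object? state, short token,
        ValueTaskSourceOnCompletedFlags flags)
        => _core.OnCompleted(continuation, state, token, flags);
}

public static class Program
{
    public static async Task Main()
    {
        var source = new ReusableIntSource();

        ValueTask<int> first = source.StartOperation();
        source.Complete(10);
        Console.WriteLine(await first);  // 10

        // The same object now backs a second operation.
        ValueTask<int> second = source.StartOperation();
        source.Complete(20);
        Console.WriteLine(await second); // 20
    }
}
```

An async method layered on top of such a source still has its own state machine, which is why the additional work in StreamReader.ReadAsync shows up as separate allocations.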

PatVax commented 8 months ago

I had actually read this particular blog post before; I guess I didn't connect what was written there with what I observed today. I guess there isn't really a better way of doing this with async/await internally. It is still better than it was on .NET Framework. I think this issue can be closed, then. I don't see any room for improvement here, unless there is a better way of reading text asynchronously than what I posted above.

PatVax commented 8 months ago

What about [AsyncMethodBuilder(typeof(PoolingAsyncValueTaskMethodBuilder))]? Can it somehow be added to an existing method in .NET? It could be interesting to see how ReadAsync would behave in this case.
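For reference, this is roughly what opting a single method into the pooling builder looks like in user code. It is a minimal sketch, assuming C# 10 or later on .NET 6 or later; `PoolingDemo` and `CountCharsAsync` are hypothetical names. For a method returning `ValueTask<T>` the open generic form `PoolingAsyncValueTaskMethodBuilder<>` is the one to use.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class PoolingDemo
{
    // Opts this one method into the pooling builder: when the method has to
    // suspend, the runtime may reuse a pooled box instead of allocating a new one.
    [AsyncMethodBuilder(typeof(PoolingAsyncValueTaskMethodBuilder<>))]
    public static async ValueTask<int> CountCharsAsync(Task<string> pendingText)
    {
        string text = await pendingText;
        return text.Length;
    }

    public static async Task Main()
    {
        var tcs = new TaskCompletionSource<string>();
        ValueTask<int> pending = CountCharsAsync(tcs.Task); // suspends at the await
        tcs.SetResult("hello");
        Console.WriteLine(await pending); // 5
    }
}
```

With a pooling builder the returned ValueTask must be awaited exactly once, because the backing object is returned to the pool after the result is consumed.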

stephentoub commented 8 months ago

It could, but it's not necessarily a win. Do you have evidence of this being a performance problem beyond seeing the allocations show up in a profiler when the I/O completes asynchronously? You're welcome to try it and come back with perf numbers on various scenarios, but we use the attribute sparingly. As called out in https://devblogs.microsoft.com/dotnet/performance-improvements-in-net-6/, "So, from a performance perspective, it’s best to use this capability only in places where it’s both likely to matter and where performance testing demonstrates it moves the needle in the right direction. We can see, of course, that there are scenarios where in addition to saving on allocation, it actually does improve throughput, which at the end of the day is typically what one is really focusing on improving when they’re measuring allocation reduction (i.e. reducing allocation to reduce time spent in garbage collection).".

PatVax commented 8 months ago

Not necessarily evidence, but I am building an application that will be used to test hardware it communicates with via the LIN protocol. It sends messages periodically, so if the value to be sent next is updated shortly before transmission and a GC collection happens to occur at about that time, the updated value might be sent 50-500 ms later than it could have been. This wouldn't happen in the end environment, since the controller would be an embedded program. Hence tests made using my software would be (albeit very slightly) less precise. Therefore I am trying to minimize GC pressure wherever possible. For file I/O this is less of an issue, since no tests would be running while a file is being read and I can manually trigger a GC collection after I have read or disposed of a file. Still, before I settle on forcing collections after operations that necessarily allocate, I am trying to find a way to reduce or even eliminate the allocations (or rather collections) altogether.

PatVax commented 8 months ago

I have made a copy of StreamReader in which I added [AsyncMethodBuilder(typeof(PoolingAsyncValueTaskMethodBuilder<>))] to the definitions of ReadAsyncInternal and ReadBufferAsync. I had to make some changes to the original implementation because some types and members used by StreamReader are internal. These were mostly exception factories, but in ReadBlockAsync I had to call base.ReadBlockAsync instead of ReadBlockAsyncInternal. Since that method overrides the TextReader implementation, this might be problematic. Still, ReadAsync produces correct results for simple sequential reads. Following is the diff between AmortizedStreamReader and the original StreamReader. Each pair of lines shares a source line number: the first line is the AmortizedStreamReader version, the second is the original StreamReader line:

8       using System.Runtime.CompilerServices;
8
18          public class AmortizedStreamReader : TextReader
18          public class StreamReader : TextReader
21              public static new readonly AmortizedStreamReader Null = new NullAmortizedStreamReader();
21              public static new readonly StreamReader Null = new NullStreamReader();
72              private readonly bool _closable; // Whether to close the underlying stream.
72              private readonly bool _closable;  // Whether to close the underlying stream.
91                  throw new InvalidOperationException("The stream is currently in use by a previous operation on the stream.");
91                  throw new InvalidOperationException(SR.InvalidOperation_AsyncIOInProgress);
98              private AmortizedStreamReader()
98              private StreamReader()
100                 Debug.Assert(this is NullAmortizedStreamReader);
100                 Debug.Assert(this is NullStreamReader);
105             public AmortizedStreamReader(Stream stream)
105             public StreamReader(Stream stream)
110             public AmortizedStreamReader(Stream stream, bool detectEncodingFromByteOrderMarks)
110             public StreamReader(Stream stream, bool detectEncodingFromByteOrderMarks)
115             public AmortizedStreamReader(Stream stream, Encoding encoding)
115             public StreamReader(Stream stream, Encoding encoding)
120             public AmortizedStreamReader(Stream stream, Encoding encoding, bool detectEncodingFromByteOrderMarks)
120             public StreamReader(Stream stream, Encoding encoding, bool detectEncodingFromByteOrderMarks)
135             public AmortizedStreamReader(Stream stream, Encoding encoding, bool detectEncodingFromByteOrderMarks, int bufferSize)
135             public StreamReader(Stream stream, Encoding encoding, bool detectEncodingFromByteOrderMarks, int bufferSize)
140             public AmortizedStreamReader(Stream stream, Encoding? encoding = null, bool detectEncodingFromByteOrderMarks = true, int bufferSize = -1, bool leaveOpen = false)
140             public StreamReader(Stream stream, Encoding? encoding = null, bool detectEncodingFromByteOrderMarks = true, int bufferSize = -1, bool leaveOpen = false)
144                     throw new ArgumentNullException("stream");
144                     ThrowHelper.ThrowArgumentNullException(ExceptionArgument.stream);
149                     throw new ArgumentException("Stream was not readable.");
149                     throw new ArgumentException(SR.Argument_StreamNotReadable);
181             public AmortizedStreamReader(string path)
181             public StreamReader(string path)
186             public AmortizedStreamReader(string path, bool detectEncodingFromByteOrderMarks)
186             public StreamReader(string path, bool detectEncodingFromByteOrderMarks)
191             public AmortizedStreamReader(string path, Encoding encoding)
191             public StreamReader(string path, Encoding encoding)
196             public AmortizedStreamReader(string path, Encoding encoding, bool detectEncodingFromByteOrderMarks)
196             public StreamReader(string path, Encoding encoding, bool detectEncodingFromByteOrderMarks)
201             public AmortizedStreamReader(string path, Encoding encoding, bool detectEncodingFromByteOrderMarks, int bufferSize)
201             public StreamReader(string path, Encoding encoding, bool detectEncodingFromByteOrderMarks, int bufferSize)
206             public AmortizedStreamReader(string path, FileStreamOptions options)
206             public StreamReader(string path, FileStreamOptions options)
211             public AmortizedStreamReader(string path, Encoding encoding, bool detectEncodingFromByteOrderMarks, FileStreamOptions options)
211             public StreamReader(string path, Encoding encoding, bool detectEncodingFromByteOrderMarks, FileStreamOptions options)
223                     throw new ArgumentException("Stream was not readable.", nameof(options));
223                     throw new ArgumentException(SR.Argument_StreamNotReadable, nameof(options));
357                     throw new ArgumentException("Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source coll
357                     throw new ArgumentException(SR.Argument_InvalidOffLen);
364                 GetType() == typeof(AmortizedStreamReader) ? ReadSpan(buffer) :
364                 GetType() == typeof(StreamReader) ? ReadSpan(buffer) :
436                     throw new ArgumentException("Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source coll
436                     throw new ArgumentException(SR.Argument_InvalidOffLen);
446                 if (GetType() != typeof(AmortizedStreamReader))
446                 if (GetType() != typeof(StreamReader))
885             /// <see cref="AmortizedStreamReader"/>) or returned (to the caller) may be lost.
885             /// <see cref="StreamReader"/>) or returned (to the caller) may be lost.
893                 if (GetType() != typeof(AmortizedStreamReader))
893                 if (GetType() != typeof(StreamReader))
1013            /// <see cref="AmortizedStreamReader"/>) or returned (to the caller) may be lost.
1013            /// <see cref="StreamReader"/>) or returned (to the caller) may be lost.
1021                if (GetType() != typeof(AmortizedStreamReader))
1021                if (GetType() != typeof(StreamReader))
1058                    throw new ArgumentException("Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source coll
1058                    throw new ArgumentException(SR.Argument_InvalidOffLen);
1065                if (GetType() != typeof(AmortizedStreamReader))
1065                if (GetType() != typeof(StreamReader))
1081                if (GetType() != typeof(AmortizedStreamReader))
1081                if (GetType() != typeof(StreamReader))
1098            [AsyncMethodBuilder(typeof(PoolingAsyncValueTaskMethodBuilder<>))]
1098
1099            internal virtual async ValueTask<int> ReadAsyncInternal(Memory<char> buffer, CancellationToken cancellationToken)
1099            internal override async ValueTask<int> ReadAsyncInternal(Memory<char> buffer, CancellationToken cancellationToken)
1270                    throw new ArgumentException("Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source coll
1270                    throw new ArgumentException(SR.Argument_InvalidOffLen);
1277                if (GetType() != typeof(AmortizedStreamReader))
1277                if (GetType() != typeof(StreamReader))
1293                if (GetType() != typeof(AmortizedStreamReader))
1293                if (GetType() != typeof(StreamReader))
1308                ValueTask<int> vt = base.ReadBlockAsync(buffer, cancellationToken);
1308                ValueTask<int> vt = ReadBlockAsyncInternal(buffer, cancellationToken);
1319            [AsyncMethodBuilder(typeof(PoolingAsyncValueTaskMethodBuilder<>))]
1319
1410                void ThrowObjectDisposedException() => throw new ObjectDisposedException(GetType().Name, "Cannot read from a closed TextReader.");
1410                void ThrowObjectDisposedException() => throw new ObjectDisposedException(GetType().Name, SR.ObjectDisposed_ReaderClosed);
1415            internal sealed class NullAmortizedStreamReader : AmortizedStreamReader
1415            internal sealed class NullStreamReader : StreamReader

A benchmark with BenchmarkDotNet produced the following results:


BenchmarkDotNet v0.13.12, Windows 10 (10.0.19044.4046/21H2/November2021Update)
AMD Ryzen 5 5600X, 1 CPU, 12 logical and 6 physical cores
.NET SDK 8.0.201
  [Host]     : .NET 8.0.2 (8.0.224.6711), X64 RyuJIT AVX2
  Job-KJEBDR : .NET 8.0.2 (8.0.224.6711), X64 RyuJIT AVX2

Force=True  InvocationCount=1  UnrollFactor=1  
ResultBufferSize = 256 in every run.

| Method | ReaderBufferSize | FileStreamBufferSize | FileSize | Mean | Error | StdDev | Median | Ratio | RatioSD | Allocated | Alloc Ratio |
|---|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
| WithStreamReader | 1024 | 4096 | 100 | 211.6 μs | 4.97 μs | 14.11 μs | 209.3 μs | 1.00 | 0.00 | 9.74 KB | 1.00 |
| WithAmortizedStreamReader | 1024 | 4096 | 100 | 220.9 μs | 5.58 μs | 16.18 μs | 220.9 μs | 1.05 | 0.10 | 9.76 KB | 1.00 |
| WithStreamReader | 1024 | 4096 | 1000 | 223.6 μs | 4.46 μs | 12.42 μs | 222.4 μs | 1.00 | 0.00 | 10.07 KB | 1.00 |
| WithAmortizedStreamReader | 1024 | 4096 | 1000 | 216.6 μs | 4.69 μs | 13.45 μs | 215.3 μs | 0.97 | 0.08 | 9.43 KB | 0.94 |
| WithStreamReader | 1024 | 4096 | 10000 | 262.8 μs | 6.41 μs | 17.87 μs | 260.4 μs | 1.00 | 0.00 | 11.01 KB | 1.00 |
| WithAmortizedStreamReader | 1024 | 4096 | 10000 | 266.5 μs | 6.46 μs | 18.10 μs | 265.6 μs | 1.02 | 0.10 | 9.27 KB | 0.84 |
| WithStreamReader | 1024 | 4096 | 100000 | 558.9 μs | 11.17 μs | 31.87 μs | 553.9 μs | 1.00 | 0.00 | 17.88 KB | 1.00 |
| WithAmortizedStreamReader | 1024 | 4096 | 100000 | 627.3 μs | 13.15 μs | 38.16 μs | 622.3 μs | 1.13 | 0.09 | 9.48 KB | 0.53 |
| WithStreamReader | 1024 | 4096 | 1000000 | 3,397.1 μs | 105.09 μs | 308.21 μs | 3,407.8 μs | 1.00 | 0.00 | 86.32 KB | 1.00 |
| WithAmortizedStreamReader | 1024 | 4096 | 1000000 | 3,899.8 μs | 170.64 μs | 500.46 μs | 4,041.6 μs | 1.16 | 0.18 | 9.76 KB | 0.11 |
| WithStreamReader | 1024 | 4096 | 10000000 | 23,453.1 μs | 336.11 μs | 262.41 μs | 23,511.4 μs | 1.00 | 0.00 | 773.38 KB | 1.00 |
| WithAmortizedStreamReader | 1024 | 4096 | 10000000 | 23,019.7 μs | 443.77 μs | 415.11 μs | 22,815.4 μs | 0.98 | 0.02 | 9.76 KB | 0.01 |
| WithStreamReader | 128 | 512 | 100 | 206.8 μs | 4.10 μs | 10.58 μs | 206.8 μs | 1.00 | 0.00 | 3.95 KB | 1.00 |
| WithAmortizedStreamReader | 128 | 512 | 100 | 206.5 μs | 4.12 μs | 8.22 μs | 205.5 μs | 0.99 | 0.06 | 3.63 KB | 0.92 |
| WithStreamReader | 128 | 512 | 1000 | 240.6 μs | 4.81 μs | 11.80 μs | 242.2 μs | 1.00 | 0.00 | 4.43 KB | 1.00 |
| WithAmortizedStreamReader | 128 | 512 | 1000 | 239.6 μs | 5.26 μs | 15.26 μs | 239.4 μs | 1.00 | 0.09 | 3.63 KB | 0.82 |
| WithStreamReader | 128 | 512 | 10000 | 420.7 μs | 8.54 μs | 24.50 μs | 414.7 μs | 1.00 | 0.00 | 10.02 KB | 1.00 |
| WithAmortizedStreamReader | 128 | 512 | 10000 | 433.5 μs | 9.24 μs | 26.80 μs | 429.8 μs | 1.03 | 0.09 | 3.63 KB | 0.36 |
| WithStreamReader | 128 | 512 | 100000 | 2,108.7 μs | 40.17 μs | 70.35 μs | 2,106.0 μs | 1.00 | 0.00 | 64.88 KB | 1.00 |
| WithAmortizedStreamReader | 128 | 512 | 100000 | 2,170.9 μs | 43.42 μs | 85.70 μs | 2,166.8 μs | 1.04 | 0.05 | 3.77 KB | 0.06 |
| WithStreamReader | 128 | 512 | 1000000 | 19,287.6 μs | 346.39 μs | 355.71 μs | 19,409.8 μs | 1.00 | 0.00 | 613.77 KB | 1.000 |
| WithAmortizedStreamReader | 128 | 512 | 1000000 | 17,418.0 μs | 695.88 μs | 2,051.81 μs | 15,980.7 μs | 1.00 | 0.04 | 3.63 KB | 0.006 |
| WithStreamReader | 128 | 512 | 10000000 | 158,167.3 μs | 1,603.96 μs | 1,500.34 μs | 158,519.8 μs | 1.00 | 0.00 | 6105.7 KB | 1.000 |
| WithAmortizedStreamReader | 128 | 512 | 10000000 | 154,547.3 μs | 1,875.71 μs | 1,662.77 μs | 154,990.1 μs | 0.98 | 0.02 | 3.49 KB | 0.001 |
| WithStreamReader | 2048 | 8192 | 100 | 208.6 μs | 5.18 μs | 15.10 μs | 206.2 μs | 1.00 | 0.00 | 17.07 KB | 1.00 |
| WithAmortizedStreamReader | 2048 | 8192 | 100 | 201.3 μs | 3.93 μs | 10.88 μs | 199.6 μs | 0.97 | 0.08 | 16.76 KB | 0.98 |
| WithStreamReader | 2048 | 8192 | 1000 | 206.8 μs | 4.48 μs | 13.08 μs | 202.9 μs | 1.00 | 0.00 | 16.74 KB | 1.00 |
| WithAmortizedStreamReader | 2048 | 8192 | 1000 | 201.3 μs | 3.92 μs | 3.85 μs | 201.0 μs | 0.98 | 0.05 | 16.76 KB | 1.00 |
| WithStreamReader | 2048 | 8192 | 10000 | 230.6 μs | 5.28 μs | 15.49 μs | 229.7 μs | 1.00 | 0.00 | 17.38 KB | 1.00 |
| WithAmortizedStreamReader | 2048 | 8192 | 10000 | 231.8 μs | 4.62 μs | 13.49 μs | 230.2 μs | 1.01 | 0.08 | 16.76 KB | 0.96 |
| WithStreamReader | 2048 | 8192 | 100000 | 419.1 μs | 10.84 μs | 31.96 μs | 416.6 μs | 1.00 | 0.00 | 20.82 KB | 1.00 |
| WithAmortizedStreamReader | 2048 | 8192 | 100000 | 475.0 μs | 10.13 μs | 29.56 μs | 476.9 μs | 1.14 | 0.10 | 16.76 KB | 0.80 |
| WithStreamReader | 2048 | 8192 | 1000000 | 2,073.3 μs | 64.42 μs | 188.94 μs | 2,053.2 μs | 1.00 | 0.00 | 55.2 KB | 1.00 |
| WithAmortizedStreamReader | 2048 | 8192 | 1000000 | 2,415.5 μs | 47.36 μs | 69.42 μs | 2,419.2 μs | 1.26 | 0.10 | 16.76 KB | 0.30 |
| WithStreamReader | 2048 | 8192 | 10000000 | 13,484.0 μs | 219.72 μs | 335.54 μs | 13,433.0 μs | 1.00 | 0.00 | 398.45 KB | 1.00 |
| WithAmortizedStreamReader | 2048 | 8192 | 10000000 | 13,512.2 μs | 221.65 μs | 324.88 μs | 13,466.0 μs | 1.00 | 0.03 | 16.76 KB | 0.04 |
| WithStreamReader | 256 | 1024 | 100 | 200.6 μs | 4.50 μs | 13.14 μs | 199.2 μs | 1.00 | 0.00 | 4.82 KB | 1.00 |
| WithAmortizedStreamReader | 256 | 1024 | 100 | 201.5 μs | 4.59 μs | 13.33 μs | 198.5 μs | 1.01 | 0.09 | 4.51 KB | 0.94 |
| WithStreamReader | 256 | 1024 | 1000 | 222.7 μs | 4.83 μs | 14.16 μs | 219.0 μs | 1.00 | 0.00 | 5.13 KB | 1.00 |
| WithAmortizedStreamReader | 256 | 1024 | 1000 | 224.0 μs | 4.75 μs | 13.86 μs | 221.5 μs | 1.01 | 0.08 | 4.51 KB | 0.88 |
| WithStreamReader | 256 | 1024 | 10000 | 332.3 μs | 8.26 μs | 24.22 μs | 326.0 μs | 1.00 | 0.00 | 7.95 KB | 1.00 |
| WithAmortizedStreamReader | 256 | 1024 | 10000 | 339.9 μs | 6.75 μs | 17.54 μs | 338.1 μs | 1.02 | 0.09 | 4.23 KB | 0.53 |
| WithStreamReader | 256 | 1024 | 100000 | 1,216.2 μs | 24.19 μs | 54.61 μs | 1,216.1 μs | 1.00 | 0.00 | 35.45 KB | 1.00 |
| WithAmortizedStreamReader | 256 | 1024 | 100000 | 1,274.9 μs | 25.18 μs | 61.28 μs | 1,275.3 μs | 1.05 | 0.07 | 4.51 KB | 0.13 |
| WithStreamReader | 256 | 1024 | 1000000 | 10,293.7 μs | 205.02 μs | 427.96 μs | 10,247.7 μs | 1.00 | 0.00 | 310.13 KB | 1.00 |
| WithAmortizedStreamReader | 256 | 1024 | 1000000 | 10,100.0 μs | 200.39 μs | 246.10 μs | 10,111.0 μs | 0.99 | 0.06 | 4.51 KB | 0.01 |
| WithStreamReader | 256 | 1024 | 10000000 | 82,635.0 μs | 1,208.63 μs | 1,130.56 μs | 82,374.9 μs | 1.00 | 0.00 | 3056.92 KB | 1.000 |
| WithAmortizedStreamReader | 256 | 1024 | 10000000 | 79,921.6 μs | 852.58 μs | 797.51 μs | 79,840.2 μs | 0.97 | 0.02 | 4.83 KB | 0.002 |
| WithStreamReader | 4096 | 16384 | 100 | 203.5 μs | 4.94 μs | 14.24 μs | 201.0 μs | 1.00 | 0.00 | 30.74 KB | 1.00 |
| WithAmortizedStreamReader | 4096 | 16384 | 100 | 201.9 μs | 4.16 μs | 12.14 μs | 199.3 μs | 1.00 | 0.09 | 30.76 KB | 1.00 |
| WithStreamReader | 4096 | 16384 | 1000 | 209.3 μs | 5.29 μs | 15.44 μs | 206.9 μs | 1.00 | 0.00 | 31.07 KB | 1.00 |
| WithAmortizedStreamReader | 4096 | 16384 | 1000 | 199.1 μs | 3.80 μs | 9.74 μs | 196.9 μs | 0.94 | 0.08 | 30.76 KB | 0.99 |
| WithStreamReader | 4096 | 16384 | 10000 | 234.3 μs | 4.72 μs | 13.83 μs | 234.0 μs | 1.00 | 0.00 | 31.1 KB | 1.00 |
| WithAmortizedStreamReader | 4096 | 16384 | 10000 | 230.7 μs | 4.60 μs | 12.27 μs | 228.7 μs | 0.99 | 0.07 | 30.48 KB | 0.98 |
| WithStreamReader | 4096 | 16384 | 100000 | 356.4 μs | 7.70 μs | 22.46 μs | 352.6 μs | 1.00 | 0.00 | 32.95 KB | 1.00 |
| WithAmortizedStreamReader | 4096 | 16384 | 100000 | 375.4 μs | 7.51 μs | 20.69 μs | 375.4 μs | 1.06 | 0.08 | 30.76 KB | 0.93 |
| WithStreamReader | 4096 | 16384 | 1000000 | 1,604.6 μs | 51.24 μs | 150.29 μs | 1,653.6 μs | 1.00 | 0.00 | 50.13 KB | 1.00 |
| WithAmortizedStreamReader | 4096 | 16384 | 1000000 | 1,781.7 μs | 52.61 μs | 155.12 μs | 1,768.2 μs | 1.12 | 0.12 | 30.76 KB | 0.61 |
| WithStreamReader | 4096 | 16384 | 10000000 | 8,841.9 μs | 135.54 μs | 218.88 μs | 8,804.4 μs | 1.00 | 0.00 | 222.01 KB | 1.00 |
| WithAmortizedStreamReader | 4096 | 16384 | 10000000 | 8,823.3 μs | 167.63 μs | 179.36 μs | 8,807.5 μs | 0.99 | 0.03 | 30.76 KB | 0.14 |
| WithStreamReader | 512 | 2048 | 100 | 207.7 μs | 4.13 μs | 11.99 μs | 206.6 μs | 1.00 | 0.00 | 6.24 KB | 1.00 |
| WithAmortizedStreamReader | 512 | 2048 | 100 | 208.3 μs | 4.93 μs | 14.15 μs | 206.3 μs | 1.01 | 0.09 | 6.26 KB | 1.00 |
| WithStreamReader | 512 | 2048 | 1000 | 224.1 μs | 4.84 μs | 14.21 μs | 221.9 μs | 1.00 | 0.00 | 6.27 KB | 1.00 |
| WithAmortizedStreamReader | 512 | 2048 | 1000 | 221.4 μs | 4.66 μs | 13.61 μs | 220.5 μs | 0.99 | 0.09 | 5.65 KB | 0.90 |
| WithStreamReader | 512 | 2048 | 10000 | 282.8 μs | 5.56 μs | 10.02 μs | 282.0 μs | 1.00 | 0.00 | 8.13 KB | 1.00 |
| WithAmortizedStreamReader | 512 | 2048 | 10000 | 290.2 μs | 5.80 μs | 13.20 μs | 287.2 μs | 1.03 | 0.05 | 6.26 KB | 0.77 |
| WithStreamReader | 512 | 2048 | 100000 | 777.3 μs | 18.03 μs | 52.02 μs | 773.0 μs | 1.00 | 0.00 | 21.88 KB | 1.00 |
| WithAmortizedStreamReader | 512 | 2048 | 100000 | 817.3 μs | 16.33 μs | 43.58 μs | 820.0 μs | 1.06 | 0.09 | 6.26 KB | 0.29 |
WithStreamReader Params { ReaderBufferSize = 512, FileStreamBufferSize = 2048, ResultBufferSize = 256, FileSize = 1000000 } 5,203.1 μs 103.19 μs 126.72 μs 5,165.0 μs 1.00 0.00 159.2 KB 1.00
WithAmortizedStreamReader Params { ReaderBufferSize = 512, FileStreamBufferSize = 2048, ResultBufferSize = 256, FileSize = 1000000 } 5,643.4 μs 107.71 μs 115.25 μs 5,622.9 μs 1.08 0.03 6.26 KB 0.04
WithStreamReader Params { ReaderBufferSize = 512, FileStreamBufferSize = 2048, ResultBufferSize = 256, FileSize = 10000000 } 43,358.4 μs 592.79 μs 608.75 μs 43,341.7 μs 1.00 0.00 1532.51 KB 1.000
WithAmortizedStreamReader Params { ReaderBufferSize = 512, FileStreamBufferSize = 2048, ResultBufferSize = 256, FileSize = 10000000 } 43,084.6 μs 819.14 μs 804.50 μs 43,103.1 μs 0.99 0.02 6.26 KB 0.004

Using the following Benchmark:

using System.Buffers;
using System.Runtime.CompilerServices;
using System.Text;
using BenchmarkDotNet.Attributes;

namespace Benchmark;

[MemoryDiagnoser]
[GcForce]
public class ReadAsyncPoolingBenchmark
{
    [ParamsSource(nameof(GetParams))]
    public Params Parameters;
    public int ReaderBufferSize => Parameters.ReaderBufferSize;
    public int FileStreamBufferSize => Parameters.FileStreamBufferSize;
    public int ResultBufferSize => Parameters.ResultBufferSize;
    public int FileSize => Parameters.FileSize;

    [IterationSetup]
    public void Setup()
    {
        using (StreamWriter writer = new("test.txt", options: new FileStreamOptions { Mode = FileMode.Create, Access = FileAccess.Write, Share = FileShare.None}))
        {
            StringBuilder builder = new(FileSize, FileSize);
            for (int i = 0; i < FileSize; i++)
                builder.Append(Random.Shared.Next(0, 10));
            writer.Write(builder);
        }
    }

    [GlobalCleanup]
    public void Cleanup()
    {
        File.Delete("test.txt");
    }

    [Benchmark(Baseline = true)]
    public async ValueTask WithStreamReader()
    {
        using StreamReader stream =
            new(
                File.Open("test.txt",
                    options: new FileStreamOptions
                    {
                        Mode = FileMode.Open, Access = FileAccess.Read,
                        Options = FileOptions.Asynchronous | FileOptions.SequentialScan, Share = FileShare.Read,
                        BufferSize = FileStreamBufferSize
                    }), bufferSize: ReaderBufferSize);
        var inputBuffer = ArrayPool<char>.Shared.Rent(ResultBufferSize);
        while (true)
        {
            var mem = inputBuffer.AsMemory();
            int readCount;
            if ((readCount = await stream.ReadAsync(mem)) == 0) break;
            Print(inputBuffer, 0, readCount);
        }

        ArrayPool<char>.Shared.Return(inputBuffer);

        [MethodImpl(MethodImplOptions.NoInlining)]
        void Print(char[] buffer, int offset, int count)
        {
        }
    }

    [Benchmark]
    public async ValueTask WithAmortizedStreamReader()
    {
        using AmortizedStreamReader stream =
            new(
                File.Open("test.txt",
                    options: new FileStreamOptions
                    {
                        Mode = FileMode.Open, Access = FileAccess.Read,
                        Options = FileOptions.Asynchronous | FileOptions.SequentialScan, Share = FileShare.Read,
                        BufferSize = FileStreamBufferSize
                    }), bufferSize: ReaderBufferSize);
        var inputBuffer = ArrayPool<char>.Shared.Rent(ResultBufferSize);
        while (true)
        {
            var mem = inputBuffer.AsMemory();
            int readCount;
            if ((readCount = await stream.ReadAsync(mem)) == 0) break;
            Print(inputBuffer, 0, readCount);
        }

        ArrayPool<char>.Shared.Return(inputBuffer);

        [MethodImpl(MethodImplOptions.NoInlining)]
        void Print(char[] buffer, int offset, int count)
        {
        }
    }

    public readonly record struct Params(
        int ReaderBufferSize,
        int FileStreamBufferSize,
        int ResultBufferSize,
        int FileSize);

    public IEnumerable<Params> GetParams()
    {
        for (int i = 100; i <= 10_000_000; i *= 10)
            yield return new Params
                { ReaderBufferSize = 128, FileStreamBufferSize = 512, ResultBufferSize = 256, FileSize = i };
        for (int i = 100; i <= 10_000_000; i *= 10)
            yield return new Params
                { ReaderBufferSize = 256, FileStreamBufferSize = 1024, ResultBufferSize = 256, FileSize = i };
        for (int i = 100; i <= 10_000_000; i *= 10)
            yield return new Params
                { ReaderBufferSize = 512, FileStreamBufferSize = 2048, ResultBufferSize = 256, FileSize = i };
        for (int i = 100; i <= 10_000_000; i *= 10)
            yield return new Params
                { ReaderBufferSize = 1024, FileStreamBufferSize = 4096, ResultBufferSize = 256, FileSize = i };
        for (int i = 100; i <= 10_000_000; i *= 10)
            yield return new Params
                { ReaderBufferSize = 2048, FileStreamBufferSize = 8192, ResultBufferSize = 256, FileSize = i };
        for (int i = 100; i <= 10_000_000; i *= 10)
            yield return new Params
                { ReaderBufferSize = 4096, FileStreamBufferSize = 16384, ResultBufferSize = 256, FileSize = i };
    }
}

Both versions perform similarly, within the margin of error; however, the amortized version does seem to trend slower (maybe due to an imperfect copy of the StreamReader and compilation outside of the original assembly). For small files (smaller than the internal buffer size) we don't see much difference in allocations, but we still possibly pay the cost of creating a pool. As the file size grows, we see the same pattern in every combination: allocations for the amortized version stay flat while those for StreamReader keep growing. I would say that with the default StreamReader configuration, pooling would pay off somewhere between 10 KB and 100 KB file size, which isn't very large. Given that the performance is similar and that it will reduce GC pressure in most cases, I think it would at least be worth investigating further.
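For readers who have not used pooled async builders, here is a minimal sketch of how a single async method can opt into them. It relies on `PoolingAsyncValueTaskMethodBuilder<>` from `System.Runtime.CompilerServices` (.NET 6+, C# 10+). The names `PooledReadSketch`, `ReadPooledAsync` and `CountCharsAsync` are hypothetical and exist only for illustration. That AmortizedStreamReader is built the same way is an assumption, since its source is not shown here.

```csharp
using System;
using System.IO;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class PooledReadSketch
{
    // Hypothetical wrapper, not part of StreamReader. The attribute asks the
    // compiler to use the pooling builder for this method, so when the await
    // below actually suspends, the state machine box can be reused from a
    // pool instead of being allocated on every call.
    [AsyncMethodBuilder(typeof(PoolingAsyncValueTaskMethodBuilder<>))]
    public static async ValueTask<int> ReadPooledAsync(TextReader reader, Memory<char> buffer)
    {
        return await reader.ReadAsync(buffer).ConfigureAwait(false);
    }

    // Hypothetical consumer: drains the reader and returns the number of chars read.
    public static async Task<int> CountCharsAsync(TextReader reader)
    {
        char[] buffer = new char[256];
        int total = 0;
        int read;
        // Each ValueTask is awaited exactly once, which pooled builders require.
        while ((read = await ReadPooledAsync(reader, buffer)) != 0)
            total += read;
        return total;
    }
}
```

Note that this only pools the wrapper's own state machine; the boxes reported in this issue come from StreamReader's internal methods, so the attribute would have to be applied there (or in a copy of the class) to have the measured effect.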