dotnet / aspire

Tools, templates, and packages to accelerate building observable, production-ready apps
https://learn.microsoft.com/dotnet/aspire
MIT License
3.91k stars 475 forks source link

Data corruption when reading HttpClient content stream #6745

Open mhenry07 opened 2 days ago

mhenry07 commented 2 days ago

Description

I've observed an issue where data corruption occurs when using a certain pattern to read and process the stream from HttpClient.Content.

The pattern involves using a buffer to read a large stream from HttpClient.Content while it's downloading, and within a loop filling that buffer from the stream then doing some processing on each line and performing occasional I/O after a "batch" of lines has been received. In certain cases, I am regularly seeing intermittent lines being corrupted after some megabytes have been processed.

There is a chance that the issue could be on the ASP.NET side, but since the behavior seems so dependent on the read pattern and also occurs against fake-gcs-server, I suspect the issue is on the client side, and in particular I suspect it's related to the HttpClient.Content stream.

Example log message when data corruption occurs in the reproduction (note how the actual text differs from the expected text):

Reader: Line was corrupted at row 73,036, ~6,416,181 bytes:
Actual:    'abc,73036,def,CCC,ghi,01/01/0001 00:01:13abc,73032,def,YYYYYYYYYYYY,ghi,01/01/0001 00:01:13 +00:00,jkl,01/01/0001 20:17:12 +00:00,mno'
Expected:  'abc,73036,def,CCC,ghi,01/01/0001 00:01:13 +00:00,jkl,01/01/0001 20:17:16 +00:00,mno'

Reproduction Steps

I created a solution which reproduces the issue. It's on GitHub: mhenry07/http-content-stream-repro, and I attached a ZIP file: HttpContentStreamRepro-v1.zip.

Example error:

Reader: Line was corrupted at row 73,036, ~6,416,181 bytes:
Actual:    'abc,73036,def,CCC,ghi,01/01/0001 00:01:13abc,73032,def,YYYYYYYYYYYY,ghi,01/01/0001 00:01:13 +00:00,jkl,01/01/0001 20:17:12 +00:00,mno'
Expected:  'abc,73036,def,CCC,ghi,01/01/0001 00:01:13 +00:00,jkl,01/01/0001 20:17:16 +00:00,mno'

There are a few options for comparing how different values and variations affect behavior, which can be set in HttpContentStreamRepro.Console/Program.cs (see the README.md).

Rough pseudocode to illustrate the idea:

var buffer = new byte[ChunkSize];
var offset = 0;
var row = 0L;
var response = await httpClient.GetAsync("/values.csv", HttpCompletionOption.ResponseHeadersRead);
var stream = await response.Content.ReadAsStreamAsync();
while (true)
{
    var length = await ReadUntilFullAsync(stream, buffer.AsMemory(offset));
    if (offset + length == 0) break;
    var memory = buffer.AsMemory(0, offset + length);
    while (TryReadLine(ref memory, out var line))
        if (row++ % 100 == 0)
            await Task.Delay(15);

    offset = memory.Length;
    if (memory.Length > 0) memory.CopyTo(buffer);
}

static async ValueTask<int> ReadUntilFullAsync(Stream stream, Memory<byte> buffer)
{
    var count = 0;
    while (count < buffer.Length)
    {
        var read = await stream.ReadAsync(buffer[count..]);
        if (read == 0) break;
        count += read;
    }
    return count;
}

I've observed the same behavior when using System.IO.Pipelines to consume the stream, and that and more variations are implemented at another GitHub repository: mhenry07/aspire-repro.

Pseudocode for alternate System.IO.Pipelines reproduction (see PipeBuffer.cs in my alternate repo):

async Task FillPipeAsync(PipeWriter writer)
{
    var buffer = new byte[ChunkSize];
    using var writerStream = writer.AsStream();
    var response = await httpClient.GetAsync("/values.csv", HttpCompletionOption.ResponseHeadersRead);
    using var responseStream = await response.Content.ReadAsStreamAsync();
    while (true)
    {
        var length = await ReadUntilFullAsync(responseStream, buffer);
        if (length == 0) break;
        await writerStream.WriteAsync(buffer.AsMemory(0, length));
    }
    await writer.CompleteAsync();
}

async Task ReadPipeAsync(PipeReader reader)
{
    var row = 0L;
    while (true)
    {
        var result = await reader.ReadAsync(default);
        var buffer = result.Buffer;
        while (TryReadLine(ref buffer, out var line))
            if (row++ % 100 == 0)
                await Task.Delay(15);

        reader.AdvanceTo(buffer.Start, buffer.End);
        if (result.IsCompleted) break;
    }

    await reader.CompleteAsync();
}

Expected behavior

Reading the stream should not result in data corruption and it should not be necessary to load the full contents of a large response into memory before processing it

Actual behavior

After reading multiple megabytes, some lines get corrupted.

Example error logs showing actual versus expected lines when the issue occurs:

Reader: Line was corrupted at row 73,036, ~6,416,181 bytes:
Actual:    'abc,73036,def,CCC,ghi,01/01/0001 00:01:13abc,73032,def,YYYYYYYYYYYY,ghi,01/01/0001 00:01:13 +00:00,jkl,01/01/0001 20:17:12 +00:00,mno'
Expected:  'abc,73036,def,CCC,ghi,01/01/0001 00:01:13 +00:00,jkl,01/01/0001 20:17:16 +00:00,mno'
Reader: Line was corrupted at row 72,678, ~6,384,622 bytes:
Actual:    'abc,72663,def,TTTTTTT,ghi,01/01/0001 00:01:12 +00:00,jkl,01/01/0001 20:11:03 +00:00,mno'
Expected:  'abc,72678,def,IIIIIIIII,ghi,01/01/0001 00:01:12 +00:00,jkl,01/01/0001 20:11:18 +00:00,mno'

Regression?

I don't think so. I first observed this just after updating Visual Studio to 17.12 with the debugger attached and originally filed Visual Studio feedback. However, I have since been able to reproduce the issue on VS 17.11, with and without the debugger attached, and in debug and release builds. I now suspect it's a .NET issue rather than Visual Studio.

Known Workarounds

Configuration

Also, observed with the Visual Studio debugger attached and detached, with Visual Studio 17.11 and 17.12.

Other information

I also have another GitHub repository which has more examples of implementations that reproduce the issue and don't reproduce the issue: mhenry07/aspire-repro.

dotnet-policy-service[bot] commented 2 days ago

Tagging subscribers to this area: @dotnet/ncl See info in area-owners.md if you want to be subscribed.

rzikm commented 2 days ago

I can see that the data are sent correctly (it even looks correct when inspected via wireshark), but the data on the remote side arrives corrupted (I dumped the incoming stream to a file and compared the with the stream produced on the Web project), and always around the same mark.

Interestingly enough, I can't reproduce this outside of .NET Aspire (i.e. running the two projects in separate console windows.

rzikm commented 1 day ago

Interesting accidental find (I left the web project running while starting aspire via apphost): .NET aspire uses some sort of proxy to connect individual projects

Image

Image

My suspicions are, that the proxy is causing the issue

rzikm commented 1 day ago

Confirmed:

Error: Image

Web is listening on port 62683 Image

Raw HTTP/1.1 between ports 62683 and 62689 (Web<->dcpctrl), Image The frames carrying the line 49554 are at the bottom of the screenshot and are uncorrupted

Raw HTTP/1.1 between ports 62688 and 5251 (Console<->dcpctrl) Image Notice the group second from the top carrying part of line 49554, the data after that continue with data corresponding to 49540 (and 49541 follows after that).

rzikm commented 1 day ago

Zip with captures of the above data-corruption.zip

sebastienros commented 1 day ago

@karolz-ms

mhenry07 commented 1 day ago

@rzikm Thank you for digging into that with wireshark. I'll have to try running without Aspire, I didn't notice the proxy before. It seems interesting that a proxy would be affected by the read pattern. I appreciate the confirmation of the issue. It's a relief if the issue is isolated to the proxy.

karolz-ms commented 1 day ago

Sounds likely the Aspire proxy is the culprit. @sebastienros do we want to keep this issue here, or transfer to Aspire repo?

mhenry07 commented 1 day ago

To confirm @rzikm's observation, I'm also not seeing the issue when I change the client base address to http://localhost:5251, comment out the builder.AddServiceDefaults(); line, and run the web and console apps without running the Aspire AppHost.

Alternatively, the "web" registration in the AppHost Program.cs can be changed to the following to bypass the proxy and then run the AppHost project:

var web = builder.AddProject<Projects.HttpContentStreamRepro_Web>("web", launchProfileName: null)
    .WithHttpEndpoint(port: 5251, isProxied: false);

Image