dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.98k stars 4.66k forks source link

Stream returned by HttpResponseMessage.Content.ReadAsStream does not respect readtimeout or cancellationtoken .net framework #28306

Closed rahuldutta90 closed 4 years ago

rahuldutta90 commented 5 years ago

Hi This issue is found in net framework .net452.

I have the following sample code for reading data from server:

using(var webresponse = httpclient.SendAsync(request,HttpCompletionOption.ResponseHeadersRead,cancelToken).Configureawait(false)){
using (Stream opStream = await webResponse.Content.ReadAsStreamAsync().ConfigureAwait(false))
    {
      try
      {
         opStream.ReadTimeout = 60 * 1000;
        do{ 
         await opstream.ReadAsync(buffer,offset,count,cancelToken).ConfigureAwait(false);
        }while(...)
      }catch(IOException e){//this is where the exception is thrown}
    }
}

For some requests it gives following error "Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond".

After debugging I figured out that this line: "await opstream.ReadAsync" throws this ioexception. Now this error is not the problem.

Problem is it throws this error after 300 seconds which is the default readtimeout of the "WebExceptionWrapperStream.ReadTimeout" even though I have set it as 60 seconds (as shown above).

The cancelToken passed to the ReadAsync is from a cancellationtokensource which is set to cancel after 60 seconds.

Still it timesout after 300 seconds

Is there a way to set the timeout of ReadAsync?

rahuldutta90 commented 5 years ago

@davidsh Do you have any idea regarding this?

davidsh commented 5 years ago

This issue is found in net framework .net452.

On .NET Framework, HttpClient uses HttpWebRequest underneath. HttpWebRequest itself has a timeout property for read/write timeouts:

HttpWebRequest.ReadWriteTimeout

However, this timeout property is not exposed on the stream that you get from ReadAsStreamAsync(). And setting the generic Stream.ReadTimeout property is ignored (I think) for those streams.

You might consider modifying your code here:

await opstream.ReadAsync(buffer,offset,count,cancelToken).ConfigureAwait(false);

so that you put a separate timeout on the cancelToken itself, i.e. use the timeout properties of the CancellationTokenSource from which you obtained this cancelToken.

cc: @karelz

rahuldutta90 commented 5 years ago

@davidsh I do that also. As mentioned above "The cancelToken passed to the ReadAsync is from a cancellationtokensource which is set to cancel after 60 seconds." It does not honor that also. That is what I did before setting a timeout to the stream.

NOTE: This does not happen for httpwebrequest. I am migrating from httpwebrequest to httpclient and this did not happen for httpwebrequest. In httpwebrequest I used to do similar as above: generate a cancellation Token from a cancellationtokensource which is set to cancel after 60 seconds plus I use to register a action with this cancellation token and in that action abort httpwebrequest. So it timesout correctly after 60 seconds.

karelz commented 5 years ago

Is it problem unique to .NET Framework? We do not track .NET Framework problems on GitHub, although it is the same set of people ...

rahuldutta90 commented 5 years ago

@karelz @davidsh I am facing this issue in net core also. So here are my debug lines

NETCORE2.0

2019-01-10 22:36:05.0718|DEBUG|adls.dotnet.WebTransport|HTTPRequest,failed,cReqId:4b43ba68-e72a-4916-8ff7-87ce7abc0dd8.0,lat:321111,err:The read operation failed, see inner exception.,Reqlen:0,Resplen:0,token_ns:0,sReqId:c5de9c67-689b-43c0-a80a-299a9d96e764,path:/Powershell/rdutta/Test/100GB/8.dat,qp:op=OPEN&read=true&filesessionid=8001c86c-217a-43ee-b863-558b937b7300&offset=1010827264&length=4194304&api-version=2018-08-01, connectionfailure:True

NET framework:

    Line 1221: 2019-01-03 19:04:20.6355|DEBUG|adls.dotnet.WebTransport|cReqId:be063f2e-2700-4d0f-95f6-05e57ff8fc01.0, Reached before content
    Line 46808: 2019-01-03 19:09:38.1928|DEBUG|adls.dotnet.WebTransport|cReqId:be063f2e-2700-4d0f-95f6-05e57ff8fc01.0, Failed after content, opstreamtimeout:60000, linkedcancellationrequested:True, timeoutcancellationrequested: True
2019-01-03 19:09:38.2866|DEBUG|adls.dotnet.WebTransport|HTTPRequest,respstatus:OK,,extype:System.IO.IOException,failed,cReqId:be063f2e-2700-4d0f-95f6-05e57ff8fc01.0,lat:318123,err:Unable to read data from the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.,Reqlen:0,Resplen:0,token_ns:0,sReqId:88ad4bf0-868b-43ed-9235-72803cb1523d,path:/Powershell/rdutta/Test/100GB/80.dat,qp:op=OPEN&read=true&filesessionid=adbf991a-5fa6-457b-8494-06b5e550ab9f&offset=20971520&length=4194304&api-version=2018-08-01, connectionfailure:True

For the net framework debug lines above I have put debug lines before starting the ReadASync and after the ReadAsync. And after the readasync I also log the cancelattiontoken.IsRequested Value also which was set to expire after 60 seconds.

karelz commented 5 years ago

@rahuldutta90 I don't know what are those "debug lines". Do you have a minimal repro to demonstrate the problem? Note that .NET Core 2.0 is out of support - please try it on 2.2 or 2.1. Thanks!

rahuldutta90 commented 5 years ago

@karelz I run a test from a VM where I have 100 threads and on each thread I am downloading 1 1GB file (so across 100 threads i am downloading 100 1 gb files) from our storage. For this case my storage account and the VM are in cross region, that is why I encounter these timeouts. Each request is a read of 4MB, so we send multiple read requests for reading 1 1GB . Although my current SDK is in public, but the httpclient changes are in still in private repo. Let me write up a repro and try to share it. If I cannot share here, I will share it internally.

rahuldutta90 commented 5 years ago

@karelz @davidsh

i managed to write a small repo using a mock server using "WebListener" (this is simpler to understand than above live test) which did repo this problem for .net framework (i used .net472), however it did not repo for net core.

Httpclient code in my sdk - I managed to extract the httpclient code from my library which I feel relevant

using (HttpRequestMessage webReq = new HttpRequestMessage(new HttpMethod(op.Method), urlString))
                {

                    //If operation is cancelled then stop
                    cancelToken.ThrowIfCancellationRequested();
                    // Get the httpclient
                    var httpClient = client.GetHttpClient(req.UseCert);

                // Important here, i do a linked cancellation token, one token is from user and another token will cancel after 60 seconds
                    using (var timeoutCancellationTokenSource = GetCancellationTokenSourceForTimeout(req))
                    {
                        using (CancellationTokenSource linkedCts =
                      CancellationTokenSource.CreateLinkedTokenSource(cancelToken, timeoutCancellationTokenSource.Token))
                        {
                            try
                            {
                              // For requests like read that returns output stream i read the header only
                                using (var webResponse = op.ReturnsBody ? await httpClient.SendAsync(webReq, HttpCompletionOption.ResponseHeadersRead, linkedCts.Token).ConfigureAwait(false) : await httpClient.SendAsync(webReq, linkedCts.Token).ConfigureAwait(false))
                                {
                                    // Read the headers here
                                     .
                                     .
                                    .
                                   if ((int)resp.HttpStatus >= 300)
                                    {
                                        HandleWebException(webResponse, resp, path, req.RequestId, token, webReq);
                                        return null;
                                    }
                                    if (op.ReturnsBody)
                                    {

                                        int totalBytes = 0;
                                        try
                                        {
                                            using (Stream opStream = await webResponse.Content.ReadAsStreamAsync().ConfigureAwait(false))
                                            {

                                                int noBytes;
                                                int totalLengthToRead = responseData.Count;
                                                //Read the required amount of data. In case of chunked it is what users requested, else it is amount of data sent
                                                do
                                                {
                                               // NOTICE here i am passing the cancellation token after 60 seconds
                                                    noBytes = await opStream.ReadAsync(responseData.Data, responseData.Offset,
                                                        totalLengthToRead, linkedCts.Token).ConfigureAwait(false);
                                                    totalBytes += noBytes;
                                                    responseData.Offset += noBytes;
                                                    totalLengthToRead -= noBytes;

                                                } while (noBytes > 0 && totalLengthToRead > 0);
                                     }
                    }
                }
            }                

My server code:

            _webListener.Start();
            while (true)
            {
// This is just a mockresponse i queue to return as part of request
                MockResponse resp = GetMockResponse();
                if (resp == null)
                {
                    _webListener.Dispose();
                    return;
                }
                using (var context = _webListener.AcceptAsync().GetAwaiter().GetResult())
                {

                    context.Response.StatusCode = (int)resp.StatusCode.Value;
                    context.Response.ReasonPhrase = resp.StatusDescription;

                    {
                        byte[] bytes = new byte[4 * 1024 * 1024];
                        context.Response.ContentType = "application/json";
                        context.Response.Body.WriteAsync(bytes, 0, bytes.Length).Wait();
                        int i = 0;
                        while (i ++< 5)
                        {
                            Thread.Sleep(300 * 1000);
                            context.Response.Body.WriteAsync(bytes, 0, bytes.Length).Wait();
                        }
                    }
                }
            }

And then i just send a read request to this mock server which after sending the first 4 MB of data waits for 300 seconds. From the client i attempt to read more than 4MB bytes, so it gets stuck at read for "await opStream.ReadAsync" for 300 seconds in .net framework.

Again this test did not successfully reproduce this problem for .net core (since it timedout after 60 seconds correctly based on the cancellation token) which i was able to reproduce in my original live test. I will try to check my live test with .net core and see if there is a way to repro using a mock test as i did for .net framework. If I am not able to, then I have to setup something internally to show you guys my live test.

But for now please take a look at this for .net framework.

rahuldutta90 commented 5 years ago

@karelz @davidsh

CORRECTION: This issue is not in .netcore. I was wrong to mention that this issue exists in NET core. If you call ReadAsync method on the stream and pass the timeout cancellation token, it works fine.

This issue is only .NET framework, and I have given the mock reproduce code above. Please let me know if you need any more information.

rahuldutta90 commented 5 years ago

@karelz @davidsh any updates regarding this?

davidsh commented 5 years ago

@karelz @davidsh any updates regarding this?

I see that you posted additional code for a "mock" server that you use with the client-side HttpClient repro to demonstrate the problem on .NET Framework. However, you didn't post the complete "mock" server code.

Since you have a mock server now, it would be helpful if you could post a complete repro (Visual Studio solution) including both server and client code that we can use to reproduce the problem (simple F5 repro).

rahuldutta90 commented 5 years ago

@davidsh I have published my httpclient code in public and I also have a unittest to repro this. Here is the link: https://github.com/Azure/azure-data-lake-store-net/blob/7433a3462003b78cc09dd683ad13c96797440a3b/AdlsDotNetSDKUnitTest/CoreUnitTest.cs#L291

You can checkout our code: https://github.com/Azure/azure-data-lake-store-net/blob/preview-alpha and run the test above. Currently my unittest is net452, you can change the framework to net472 if you like.

Will this do?

If you have any questions regarding the code, you can ping me offline.

davidsh commented 5 years ago

https://github.com/Azure/azure-data-lake-store-net/blob/preview-alpha

The link is returning 404 Not Found.

I was also hoping you would submit a standalone repro without any of the Azure SDK code. For example, a standalone Visual Studio solution containing two projects (a console client application using HttpClient and the mock server). Otherwise, it isn't clear what the root cause actually is other than anecdotal evidence of a difference in behavior between .NET Framework vs. .NET Core.

Reducing down your Azure SDK repro to something that identifies the root cause in .NET Framework HTTP stack layer will take more time.

rahuldutta90 commented 5 years ago

@davidsh @karelz I have sent an email to you internally with the repo.

davidsh commented 5 years ago

I looked at the repro code that you sent. Thank you for providing that.

The short answer is the .NET Framework does not properly support the cancellation token that is passed into the ReadAsync() method for the response stream that you obtained from the HttpClient APIs. That is why the repro doesn’t stop the I/O after 10 seconds. However, .NET Core does support the cancellation token.

The reason is because the .NET Framework HTTP stack was originally written long before Tasks/Async/CancellationToken semantics. The underlying networking stack is using HttpWebRequest and Begin/End patterns for async operations.

The stream, opStream you obtained from this code:

using (Stream opStream = await webResponse.Content.ReadAsStreamAsync().ConfigureAwait(false))
{
    int noBytes;
    int totalLengthToRead = responseData.Count;
    // Read the required amount of data. In case of chunked it is what users requested
    // else it is amount of data sent
    // to be only used for net452 opStream.ReadTimeout = 60 * 1000;
    do
    {
         noBytes = await opStream.ReadAsync(
              responseData.Data,
              responseData.Offset,
              totalLengthToRead,
              linkedCts.Token).ConfigureAwait(false);
         totalBytes += noBytes;
         responseData.Offset += noBytes;
         totalLengthToRead -= noBytes;
     } while (noBytes > 0 && totalLengthToRead > 0);
}

is actually implemented as a System.Net.ConnectStream. image

Reference source of System.Net.ConnectStream from .NET Framework: https://github.com/Microsoft/referencesource/blob/master/System/net/System/Net/_ConnectStream.cs

That stream object doesn’t have any actual ReadAsync() method on it. It only has BeginRead() and EndRead(). So, the implementation of that virtual method falls to the base System.IO.Stream class. The default ReadAsync() method on System.IO.Stream() will wrap calls to the stream’s actual BeingRead() / EndRead() methods. The implementation of that code is that it ignores the cancellation token passed into the ReadAsync() method.

The only way to fix this would be for .NET Framework to change to actually implement the overridden ReadAsync() method on the System.Net.ConnectStream class implementation.

So, using a cancellation token won’t work for doing reads on the response body stream in this scenario on .NET Framework. However, the same scenario on .NET Core will work since ReadAsync() is fully implemented with cancellation token support.

davidsh commented 5 years ago

Closing since this is not a .NET Core issue and the behavior on .NET Framework is currently by-design.