Azure / azure-sdk-for-net

This repository is for active development of the Azure SDK for .NET. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/dotnet/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-net.
MIT License
5.17k stars 4.53k forks source link

Should not check response body when a DELETE LRO return 204 #44640

Open ArthurMa1978 opened 2 weeks ago

ArthurMa1978 commented 2 weeks ago

According to https://github.com/Azure/azure-resource-manager-rpc/blob/master/v1.0/async-api-reference.md#delete-resource-asynchronously, the LRO DELETE operation can return 204 NoContent to indicate success, that means the response body is empty. However the Azure.Core code still try to extract the status from the response body, that will trigger RequestFailedException exception, the code in Azure.Core is here https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/core/Azure.Core/src/Shared/NextLinkOperationImplementation.cs#L531.

The issue raised from the new Storage release, which the LRO DELETE operation of the new Task resource returns 204, test code is here https://github.com/Azure/azure-sdk-for-net/pull/44571/files#diff-ebeebbcae04850fff767b20e818a0857ed54aaa6143f3ee3087497aefd3791ceR190

This block Storage team to release the new stable version of the .Net SDK by end of June.

github-actions[bot] commented 2 weeks ago

Thank you for your feedback. Tagging and routing to the team member best able to assist.

live1206 commented 2 weeks ago

@blueww is the contact from storage team

jsquire commented 2 weeks ago

@ArthurMa1978: There's a check on L533 that tests to see if the response body has content available:

if (response.Status is >= 200 and <= 204)
{
    if (response.ContentStream is { Length: > 0 })
    {
        // Extract content
    }
}

In the case of a 204 with no content returned, the length of the stream should be 0. That will prevent content from being read. If you're seeing an error extracting content in the 204 case, that would seem to indicate that the service is returning content that we cannot parse despite the response code saying that there is none. The latest swagger seems to

Please provide an example of the response body and what the content stream is populated where you're seeing this error.

jsquire commented 2 weeks ago

Digging into the Storage PR, it looks like this is related to /tests/Tests/StorageTaskAssignmentTests::CreateUpdateGetDeleteTaskAssignment on L190. That currently corresponds to this test recording.

The response body in the recording indicates that the status code was 204 and the response body was null. This would then fall to L571 and validate there is no location header. The recording does not indicate a location header being present, so it the clause headerSource == HeaderSource.None on L572 would be satisfied, returning true rather than an error.

There seems to be something more going on, though it is unclear what. We need to isolate repro in a stand-alone manner for further investigation.

@ArthurMa1978 / @blueww : Please provide the full exception message that you're seeing, along with the stack trace.

github-actions[bot] commented 2 weeks ago

Hi @ArthurMa1978. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.

ArthurMa1978 commented 1 week ago

@ArthurMa1978: There's a check on L533 that tests to see if the response body has content available:

if (response.Status is >= 200 and <= 204)
{
    if (response.ContentStream is { Length: > 0 })
    {
        // Extract content
    }
}

In the case of a 204 with no content returned, the length of the stream should be 0. That will prevent content from being read. If you're seeing an error extracting content in the 204 case, that would seem to indicate that the service is returning content that we cannot parse despite the response code saying that there is none. The latest swagger seems to

Please provide an example of the response body and what the content stream is populated where you're seeing this error.

The response.ContentStream contains both header and body, so the Length > 0 here is true.

ArthurMa1978 commented 1 week ago

Digging into the Storage PR, it looks like this is related to /tests/Tests/StorageTaskAssignmentTests::CreateUpdateGetDeleteTaskAssignment on L190. That currently corresponds to this test recording.

The response body in the recording indicates that the status code was 204 and the response body was null. This would then fall to L571 and validate there is no location header. The recording does not indicate a location header being present, so it the clause headerSource == HeaderSource.None on L572 would be satisfied, returning true rather than an error.

There seems to be something more going on, though it is unclear what. We need to isolate repro in a stand-alone manner for further investigation.

@ArthurMa1978 / @blueww : Please provide the full exception message that you're seeing, along with the stack trace.

The headerSource is HeaderSource.AzureAsyncOperation, so it fall to L578, the exception is:

Source=Azure.ResourceManager.Storage StackTrace: at Azure.Core.OperationInternal1.GetResponseFromState(OperationState1 state) in D:\code\srpsdk\sdk\core\Azure.Core\src\Shared\OperationInternalOfT.cs:line 285 at Azure.Core.OperationInternal1.<UpdateStatusAsync>d__20.MoveNext() in D:\code\srpsdk\sdk\core\Azure.Core\src\Shared\OperationInternalOfT.cs:line 275 at System.Threading.Tasks.ValueTask1.get_Result() at Azure.Core.Pipeline.TaskExtensions.EnsureCompleted[T](ValueTask1 task) in D:\code\srpsdk\sdk\core\Azure.Core\src\Shared\TaskExtensions.cs:line 53 at Azure.Core.OperationInternalBase.UpdateStatus(CancellationToken cancellationToken) in D:\code\srpsdk\sdk\core\Azure.Core\src\Shared\OperationInternalBase.cs:line 105 at Azure.Core.OperationInternal.<UpdateStatusAsync>d__9.MoveNext() in D:\code\srpsdk\sdk\core\Azure.Core\src\Shared\OperationInternal.cs:line 110 at System.Threading.Tasks.ValueTask1.get_Result() at Azure.Core.Pipeline.TaskExtensions.EnsureCompleted[T](ValueTask1 task) in D:\code\srpsdk\sdk\core\Azure.Core\src\Shared\TaskExtensions.cs:line 53 at Azure.Core.OperationInternalBase.UpdateStatus(CancellationToken cancellationToken) in D:\code\srpsdk\sdk\core\Azure.Core\src\Shared\OperationInternalBase.cs:line 105 at Azure.Core.OperationPoller.<WaitForCompletionAsync>d__11.MoveNext() in D:\code\srpsdk\sdk\core\Azure.Core\src\Shared\OperationPoller.cs:line 83 at System.Threading.Tasks.ValueTask1.get_Result() at Azure.Core.Pipeline.TaskExtensions.EnsureCompleted[T](ValueTask1 task) in D:\code\srpsdk\sdk\core\Azure.Core\src\Shared\TaskExtensions.cs:line 53 at Azure.Core.OperationPoller.WaitForCompletionResponse(OperationInternalBase operation, Nullable1 delayHint, CancellationToken cancellationToken) in D:\code\srpsdk\sdk\core\Azure.Core\src\Shared\OperationPoller.cs:line 35 at Azure.Core.OperationInternalBase.d__19.MoveNext() in D:\code\srpsdk\sdk\core\Azure.Core\src\Shared\OperationInternalBase.cs:line 202 at System.Threading.Tasks.ValueTask1.get_Result() at Azure.Core.Pipeline.TaskExtensions.EnsureCompleted[T](ValueTask1 task) in D:\code\srpsdk\sdk\core\Azure.Core\src\Shared\TaskExtensions.cs:line 53 at Azure.Core.OperationInternalBase.WaitForCompletionResponse(CancellationToken cancellationToken) in D:\code\srpsdk\sdk\core\Azure.Core\src\Shared\OperationInternalBase.cs:line 165 at Azure.ResourceManager.Storage.StorageArmOperation.WaitForCompletionResponse(CancellationToken cancellationToken) in D:\code\srpsdk\sdk\storage\Azure.ResourceManager.Storage\src\Generated\LongRunningOperation\StorageArmOperation.cs:line 83 at Azure.ResourceManager.Storage.StorageTaskAssignmentResource.Delete(WaitUntil waitUntil, CancellationToken cancellationToken) in D:\code\srpsdk\sdk\storage\Azure.ResourceManager.Storage\src\Generated\StorageTaskAssignmentResource.cs:line 251 ... ... [Call Stack Truncated]

jsquire commented 1 week ago

@ArthurMa1978: According to that recording, the 204 response does not contain the Azure-AsyncOperation header. https://github.com/Azure/azure-sdk-assets/blob/net/storage/Azure.ResourceManager.Storage_3e3c747f5a/net/sdk/storage/Azure.ResourceManager.Storage/tests/SessionRecords/StorageTaskAssignmentTests/CreateUpdateGetDeleteTaskAssignement.json#L1086

jsquire commented 1 week ago

Oh, I think that I see what's going on now. I grabbed the wrong response. That 204 is from a GET operation. The DELETE operations in that recording all return either a 200 or 202. The one that I think correlates to this is a 202 that includes an Azure-AsyncOperation header and a null body.

https://github.com/Azure/azure-sdk-assets/blob/net/storage/Azure.ResourceManager.Storage_3e3c747f5a/net/sdk/storage/Azure.ResourceManager.Storage/tests/SessionRecords/StorageTaskAssignmentTests/CreateUpdateGetDeleteTaskAssignement.json#L1306