Azure / azure-sdk-for-net

This repository is for active development of the Azure SDK for .NET. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/dotnet/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-net.
MIT License
5.48k stars 4.81k forks source link

[BUG] ADLS SDK does not add Content-Length = 0 for flush requests explicitly #43162

Open cool-mist opened 7 months ago

cool-mist commented 7 months ago

Overview

The c# SDK code to flush data is buggy, but the bug is masked by an implementation detail of HttpClientHandler.

Library name and version

Azure.Storage.Files.DataLake : 12.17.1 Possibly affects all earlier versions

Details

When using the FlushAsync, the c sharp SDK does not explicitly add the Content-Length = 0 that is documented in the rest api documentation -

Content-Length -- Required for "Append Data" and "Flush Data". Must be 0 for "Flush Data". Must be the length of the request content in bytes for "Append Data".

However, the SDK code CreateFlushDataRequest does not take care of adding the Content-Length header. For flush, it should always be set to 0.

This is not an issue when using the SDK under default settings, when the HttpClientTransport is not overridden. When it is not overridden, the SDK uses the default HttpClientHandler that adds the Content-Length header with value 0 when the header is not already present, in both .NET Framework and .NET

.NET

It uses SocketsHttpHandler by default and this line adds the Content-Length header with value 0

.NET Framework

It uses [HttpClientHandler]() by default and this line performs similar handling.

The problem

When configuring non-default transports (such as WinHttpHandler), calling flush results in SDK throwing the following exception

Unhandled exception. Azure.RequestFailedException: An HTTP header that's mandatory for this request is not specified.
RequestId:572f3c83-301f-006e-0521-857ef0000000
Time:2024-04-02T17:13:11.6888595Z
Status: 400 (An HTTP header that's mandatory for this request is not specified.)
ErrorCode: MissingRequiredHeader

Additional Information:
HeaderName: Content-Length

Content:
{"error":{"code":"MissingRequiredHeader","message":"An HTTP header that's mandatory for this request is not specified.\nRequestId:572f3c83-301f-006e-0521-857ef0000000\nTime:2024-04-02T17:13:11.6888595Z","detail":{"HeaderName":"Content-Length"}}}

Reproduction Steps

Added it here as a gist.

Platform : Dotnet 8

Dependencies

<PackageReference Include="Azure.Identity" Version="1.10.4" />
<PackageReference Include="Azure.Storage.Files.DataLake" Version="12.17.1" />
<PackageReference Include="System.Net.Http" Version="4.3.4" />
<PackageReference Include="System.Net.Http.WinHttpHandler" Version="8.0.0" />

Running dotnet run will succeed as it uses the defaults. Running dotnet run -- w will fail as it uses the WinHttpHandler.

The logs print the request message before being passed on to the underlying handler. In both cases, we can see that for the Flush request, Content-Length header is not added explicitly.

Request: PATCH https://sndeltest.dfs.core.windows.net/test/test/test.txt?action=append&position=0
x-ms-version: 2023-11-03
Accept: application/json
x-ms-client-request-id: b6843cbe-f2ef-4934-8be2-52e8ef30c41e
x-ms-return-client-request-id: true
User-Agent: azsdk-net-Storage.Files.DataLake/12.17.1, (.NET 8.0.0; Microsoft Windows 10.0.22631)

The following exception will be printed when running dotnet run -- w

Unhandled exception. Azure.RequestFailedException: An HTTP header that's mandatory for this request is not specified.
RequestId:572f3c83-301f-006e-0521-857ef0000000
Time:2024-04-02T17:13:11.6888595Z
Status: 400 (An HTTP header that's mandatory for this request is not specified.)
ErrorCode: MissingRequiredHeader

Additional Information:
HeaderName: Content-Length

Content:
{"error":{"code":"MissingRequiredHeader","message":"An HTTP header that's mandatory for this request is not specified.\nRequestId:572f3c83-301f-006e-0521-857ef0000000\nTime:2024-04-02T17:13:11.6888595Z","detail":{"HeaderName":"Content-Length"}}}

Fix

Explicitly add Content-Length = 0 in this method that is responsible for creating the FlushRequest.

Other notes

The current SDK code is accepting a long? contentLength as an argument while creating a http request for the flush operation. That is unnecessary, as it should always be 0 anyways.

internal HttpMessage CreateFlushDataRequest(int? timeout, long? position, bool? retainUncommittedData, bool? close, long? contentLength, byte[] contentMD5, string leaseId, DataLakeLeaseAction? leaseAction, long? leaseDuration, string proposedLeaseId, string cacheControl, string contentType, string contentDisposition, string contentEncoding, string contentLanguage, string ifMatch, string ifNoneMatch, DateTimeOffset? ifModifiedSince, DateTimeOffset? ifUnmodifiedSince, string encryptionKey, string encryptionKeySha256, EncryptionAlgorithmTypeInternal? encryptionAlgorithm)

github-actions[bot] commented 7 months ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @xgithubtriage.

cool-mist commented 7 months ago

Not sure if the concerned azure sdk teams are notified, or if the teams care about issues reported on Github. This is a 30 second fix on the SDK code as outlined in the issue description.

For anyone else coming across this issue, a workaround of manually setting the Content to string.Empty works whenever this issue is seen


request.Content = new StringContent(string.Empty);

This should be set through a custom DelegatingHandler that is part of the http pipeline constructed and passed in to override the SDK transport.

cool-mist commented 4 months ago

Any plans to fix this ?