In WinSyncHttpClient, chunk size essentially is defined by bytesToRead.
Before the fix, the client subtracted 8 bytes for chunk metadata from an already allocated 8 KB buffer, resulting in a chunk size of less than 8 KB (8192 - 8 = 8184 bytes).
And, for example, AWS S3 service endpoint indeed returned an error when SDK PutObject() made multiple chunk upload (if using flexible checksums and object size > 8184):
trace log snippet
```
[TRACE] 2024-03-17 21:38:18.657 WinHttpSyncHttpClient [6492] Making PUT request to uri https://put-object-checksum2.s3.eu-west-2.amazonaws.com/object-name
...
[DEBUG] 2024-03-17 21:38:18.658 WinHttpSyncHttpClient [6492] with headers:
[DEBUG] 2024-03-17 21:38:18.658 WinHttpSyncHttpClient [6492] amz-sdk-invocation-id: 892204CA-DD86-43A6-8CE9-59BFCEBE55B6
amz-sdk-request: attempt=1
authorization: AWS4-HMAC-SHA256 Credential=.../20240317/eu-west-2/s3/aws4_request, SignedHeaders=amz-sdk-invocation-id;amz-sdk-request;content-encoding;content-type;host;transfer-encoding;x-amz-content-sha256;x-amz-date;x-amz-decoded-content-length;x-amz-sdk-checksum-algorithm;x-amz-security-token;x-amz-trailer, Signature=398bd12331cc2fce53b0bd98bb399836811e34ae2020f363c018a85fa7e116e5
content-encoding: aws-chunked
content-type: binary/octet-stream
host: put-object-checksum2.s3.eu-west-2.amazonaws.com
transfer-encoding: chunked
user-agent: aws-sdk-cpp/1.11.273 ua/2.0 md/aws-crt#0.26.2 os/Windows#10.0.20348.1070 md/arch#AMD64 lang/c++#C++199711L md/MSVC#1929 cfg/retry-mode#default exec-env/EC2 api/S3
x-amz-content-sha256: STREAMING-UNSIGNED-PAYLOAD-TRAILER
x-amz-date: 20240317T213818Z
x-amz-decoded-content-length: 8185
x-amz-sdk-checksum-algorithm: CRC32C
x-amz-security-token: ...
x-amz-trailer: x-amz-checksum-crc32c
[DEBUG] 2024-03-17 21:38:18.677 WinHttpSyncHttpClient [6492] Received response code 403
[DEBUG] 2024-03-17 21:38:18.677 WinHttpSyncHttpClient [6492] Received content type application/xml
[DEBUG] 2024-03-17 21:38:18.677 WinHttpSyncHttpClient [6492] Received headers:
[DEBUG] 2024-03-17 21:38:18.677 WinHttpSyncHttpClient [6492] HTTP/1.1 403 Forbidden
Connection: close
Date: Sun, 17 Mar 2024 21:38:18 GMT
Transfer-Encoding: chunked
Content-Type: application/xml
Server: AmazonS3
x-amz-request-id: HAHMMX0XV8YV55BE
x-amz-id-2: f3XrHIrUlDTkGMQtN9/WYDrbJzmWBfBKO4H7GIfvDp9NP6UP1hNUBTmXlfWPxDVaH9XwssbHOAX2ld5gYf4rtg==
...
[DEBUG] 2024-03-17 21:38:18.678 AWSClient [6492] Request returned error. Attempting to generate appropriate error codes from response
[TRACE] 2024-03-17 21:38:18.678 AWSErrorMarshaller [6492] Error response is
InvalidChunkSizeErrorOnly the last chunk is allowed to have a size less than 8192 bytes28184HAHMMX0XV8YV55BEf3XrHIrUlDTkGMQtN9/WYDrbJzmWBfBKO4H7GIfvDp9NP6UP1hNUBTmXlfWPxDVaH9XwssbHOAX2ld5gYf4rtg==
[WARN] 2024-03-17 21:38:18.679 AWSErrorMarshaller [6492] Encountered Unknown AWSError 'InvalidChunkSizeError': Only the last chunk is allowed to have a size less than 8192 bytes
[ERROR] 2024-03-17 21:38:18.679 AWSXmlClient [6492] HTTP response code: 403
Resolved remote host IP address:
Request ID: HAHMMX0XV8YV55BE
Exception name: InvalidChunkSizeError
Error message: Unable to parse ExceptionName: InvalidChunkSizeError Message: Only the last chunk is allowed to have a size less than 8192 bytes
7 response headers:
connection : close
content-type : application/xml
date : Sun, 17 Mar 2024 21:38:18 GMT
server : AmazonS3
transfer-encoding : chunked
x-amz-id-2 : f3XrHIrUlDTkGMQtN9/WYDrbJzmWBfBKO4H7GIfvDp9NP6UP1hNUBTmXlfWPxDVaH9XwssbHOAX2ld5gYf4rtg==
x-amz-request-id : HAHMMX0XV8YV55BE
```
Proposed fix adds 8 bytes at the moment of buffer allocation.
Because WinSyncHttpClient::StreamPayloadToRequest() currently handles aws-chunked encoding, its streamBuffer should be prepared to handle both chunked and non-chunked payloads. A tradeoff for non-chunked is extra 8 bytes allocated on the stack, addressing this using alloca() or vector would create more problems than it solves.
Possible alternative is to use multiple DoWriteData() calls or even move aws-chunked content encoding handling to DoWriteData(), as is already done for chunked transfer encoding. But this would be a bigger change.
Another alternative, easier to implement but probably harder to justify, is simply:
Basically, use any buffer length that’s bigger than 8 KB + chunked metadata length. This is how CURL client works despite having the same logic. I just took 64 KB because it’s recommended in the linked AWS doc.
I think this can be tested only with integration tests. Personally, I tested with modified BucketAndObjectOperationTest.TestFlexibleChecksums. If it’s ok then I can add that to this PR as well.
Check all that applies:
[x] Did a review by yourself.
[ ] Added proper tests to cover this PR. (If tests are not applicable, explain.)
[x] Checked if this PR is a breaking (APIs have been changed) change.
[x] Checked if this PR will not introduce cross-platform inconsistent behavior.
[x] Checked if this PR would require a ReadMe/Wiki update.
Check which platforms you have built SDK on to verify the correctness of this PR.
[ ] Linux
[x] Windows
[ ] Android
[ ] MacOS
[ ] IOS
[ ] Other Platforms
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Issue #, if available: N/A
Description of changes:
aws-chunked encoding documentation is pretty scarce, the best article that I could find - https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html - says the chunk size must be at least 8 KB.
In WinSyncHttpClient, chunk size essentially is defined by
bytesToRead
.Before the fix, the client subtracted 8 bytes for chunk metadata from an already allocated 8 KB buffer, resulting in a chunk size of less than 8 KB (8192 - 8 = 8184 bytes).
And, for example, AWS S3 service endpoint indeed returned an error when SDK PutObject() made multiple chunk upload (if using flexible checksums and object size > 8184):
trace log snippet
``` [TRACE] 2024-03-17 21:38:18.657 WinHttpSyncHttpClient [6492] Making PUT request to uri https://put-object-checksum2.s3.eu-west-2.amazonaws.com/object-name ... [DEBUG] 2024-03-17 21:38:18.658 WinHttpSyncHttpClient [6492] with headers: [DEBUG] 2024-03-17 21:38:18.658 WinHttpSyncHttpClient [6492] amz-sdk-invocation-id: 892204CA-DD86-43A6-8CE9-59BFCEBE55B6 amz-sdk-request: attempt=1 authorization: AWS4-HMAC-SHA256 Credential=.../20240317/eu-west-2/s3/aws4_request, SignedHeaders=amz-sdk-invocation-id;amz-sdk-request;content-encoding;content-type;host;transfer-encoding;x-amz-content-sha256;x-amz-date;x-amz-decoded-content-length;x-amz-sdk-checksum-algorithm;x-amz-security-token;x-amz-trailer, Signature=398bd12331cc2fce53b0bd98bb399836811e34ae2020f363c018a85fa7e116e5 content-encoding: aws-chunked content-type: binary/octet-stream host: put-object-checksum2.s3.eu-west-2.amazonaws.com transfer-encoding: chunked user-agent: aws-sdk-cpp/1.11.273 ua/2.0 md/aws-crt#0.26.2 os/Windows#10.0.20348.1070 md/arch#AMD64 lang/c++#C++199711L md/MSVC#1929 cfg/retry-mode#default exec-env/EC2 api/S3 x-amz-content-sha256: STREAMING-UNSIGNED-PAYLOAD-TRAILER x-amz-date: 20240317T213818Z x-amz-decoded-content-length: 8185 x-amz-sdk-checksum-algorithm: CRC32C x-amz-security-token: ... x-amz-trailer: x-amz-checksum-crc32c [DEBUG] 2024-03-17 21:38:18.677 WinHttpSyncHttpClient [6492] Received response code 403 [DEBUG] 2024-03-17 21:38:18.677 WinHttpSyncHttpClient [6492] Received content type application/xml [DEBUG] 2024-03-17 21:38:18.677 WinHttpSyncHttpClient [6492] Received headers: [DEBUG] 2024-03-17 21:38:18.677 WinHttpSyncHttpClient [6492] HTTP/1.1 403 Forbidden Connection: close Date: Sun, 17 Mar 2024 21:38:18 GMT Transfer-Encoding: chunked Content-Type: application/xml Server: AmazonS3 x-amz-request-id: HAHMMX0XV8YV55BE x-amz-id-2: f3XrHIrUlDTkGMQtN9/WYDrbJzmWBfBKO4H7GIfvDp9NP6UP1hNUBTmXlfWPxDVaH9XwssbHOAX2ld5gYf4rtg== ... [DEBUG] 2024-03-17 21:38:18.678 AWSClient [6492] Request returned error. Attempting to generate appropriate error codes from response [TRACE] 2024-03-17 21:38:18.678 AWSErrorMarshaller [6492] Error response isInvalidChunkSizeError
Proposed fix adds 8 bytes at the moment of buffer allocation.
Because WinSyncHttpClient::StreamPayloadToRequest() currently handles aws-chunked encoding, its streamBuffer should be prepared to handle both chunked and non-chunked payloads. A tradeoff for non-chunked is extra 8 bytes allocated on the stack, addressing this using alloca() or vector would create more problems than it solves.
Possible alternative is to use multiple DoWriteData() calls or even move aws-chunked content encoding handling to DoWriteData(), as is already done for chunked transfer encoding. But this would be a bigger change.
Another alternative, easier to implement but probably harder to justify, is simply:
Basically, use any buffer length that’s bigger than 8 KB + chunked metadata length. This is how CURL client works despite having the same logic. I just took 64 KB because it’s recommended in the linked AWS doc.
I think this can be tested only with integration tests. Personally, I tested with modified BucketAndObjectOperationTest.TestFlexibleChecksums. If it’s ok then I can add that to this PR as well.
Check all that applies:
Check which platforms you have built SDK on to verify the correctness of this PR.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.