aws / aws-sdk-js-v3

Modularized AWS SDK for JavaScript.
Apache License 2.0

Race conditions in S3 upload #6115

Open jirimoravcik opened 1 month ago

jirimoravcik commented 1 month ago

Checkboxes for prior research

Describe the bug

We're getting errors from AWS SDK when using:

import { Upload } from '@aws-sdk/lib-storage';

We create an instance of S3 client via:

const s3Client = new S3({ apiVersion: '2016-04-01' });

then the code (where params is of type PutObjectCommandInput):

const upload = new Upload({
    client: s3Client,
    params,
});

await upload.done();

These are most likely race conditions in the SDK, as we get sporadic, odd errors when we perform many simultaneous uploads to S3. The errors we have observed are listed under Observed Behavior below.

SDK version number

"@aws-sdk/client-s3": "3.565.0", "@aws-sdk/lib-storage": "3.565.0"

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

20.12.2

Reproduction Steps

We upgraded our Node.js version from 14 to 20, and these errors started appearing.

Observed Behavior

Errors we have observed, which seem to be race conditions:

  1. "name":"400","message":"UnknownError"
  2. "name":"XAmzContentSHA256Mismatch","message":"The provided 'x-amz-content-sha256' header does not match what was computed."
  3. "name":"Error","message":"char 'H' is not expected.:1:1\n Deserialization error: to see the raw response, inspect the hidden field {error}.$response on this object."

Expected Behavior

No errors occur.

Possible Solution

No response

Additional Information/Context

This issue started to appear when we upgraded Node.js from version 14 to 20.

It appears someone else reported this earlier but the report was not addressed; see https://github.com/aws/aws-sdk-js-v3/issues/5455

kuhe commented 1 month ago

Is this on Lambda?

many simultaneous uploads to S3

What does the concurrency code look like?

What are the JS class types and sizes of the uploaded bodies?

Please create a repository with an executable minimal reproduction of the issue.

jirimoravcik commented 1 month ago

Is this on Lambda?

many simultaneous uploads to S3

What does the concurrency code look like?

What are the JS class types and sizes of the uploaded bodies?

Please create a repository with an executable minimal reproduction of the issue.

It is a normal Node.js application running in Kubernetes (EKS). The concurrency code has ~100 promises running in parallel, each one uploading data to S3. The data is a small JSON file (<1 MB). Specifically, in the Node.js code we use Readable.from(data), where data is a plain string containing the JSON, which is then passed to Upload from @aws-sdk/lib-storage.
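For reference, a rough sketch of that pattern (the bucket name, keys, and payloads here are placeholders, not our actual values):

import { Readable } from 'node:stream';
import { S3 } from '@aws-sdk/client-s3';
import { Upload } from '@aws-sdk/lib-storage';

const s3Client = new S3({});

// Uploads one small JSON string (<1 MB) wrapped in a Readable stream
async function uploadJson(key, data) {
    const upload = new Upload({
        client: s3Client,
        params: {
            Bucket: 'example-bucket',
            Key: key,
            Body: Readable.from(data),
            ContentType: 'application/json',
        },
    });
    await upload.done();
}

// ~100 uploads running in parallel
await Promise.all(
    Array.from({ length: 100 }, (_, i) =>
        uploadJson(`items/${i}.json`, JSON.stringify({ index: i })),
    ),
);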

As for the minimal reproduction, the issue happens very rarely, and we haven't been able to produce good reproduction code just yet; we're working on that.

On the other hand, the issue started happening when we upgraded Node.js from 14 to 20; we haven't touched the code we use or the AWS SDK versions.

merlinio2000 commented 1 month ago

We experienced the same issue on Lambda without any concurrency at all (except what Upload's queueSize introduces). In some cases the Lambda even seems to exit early (without an error) while awaiting upload.done() in a loop. My current suspicion is that there is some kind of floating promise somewhere in client-s3/lib-storage, because that would also explain the odd Lambda behavior.

Note: for us this only seems to happen when the Lambda handles a few requests sequentially in the same execution context, though because of its random nature it's hard to say definitively.
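For context, the pattern is roughly the following (a simplified sketch, not our actual code; bucket, keys, and payloads are placeholders):

import { S3 } from '@aws-sdk/client-s3';
import { Upload } from '@aws-sdk/lib-storage';

// Client reused across invocations of the same execution context
const s3Client = new S3({});

export const handler = async (event) => {
    // Sequential uploads; no concurrency beyond Upload's internal queueSize
    for (const item of event.items) {
        await new Upload({
            client: s3Client,
            params: { Bucket: 'example-bucket', Key: item.key, Body: item.body },
        }).done();
    }
    return { uploaded: event.items.length };
};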

jirimoravcik commented 1 month ago

Hey, just posting a small update:

kuhe commented 1 month ago

Readable.from(data), where data is a normal string with the JSON ... of small size <1MB

This won't help identify the problem, but as an aside: if the data is originally a string, i.e. already buffered into memory, you don't need to convert it to a Readable stream object to pass it to the SDK.

await new Upload({
  client: s3Client, // the Upload helper requires an S3 or S3Client instance
  params: {
    Bucket, Key, Body: "a string"
  }
}).done();

should work as well. In fact, if it's a small (<5 MB) string, you might as well call

const s3 = new S3({});

await s3.putObject({ Bucket, Key, Body: "a string" });

This is because Upload calls putObject internally for data smaller than the minimum multipart upload part size.
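Applied to the original snippet, that simplification could look something like this (a sketch; the Bucket, Key, and payload values are placeholders based on the description above):

import { S3 } from '@aws-sdk/client-s3';

const s3 = new S3({});
const data = JSON.stringify({ some: 'result' }); // the small JSON string, already in memory

// Pass the string directly as Body; no Readable.from(...) wrapper needed
await s3.putObject({
    Bucket: 'example-bucket',
    Key: 'items/0.json',
    Body: data,
    ContentType: 'application/json',
});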

fnesveda commented 1 month ago

I've prepared a reproduction here: https://github.com/apify/s3-node-20-bug-repro

Please let me know if there are any issues running it.