Closed mingchuno closed 6 days ago
We've also encountered this crash on AWS Lambda, and even when queueSize
is set to 1. We've found it's much easier to reproduce with larger values of queueSize
though.
@rcausey Can you provide more details on your use case? You use the lambda to upload multipart file to S3 too? What do queueSize
mean? You are connecting SQS / AWS MQ to lambda?
@rcausey Can you provide more details on your use case? You use the lambda to upload multipart file to S3 too? What do
queueSize
mean? You are connecting SQS / AWS MQ to lambda?
Yes, we use @aws-sdk/lib-storage
to do multipart uploads to S3 on Lambda.
queueSize
is a parameter that controls concurrency in @aws-sdk/lib-storage
uploads; i.e.
const parallelUploads3 = new Upload({
client: new S3({}) || new S3Client({}),
tags: [...], // optional tags
queueSize: 4, // optional concurrency configuration
partSize: 5MB, // optional size of each part
leavePartsOnError: false, // optional manually handle dropped parts
params: target,
});
We first ran into this issue when we switched from v2 to the v3 SDK. We couldn't reproduce it when we set queueSize
to 1, so this is how we "fixed" it, since this was sufficient for our needs. But we recently encountered the issue in our production environment even with queueSize
set to 1, so it seems like it can still happen (albeit much less frequently) without concurrent uploads.
@rcausey Can we isolate a simple reproducer in cloud env that can (with high confident) reproduce the issue?
We are seeing this also. In our case, we have a Lambda handling a message from an SQS queue, and sometimes the handler dies with the error the OP described. I am pasting a stack trace from the CloudWatch log, but it looks like the node
binary you are running our code under is stripped (not that it shouldn't have been, but it doesn't help debug the issue).
How can we assist you in getting to the bottom of this? For example, can you guide us in building a custom Node.js runtime in which the node
binary isn't stripped but which is otherwise as close as we can get to the run-time you are using? It might be helpful to have a better stack trace, I am guessing...
Here's the stack, such as it is from a CloudWatch log for a run of Lambda where this issue hit:
/var/lang/bin/node[8]: ../src/node_http_parser.cc:510:static void node::{anonymous}::Parser::Execute(const v8::FunctionCallbackInfo<v8::Value>&): Assertion `parser->current_buffer_.IsEmpty()' failed.
1: 0x55913ada7673 node::Abort() [/var/lang/bin/node]
2: 0x55913ada7702 [/var/lang/bin/node]
3: 0x55913adc4750 [/var/lang/bin/node]
4: 0x55913afe3619 [/var/lang/bin/node]
5: 0x55913afe53a1 v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [/var/lang/bin/node]
6: 0x55913b91a7d9 [/var/lang/bin/node]
[...]
RequestId: <id_elided> Error: Runtime exited with error: signal: aborted (core dumped)
Runtime.ExitError
I encountered the error as well. Found a good hint that it can be avoided, by setting the chunk size. It said the error happens when its exactly 5MB per upload.
Do we know the reason behind the chunksize hack?
On Tue, Jan 11, 2022 at 5:02 PM Martin Kuhn @.***> wrote:
nodejs/node#39671 (comment) https://github.com/nodejs/node/issues/39671#issuecomment-992693966
— Reply to this email directly, view it on GitHub https://github.com/aws/aws-sdk-js-v3/issues/2843#issuecomment-1010523684, or unsubscribe https://github.com/notifications/unsubscribe-auth/AD6E3KTYPVFTJETY4Y7VYD3UVTHI3ANCNFSM5E6NCK3Q . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
You are receiving this because you are subscribed to this thread.Message ID: @.***>
I think the assert is triggered when it's sending a Multipart Upload with at least 2 chunks and the body happens to be empty for some unknown reason. The scenario below is reproducing the error all the time. Weirdly enough, if I change the chunk size to 5.5 1024 1024 it has no error, but the log message indicates it is uploading the file twice? The file to upload is 7mb and the progress.loaded is 14mb after it finished
const file = await fs.open("myFile.json", 'r'); // file is 7mb
const stream = file.createReadStream({ encoding: null }); // get byte stream
await this.waitForReadable(stream);
await new Upload({
client: this.s3Client,
leavePartsOnError: false,
tags: args.tags,
params: {
Key: this.fullPath(args.fileName),
Bucket: this.bucketName,
Body: stream,
},
}).done();
/Users/martinkuhn/.nvm/versions/node/v16.13.1/bin/node[76192]: ../src/node_http_parser.cc:517:static void node::(anonymous namespace)::Parser::Execute(const FunctionCallbackInfo<v8::Value> &): Assertion `parser->current_buffer_.IsEmpty()' failed.
1: 0x107274815 node::Abort() (.cold.1) [/Users/martinkuhn/.nvm/versions/node/v16.13.1/bin/node]
2: 0x105f73aa9 node::Abort() [/Users/martinkuhn/.nvm/versions/node/v16.13.1/bin/node]
3: 0x105f738e1 node::Assert(node::AssertionInfo const&) [/Users/martinkuhn/.nvm/versions/node/v16.13.1/bin/node]
4: 0x105f901ff node::(anonymous namespace)::Parser::Execute(v8::FunctionCallbackInfo<v8::Value> const&) [/Users/martinkuhn/.nvm/versions/node/v16.13.1/bin/node]
5: 0x10615d239 v8::internal::FunctionCallbackArguments::Call(v8::internal::CallHandlerInfo) [/Users/martinkuhn/.nvm/versions/node/v16.13.1/bin/node]
6: 0x10615cd06 v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::HeapObject>, v8::internal::Handle<v8::internal::FunctionTemplateInfo>, v8::internal::Handle<v8::internal::Object>, v8::internal::BuiltinArguments) [/Users/martinkuhn/.nvm/versions/node/v16.13.1/bin/node]
7: 0x10615c47f v8::internal::Builtin_HandleApiCall(int, unsigned long*, v8::internal::Isolate*) [/Users/martinkuhn/.nvm/versions/node/v16.13.1/bin/node]
8: 0x1069cd399 Builtins_CEntry_Return1_DontSaveFPRegs_ArgvOnStack_BuiltinExit [/Users/martinkuhn/.nvm/versions/node/v16.13.1/bin/node]
^CWaiting for the debugger to disconnect...
Waiting for the debugger to disconnect...
My bad the error still happens, it still seems to be random :O What I noticed while debugging the library is that an empty PUT request is sent when the stream ended. Although it should never be triggered? Will debug tonight a bit more and double check.
https://github.com/aws/aws-sdk-js-v3/blob/main/lib/lib-storage/src/Upload.ts line 136
// Use put instead of multi-part for one chunk uploads.
if (dataPart.partNumber === 1 && dataPart.lastPart) {
return await this.__uploadUsingPut(dataPart);
}
Maybe its releated. I have called the done function twice when doing ´new Upload().done().done()´ This caused an empty stream being sent. Thus deleting the uploaded file again and perhaps a potential trigger for the Assert error.
Yep, I haven't encountered any error since I made sure .done() is never called twice for the same upload object.
I'm still encountering this issue. We've worked around it for now by creating a custom HttpHandler
that uses undici. Would AWS be interested in such a handler to workaround this issue, or is there still active research going into why this issue is happening?
I could still reproduce this bug using '@aws-sdk/lib-storage' library, but this was not being reproduced using AWS-SDK of 2 version. Steps to reproduce - create ~1500 of 0-length files. In that case it would crash randomly - on 50th file, 250th file, 853d file, etc. So this bug is not stable.
If you put some data to each file - 8KB-5MB - this bug isn't reproducible.
Surprisingly enough it's working pretty stable without any bugs with AWS-SDK 2 ver.
NodeJS v18.6.0 has a fix for nodejs/node#39671 (nodejs/node@491c7619c4)
Hi there,
I see this issue never got addressed by anyone from the SDK team. The last comment was made in 2022 and suggested a fix from Node's side.
Is this still an issue?
Thanks, Ran~
This issue has not received a response in 1 week. If you still think there is a problem, please leave a comment to avoid the issue from automatically closing.
Confirm by changing [ ] to [x] below to ensure that it's a bug:
Describe the bug This is actually an issue in https://github.com/nodejs/node/issues/39671
When using
@aws-sdk/lib-storage
to perform multipart upload to S3. NodeJs throw the following exception:This is from
node/http
module. Interestingly, this only happen on cloud env but not local dev env work fine. (Further investigation is needed to verify). It run fine almost every time in my local machine (Macbook Pro 16" Intel running node v14.17.5) with only one time I have seen the exception. But in the production env running EKS over EC2, it fail consistancly at the same point no matter the data size.This MAY NOT be an issue of
@aws-sdk/lib-storage
but it seems to be only happen when using aws sdk together in cloud env.I have tried changing my application logic to only perform 1 upload at a time and it did not crash even in cloud env (AWS EKS running on EC2). So I suspect it is a concurrent issue of some sort in either aws-sdk or nodejs level. It would be great if we (user, aws-sdk, nodejs) can work together and find a re-producer first.
Is the issue in the browser/Node.js? Node.js and possible only happens on cloud (AWS?) ENV only
If on Node.js, are you running this on AWS Lambda? No
Details of the browser/Node.js version
SDK version number
@aws-sdk/lib-storage@3.33.0
for me and I am not sure othersTo Reproduce (observed behavior)
No reproducer at this moment. Looking for someone to reproduce it.
Expected behavior
No exception thrown
Screenshots
No
Additional context
Code snippet for reference