aws / aws-sdk-js-v3

Modularized AWS SDK for JavaScript.
Apache License 2.0

lib-storage progress reporting only after each chunk on server #5682

Open gauthierm opened 9 months ago

gauthierm commented 9 months ago

Describe the bug

lib-storage's Upload provides an on('httpUploadProgress', progress) event for reporting progress. With the default HTTP request handler (based on fetch), this event does not provide fine-grained progress; it fires only as each chunk completes.

See also https://github.com/aws/aws-sdk-js-v3/issues/2206
See also https://github.com/aws/aws-sdk-js-v3/issues/3101

For huge files this is tolerable (though not great), but a file in the 20 MB range produces only a few events. If the upload is slow, this means long gaps between progress reports and large jumps in reported progress on each update. This is a poor user experience.

For the browser, AWS provides an alternative request handler using XHR (https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-xhr-http-handler/) that can do better progress reporting. On the server, the default handler (https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-smithy-fetch-http-handler/) is used, and it cannot report progress between chunks.

On the server, the XHR handler cannot be used because Node.js doesn't provide the XHR API (see https://github.com/aws/aws-sdk-js-v3/issues/4473).

Setting the chunk size below 5 MB would alleviate the issue on the server a little, but this is not possible: 5 MB is the minimum part size for multipart uploads (see https://github.com/aws/aws-sdk-js-v3/issues/4316).
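To make the coarseness concrete, here is a rough sketch of the progress values a listener would see, assuming one event per completed part and a 5 MB part size as described above. The `progressSteps` helper is hypothetical and only does the arithmetic; it is not part of lib-storage.

```javascript
// Hypothetical helper (not part of @aws-sdk/lib-storage): lists the
// percentages a progress listener would observe if exactly one event
// fires per completed part.
const MIB = 1024 * 1024;

function progressSteps(totalBytes, partSize = 5 * MIB) {
  const parts = Math.ceil(totalBytes / partSize);
  const steps = [];
  for (let i = 1; i <= parts; i++) {
    // The last part may be smaller than partSize.
    const loaded = Math.min(i * partSize, totalBytes);
    steps.push(Math.round((loaded / totalBytes) * 100));
  }
  return steps;
}

console.log(progressSteps(20 * MIB)); // [ 25, 50, 75, 100 ]
```

So a 20 MB upload moves in 25-point jumps, with nothing reported in between.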

SDK version number

@aws-sdk/lib-storage@3.435.0

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

v18.17.1

Reproduction Steps

import axios from 'axios';
import { Upload } from '@aws-sdk/lib-storage';

// Stream file from URL to S3 bucket:
const response = await axios({
  method: 'GET',
  url: urlObject.toString(),
  responseType: 'stream',
  signal: controller.signal
});

const upload = new Upload({
  client: s3client,
  params: {
    ContentType: contentType,
    Body: response.data, // readable stream returned by axios
    Bucket: bucket,
    Key: key,
    Metadata: metadata
  }
});

// progress is only reported after each chunk finishes
upload.on('httpUploadProgress', (progress) => {
  console.log(progress);
});

await upload.done();

Observed Behavior

Progress gets reported after each 5 MB chunk is received.

Expected Behavior

Progress gets reported on an interval, potentially configurable.
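Until something like this exists in the SDK, one possible client-side workaround is to interpolate between the real events. The sketch below is only an estimate under stated assumptions: `createProgressEstimator` is a hypothetical helper, not an SDK API, and it assumes throughput stays near the average observed so far. It never reports more than one part beyond the last confirmed byte count, so the estimate cannot run far ahead of reality.

```javascript
// Hypothetical helper (not part of the AWS SDK): estimates bytes uploaded
// between real httpUploadProgress events from the average throughput so far.
function createProgressEstimator(totalBytes, partSize = 5 * 1024 * 1024) {
  let startedAt = 0;      // ms timestamp when the upload started
  let confirmedAt = 0;    // ms timestamp of the last real event
  let confirmedBytes = 0; // bytes confirmed by the last real event
  let bytesPerMs = 0;     // average throughput observed so far

  return {
    start(nowMs) {
      startedAt = nowMs;
      confirmedAt = nowMs;
    },
    // Feed in the byte count from each real progress event.
    record(loadedBytes, nowMs) {
      confirmedBytes = loadedBytes;
      confirmedAt = nowMs;
      const elapsed = nowMs - startedAt;
      bytesPerMs = elapsed > 0 ? loadedBytes / elapsed : 0;
    },
    // Call on a timer to get an estimated byte count between events.
    estimate(nowMs) {
      const projected = confirmedBytes + bytesPerMs * (nowMs - confirmedAt);
      // Stay within 95% of one part past confirmed progress, and within the total.
      const cap = Math.min(totalBytes, confirmedBytes + partSize * 0.95);
      return Math.min(projected, cap);
    }
  };
}
```

In use, one would call start(Date.now()) before upload.done(), call record(progress.loaded, Date.now()) inside the httpUploadProgress listener, and poll estimate(Date.now()) from a setInterval. Before the first real event the estimate stays at 0, because no throughput has been observed yet.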

Possible Solution

Provide a request handler using Node's HTTP request APIs rather than Node fetch.

Additional Information/Context

No response

RanVaknin commented 9 months ago

Hi @gauthierm,

Thanks for reaching out. I have converted this issue into a feature request since this is not a bug but the expected behavior.

I think this is a nice-to-have, but feature request prioritization is community driven. I'll keep this open and see if this gets more traction.

Thanks again,
Ran~