aws-amplify / amplify-js

A declarative JavaScript library for application development using cloud services.
https://docs.amplify.aws/lib/q/platform/js
Apache License 2.0
9.41k stars 2.11k forks source link

Storage.put Quietly Failing to Upload Images #13577

Closed matt-at-allera closed 4 weeks ago

matt-at-allera commented 1 month ago

Before opening, please confirm:

JavaScript Framework

React

Amplify APIs

Storage

Amplify Version

v5

Amplify Categories

auth, storage, function, api, hosting

Backend

Amplify CLI

Environment information

``` # Put output below this line System: OS: macOS 13.2.1 CPU: (8) arm64 Apple M1 Memory: 117.73 MB / 8.00 GB Shell: 5.8.1 - /bin/zsh Binaries: Node: 20.7.0 - ~/.nvm/versions/node/v20.7.0/bin/node npm: 10.1.0 - ~/.nvm/versions/node/v20.7.0/bin/npm Browsers: Chrome: 126.0.6478.127 Safari: 16.3 npmPackages: @aws-amplify/core: ^5.8.2 => 5.8.12 @aws-amplify/core/internals/aws-client-utils: undefined () @aws-amplify/core/internals/aws-client-utils/composers: undefined () @aws-amplify/core/internals/aws-clients/pinpoint: undefined () @aws-amplify/ui-react: ^5.1.1 => 5.3.3 @aws-amplify/ui-react-internal: undefined () @aws-amplify/ui-react-storage: ^2.1.4 => 2.3.3 @aws-sdk/client-cognito-identity-provider: ^3.515.0 => 3.569.0 @aws-sdk/client-ses: ^3.515.0 => 3.569.0 @aws-sdk/lib-dynamodb: ^3.515.0 => 3.569.0 @aws-sdk/util-utf8-browser: ^3.259.0 => 3.259.0 (3.6.1, 3.186.0) @babel/cli: ^7.22.10 => 7.24.5 @babel/core: ^7.22.10 => 7.24.5 @babel/plugin-proposal-private-property-in-object: ^7.21.11 => 7.21.11 (7.21.0-placeholder-for-preset-env.2) @babel/preset-env: ^7.22.10 => 7.24.5 @babel/preset-typescript: ^7.24.1 => 7.24.1 @cyntler/react-doc-viewer: ^1.14.0 => 1.14.1 @cypress/angular: 0.0.0-development @cypress/mount-utils: 0.0.0-development @cypress/react: 0.0.0-development @cypress/react18: 0.0.0-development @cypress/svelte: 0.0.0-development @cypress/vue: 0.0.0-development @cypress/vue2: 0.0.0-development @emotion/react: ^11.10.6 => 11.11.4 @emotion/styled: ^11.10.6 => 11.11.5 @mui/lab: ^5.0.0-alpha.124 => 5.0.0-alpha.170 @mui/material: ^5.11.15 => 5.15.16 @newrelic/browser-agent: ^1.260.0 => 1.260.0 @testing-library/jest-dom: ^5.16.5 => 5.17.0 @testing-library/react: ^13.4.0 => 13.4.0 @testing-library/user-event: ^13.5.0 => 13.5.0 @types/jest: ^27.5.2 => 27.5.2 @types/node: ^17.0.45 => 17.0.45 (18.19.32) @types/react: ^18.2.46 => 18.3.1 @types/react-dom: ^18.2.18 => 18.3.0 @types/react-router-dom: ^5.3.3 => 5.3.3 amazon-quicksight-embedding-sdk: ^2.6.0 => 2.7.0 aws-amplify: ^5.3.12 => 5.3.18 aws-sdk: ^2.1473.0 => 2.1615.0 aws-sdk-client-mock: ^3.0.1 => 3.1.0 axios: ^1.6.7 => 1.6.8 buffer: ^6.0.3 => 6.0.3 (4.9.2, 5.7.1) chart.js: ^4.4.3 => 4.4.3 chart.js-auto: undefined () chart.js-helpers: undefined () chartjs-plugin-datalabels: ^2.2.0 => 2.2.0 cypress: ^13.6.0 => 13.8.1 eslint-plugin-cypress: ^3.3.0 => 3.3.0 graphql: ^16.8.1 => 16.8.1 (15.8.0) html2canvas: ^1.4.1 => 1.4.1 idb: ^8.0.0 => 8.0.0 (5.0.6, 7.1.1) jest: ^29.0.0 => 29.7.0 (27.5.1) jspdf: ^2.5.1 => 2.5.1 jszip: ^3.10.1 => 3.10.1 mammoth: ^1.6.0 => 1.7.2 path-browserify: ^1.0.1 => 1.0.1 prettier: ^3.2.5 => 3.2.5 process: ^0.11.10 => 0.11.10 qrcode.react: ^3.1.0 => 3.1.0 react: ^18.2.0 => 18.3.1 (18.2.0) react-app-rewired: ^2.2.1 => 2.2.1 react-beautiful-dnd: ^13.1.1 => 13.1.1 react-chartjs-2: ^5.2.0 => 5.2.0 react-csv: ^2.2.2 => 2.2.2 react-dom: ^18.2.0 => 18.3.1 react-router-dom: ^6.24.0 => 6.24.0 react-scripts: ^5.0.1 => 5.0.1 react-select: ^5.8.0 => 5.8.0 react-signature-canvas: ^1.0.6 => 1.0.6 sass: ^1.71.1 => 1.77.0 source-map-loader: ^4.0.1 => 4.0.2 (3.0.2) swr: ^2.0.4 => 2.2.5 ts-jest: ^29.1.1 => 29.1.2 ts-loader: ^9.5.1 => 9.5.1 tsconfig-paths-webpack-plugin: ^4.1.0 => 4.1.0 typescript: ^4.9.5 => 4.9.5 web-vitals: ^2.1.4 => 2.1.4 (3.5.2) workbox-background-sync: ^6.6.0 => 6.6.0 (6.6.1) workbox-broadcast-update: ^6.6.0 => 6.6.0 workbox-cacheable-response: ^6.6.0 => 6.6.0 workbox-core: ^6.6.0 => 6.6.0 (6.6.1) workbox-expiration: ^6.6.0 => 6.6.0 workbox-google-analytics: ^6.6.1 => 6.6.1 (6.6.0) workbox-navigation-preload: ^6.6.0 => 6.6.0 workbox-precaching: ^6.6.0 => 6.6.0 workbox-range-requests: ^6.6.0 => 6.6.0 workbox-routing: ^6.6.0 => 6.6.0 (6.6.1) workbox-strategies: ^6.6.0 => 6.6.0 (6.6.1) workbox-streams: ^6.6.0 => 6.6.0 xlsx: ^0.18.5 => 0.18.5 npmGlobalPackages: @aws-amplify/cli: 12.11.1 @newrelic/publish-sourcemap: 5.1.0 aws-cdk: 2.127.0 corepack: 0.19.0 newrelic: 11.19.0 npm: 10.1.0 serve: 14.2.3 snyk: 1.1241.0 typescript: 5.2.2 ```

Describe the bug

Storage.put seemingly does not always complete the file upload process even if it returns the s3 file path as a key.

For the past week or so, when uploading images via Storage.put, we are seeing increased frequency of the image never making it to S3. The result of await Storage.put(... (example below) does output a key (s3 file path) as expected, but if I go to that location in our specified S3 bucket, the image isn't actually there. The block of code that invokes Storage.put is surrounded in a try/catch, so no error is thrown.

When accessing the missing image via Storage.get, the request throws a 403: Forbidden. I am presuming this is because there is not file to be accessed there.

Expected behavior

I would expect that if the Storage.put command returns a valid key and does not throw an error, that the image files will always be present in S3 when searched for.

Reproduction steps

Exact reproduction is difficult

  1. Capture an image on device and store the source url for it
  2. Upload the image via Storage.put(...
  3. Store the resulting key (s3 file path) for later usage
  4. Pull the lookup entry that has the s3 file path saved
  5. Fetch the image via Storage.get(...
  6. Encounter 403 for when the image is missing

Code Snippet

// Put your code below this line.
  let result = await Storage.put(filePath, file, {
    level: 'public',
    contentType: 'image/png',
  });

  fileKey = result.key;

  const fileLookupObj = {
    s3FilePath: fileKey,
    ...
  };

  // store fileLookupObj for later retrieval

Log output

``` // Put your logs below this line ```

aws-exports.js

No response

Manual configuration

Storage.configure({
  // we only allow writing files on the `public` permissions scheme
  // access is controlled via the auth IAM role
  customPrefix: {
    public: '',
  },
  track: true,
});

Additional configuration

No response

Mobile Device

IPad

Mobile Operating System

No response

Mobile Browser

Chrome

Mobile Browser Version

No response

Additional information and screenshots

No response

ashwinkumar6 commented 1 month ago

Hi @matt-at-allera Thanks for raising the issue, apologies for the inconvenience the library has caused.

Below are a few questions to help us troubleshoot the issue

  1. When Storage.put fails with no error thrown are there any network errors observed ?
  2. Does the issue happen intermittently ? Is it possible the failed Storage.put requests might be fired when the device is offline ?
  3. Could you elaborate on the file sizes that are being uploaded ?
  4. Is the issue occurring in all devices or would it be specific to Mobile (example browser on IPad) as mentioned on the issue
matt-at-allera commented 1 month ago

Hi @matt-at-allera Thanks for raising the issue, apologies for the inconvenience the library has caused.

Below are a few questions to help us troubleshoot the issue

  1. When Storage.put fails with no error thrown are there any network errors observed ?
  2. Does the issue happen intermittently ? Is it possible the failed Storage.put requests might be fired when the device is offline ?
  3. Could you elaborate on the file sizes that are being uploaded ?
  4. Is the issue occurring in all devices or would it be specific to Mobile (example browser on IPad) as mentioned on the issue

Hi @ashwinkumar6, thanks for the response & great questions.

  1. It's possible. The difficulty in observation is that we haven't been able to replicate the issue on our end successfully yet. I am working today to continue to do so. It's reported by a single customer who makes heavy use of images.
  2. The issue is intermittently, and it appears to affect specific devices for lengthy periods of time. We do support an offline-mode that seems a likely culprit to causing this issue. The offline-mode is supported by a service worker which maintains a status of the network state. If the device is offline from the service-worker's perspective, it will block it with a 503. I would assume that if this is the case, the Storage.put would fail / throw an error from the 503. Is that not the case? Does storage.put upload asynchronously?
  3. File sizes (images) are typically between 2-10mb.
  4. The issue is only being reported on IPad w/ a chrome browser.

Another finding that we encountered last night is that there were a few missing images in s3. We came back to look at the images again, and they were present after an hour or so. This could mean that they are unintentionally stuck in queue via the offline-mode or the Storage module had serious delays in uploading the file. Could there be backoff issues at play?

ashwinkumar6 commented 1 month ago

Hi @matt-at-allera Thanks for the detailed response !!

If the device is offline from the service-worker's perspective, it will block it with a 503. I would assume that if this is the case, the Storage.put would fail / throw an error from the 503. Is that not the case? Does storage.put upload asynchronously?

In this case when the device is offline would the service worker still make the Storage.put API calls. In case this API is called i'm wondering if we can extract the network errors. The storage module does not have a retry mechanism which would try to re-upload incase it fails. So Storage.put should be generating network errors when the device is offline.

We came back to look at the images again, and they were present after an hour or so. This could mean that they are unintentionally stuck in queue via the offline-mode or the Storage module had serious delays in uploading the file.

Does the service worker use any Background Synchronization API to handle tasks in offline mode. I'm wondering if this is eventually firing any PutObject service network calls (the underlying service call Storage.put makes) once the device comes back online.

Does this issue occur outside the use-case of a service worker? It would be pretty helpful if you could also provide a sample app with this setup to help reproduce. Could you also try upgrading to V6 if possible and check if you're still facing this issue?

cwomack commented 1 month ago

@matt-at-allera, wanted to ping you and see if you had a chance to review @ashwinkumar6's comment above. Let us know if you're still blocked by this! Thanks.