aws-amplify / amplify-js

A declarative JavaScript library for application development using cloud services.
https://docs.amplify.aws/lib/q/platform/js
Apache License 2.0
9.41k stars 2.12k forks

DataStore synchronization issues with DataStore under poor network conditions #13127

Open schutzelaars opened 5 months ago

schutzelaars commented 5 months ago

JavaScript Framework

Vue

Amplify APIs

DataStore

Amplify Version

v6

Amplify Categories

auth, api, hosting

Backend

Amplify CLI

Environment information

```
# Put output below this line
  System:
    OS: Windows 11 10.0.22631
    CPU: (16) x64 AMD Ryzen 7 5800H with Radeon Graphics
    Memory: 2.85 GB / 13.87 GB
  Binaries:
    Node: 20.10.0 - C:\Program Files\nodejs\node.EXE
    Yarn: 1.22.19 - ~\AppData\Roaming\npm\yarn.CMD
    npm: 10.3.0 - C:\Program Files\nodejs\npm.CMD
    pnpm: 8.10.2 - ~\AppData\Local\pnpm\pnpm.EXE
  Browsers:
    Edge: Chromium (122.0.2365.80)
    Internet Explorer: 11.0.22621.1
  npmPackages:
    aws-amplify: 6.0.20 => 6.0.20
    aws-amplify/adapter-core: undefined ()
    aws-amplify/analytics: undefined ()
    aws-amplify/analytics/kinesis: undefined ()
    aws-amplify/analytics/kinesis-firehose: undefined ()
    aws-amplify/analytics/personalize: undefined ()
    aws-amplify/analytics/pinpoint: undefined ()
    aws-amplify/api: undefined ()
    aws-amplify/api/server: undefined ()
    aws-amplify/auth: undefined ()
    aws-amplify/auth/cognito: undefined ()
    aws-amplify/auth/cognito/server: undefined ()
    aws-amplify/auth/enable-oauth-listener: undefined ()
    aws-amplify/auth/server: undefined ()
    aws-amplify/datastore: undefined ()
    aws-amplify/in-app-messaging: undefined ()
    aws-amplify/in-app-messaging/pinpoint: undefined ()
    aws-amplify/push-notifications: undefined ()
    aws-amplify/push-notifications/pinpoint: undefined ()
    aws-amplify/storage: undefined ()
    aws-amplify/storage/s3: undefined ()
    aws-amplify/storage/s3/server: undefined ()
    aws-amplify/storage/server: undefined ()
    aws-amplify/utils: undefined ()
    quasar: 2.15.0 => 2.15.0
    vue: 3.4.21 => 3.4.21 (3.4.15)
  npmGlobalPackages:
    @aws-amplify/cli: 12.10.1
    @quasar/cli: 2.3.0
    node-gyp-build: 4.6.1
    node-gyp: 9.4.0
    npm: 10.3.0
    pnpm: 8.15.1
    simple-git: 3.22.0
    typescript: 5.3.3
    yarn: 1.22.19
```

Describe the bug

After upgrading Amplify from version 4 to version 6, we encounter synchronization issues with DataStore under poor network conditions. This currently affects some of our customers who have very poor internet connections, which is the reason we chose DataStore in the first place. The issue did not occur when we used version 4 a couple of weeks ago.

The problem manifests when creating a local document during network instability (simulated by throttling the network speed and switching to offline mode while the document is being saved in DataStore). At this point, the newly created document is successfully inserted into IndexedDB.

The issue arises when a DataStore error is caught by the "errorHandler" in the DataStoreConfig: { operation: "Create", process: "mutate", remoteModel: null } Cause: { name: "GraphQLError", message: "Network error" }
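For triage, the handler can separate this particular failure from other sync errors. The following is a minimal sketch, assuming the error object carries the fields quoted above (`operation`, `process`, `remoteModel`, and a `cause`). The `SyncErrorLike` type and the `isDroppedMutation` helper are hypothetical names introduced for illustration and are not part of Amplify.

```ts
// Hypothetical shape mirroring only the fields quoted in this report.
interface SyncErrorLike {
  operation: string;
  process: string;
  remoteModel: unknown;
  cause?: { name?: string; message?: string };
}

// True when a local mutation failed because of a network error,
// which is the case described in this issue.
function isDroppedMutation(err: SyncErrorLike): boolean {
  return (
    err.process === "mutate" &&
    err.remoteModel === null &&
    err.cause?.message === "Network error"
  );
}

// The error object as reported above.
const reported: SyncErrorLike = {
  operation: "Create",
  process: "mutate",
  remoteModel: null,
  cause: { name: "GraphQLError", message: "Network error" },
};

console.log(isDroppedMutation(reported)); // true
```

Calling a helper like this from the "errorHandler" would make it possible to log the affected documents separately while the root cause is investigated.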

Despite successful local document creation, DataStore fails to synchronize these documents once the network stabilizes, or when calling the DataStore methods stop/start. The "OutboxStatusEvent" contains isEmpty: true, and the document never appears in the outboxMutationEvent event data.
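To confirm whether a saved document ever reaches the outbox, the events on the `datastore` Hub channel can be recorded and checked per document id. The sketch below rests on assumptions: the event names `outboxMutationEnqueued`, `outboxMutationProcessed`, and `outboxStatus`, and the payload shape with `data.element.id` and `data.isEmpty`, reflect my understanding of the DataStore Hub events. `OutboxTracker` is a hypothetical helper written for illustration.

```ts
// Assumed payload shape for events on the "datastore" Hub channel.
interface DataStoreHubPayload {
  event: string;
  data?: { element?: { id?: string }; isEmpty?: boolean };
}

class OutboxTracker {
  private enqueued = new Set<string>();
  private processed = new Set<string>();
  private outboxEmpty: boolean | undefined = undefined;

  // Feed every payload received from the Hub listener into this method.
  record(payload: DataStoreHubPayload): void {
    const id = payload.data?.element?.id;
    if (payload.event === "outboxMutationEnqueued" && id) this.enqueued.add(id);
    if (payload.event === "outboxMutationProcessed" && id) this.processed.add(id);
    if (payload.event === "outboxStatus") this.outboxEmpty = payload.data?.isEmpty;
  }

  // "missing": never enqueued, "pending": enqueued but not yet processed,
  // "synced": processed.
  status(id: string): "missing" | "pending" | "synced" {
    if (this.processed.has(id)) return "synced";
    return this.enqueued.has(id) ? "pending" : "missing";
  }

  // Last reported outbox state, or undefined if no status event was seen.
  isOutboxEmpty(): boolean | undefined {
    return this.outboxEmpty;
  }
}

// Simulated events; in an app these would come from the Hub listener.
const tracker = new OutboxTracker();
tracker.record({ event: "outboxStatus", data: { isEmpty: true } });
tracker.record({ event: "outboxMutationEnqueued", data: { element: { id: "doc-1" } } });
```

In an app, the tracker would be wired up with something like `Hub.listen("datastore", ({ payload }) => tracker.record(payload))`, using `Hub` from `aws-amplify/utils`. In the failing case described here, the status of the saved document would stay "missing" while the outbox reports as empty.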

I plan to conduct further testing later to determine if this issue also occurs with update mutations. Additionally, I'll see if updating the created document, which fails to sync, yields any result.

EDIT: The same error occurs when updating a document { operation: "Update", process: "mutate", remoteModel: null }, with the same cause.

Possibly related issues: #4131 #13035

Expected behavior

DataStore/IndexedDB documents that are successfully saved with _version: undefined should synchronize once the internet connection is good again, and should appear in the outboxMutationEvent event data.

Reproduction steps

  1. Simulate poor internet connection. Throttle your internet speed.
  2. Use DataStore to create/update a document.
  3. Right after triggering a DataStore document creation/update, disconnect the internet connection.
  4. A "Network error" should occur.
  5. After reconnecting to a stable network, verify if the saved document is picked up by DataStore's sync outbox.

EDIT: To make Step 3 easier to reproduce, extend the error window by throttling the CPU speed (for example, a 6x slowdown).

The attached screenshot of the warning shows the error that would be caught by the "errorHandler" in the DataStoreConfig.

Code Snippet

```
const config: DataStoreConfig = {
  syncPageSize: 4_000,
  maxRecordsToSync: 100_000, //big num
  fullSyncInterval: 60 * 24, // minutes
  errorHandler: syncErrorsHandler,
  syncExpressions: getSyncExpressions(),
  conflictHandler: syncConflictHandler,
};

async function create<T extends Model>(
  model: ModelConstructor<T>,
  attrs: Omit<ModelCreateInput<T>, ModelMetaKey<T> | ModelAuthKey<T>> & {
    id?: string;
    userId?: string;
    organisationId?: string;
  }
) {
  return DataStore.save(
    new model({
      id: attrs?.id ?? uuid(),
      organisationId: attrs?.organisationId ?? getOrganisationId(),
      userId: attrs?.userId ?? getUserId(),
      ...attrs,
    })
  );
}
```
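One detail in `create` that may be worth ruling out, although it may well be unrelated to the sync problem: because `...attrs` is spread after the defaults, a caller that passes a key explicitly set to `undefined` overwrites the default value. The sketch below shows this JavaScript behavior with plain objects. The `Attrs` type, the two helper functions, and the literal default values are hypothetical stand-ins for the helpers in the snippet.

```ts
interface Attrs {
  id?: string;
  userId?: string;
  title?: string;
}

// Same ordering as in the snippet: defaults first, then the spread.
function defaultsThenSpread(attrs: Attrs) {
  return {
    id: attrs.id ?? "generated-id",
    userId: attrs.userId ?? "current-user",
    ...attrs,
  };
}

// Reversed ordering: spread first, then defaults, so the defaults always apply.
function spreadThenDefaults(attrs: Attrs) {
  return {
    ...attrs,
    id: attrs.id ?? "generated-id",
    userId: attrs.userId ?? "current-user",
  };
}

// The key "id" is present but explicitly undefined.
const input: Attrs = { id: undefined, title: "Report" };

console.log(defaultsThenSpread(input).id); // undefined
console.log(spreadThenDefaults(input).id); // "generated-id"
```

If callers never pass explicit `undefined` values, both orderings behave the same.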

Log output

``` // Put your logs below this line ```

aws-exports.js


```
const awsmobile = {
    "aws_project_region": "xx-xx-xx",
    "aws_appsync_graphqlEndpoint": "https://xxxxxxxxxxxxx.amazonaws.com/graphql",
    "aws_appsync_region": "xx-xx-xx",
    "aws_appsync_authenticationType": "AMAZON_COGNITO_USER_POOLS",
    "aws_appsync_apiKey": "xx-xxxxxxxxxx",
    "aws_cognito_identity_pool_id": "xx-xx-xx:xxxxxxxxxxxxxxxxxxxx",
    "aws_cognito_region": "xx-xx-xx",
    "aws_user_pools_id": "xx-xx-xx_xxxxxx",
    "aws_user_pools_web_client_id": "xxxxxxxxxxxxxx",
    "oauth": {},
    "aws_cognito_username_attributes": [
        "EMAIL"
    ],
    "aws_cognito_social_providers": [],
    "aws_cognito_signup_attributes": [
        "EMAIL"
    ],
    "aws_cognito_mfa_configuration": "OFF",
    "aws_cognito_mfa_types": [
        "SMS"
    ],
    "aws_cognito_password_protection_settings": {
        "passwordPolicyMinLength": 8,
        "passwordPolicyCharacters": []
    },
    "aws_cognito_verification_mechanisms": [
        "EMAIL"
    ],
    "aws_user_files_s3_bucket": "xxxxxxxxxxxxxxxxxxxxxxx",
    "aws_user_files_s3_bucket_region": "xx-xx-xx"
};
```

Manual configuration

No response

Additional configuration

No response

Mobile Device

No response

Mobile Operating System

No response

Mobile Browser

No response

Mobile Browser Version

No response

Additional information and screenshots

![image](https://github.com/aws-amplify/amplify-js/assets/76615135/c5b3c4b3-4406-48e8-8bc8-69cae23a03c4)
![image](https://github.com/aws-amplify/amplify-js/assets/76615135/8d5fa520-a04e-440f-a022-a4e5b52c4b6f)

EDIT:
![image](https://github.com/aws-amplify/amplify-js/assets/76615135/a788dbce-9a04-41f9-82ea-9c5ee996381f)

schutzelaars commented 5 months ago

@chrisbonifacio @cwomack, I am tagging you as discussed in yesterday's 'Office Hours'

cwomack commented 5 months ago

@schutzelaars, appreciate you creating this issue from the Discord Office Hours discussion! Just to clarify one detail, how much do you throttle the network (i.e. slow/fast 3G or another speed) before you experience this issue? And is it reproducible every time or somewhat inconsistent? Finally, any frontend code that seems to be associated with this would be helpful to see if you can share it. Thanks!

schutzelaars commented 5 months ago

@cwomack Yesterday, I reproduced the issue 5 out of 5 times using slow 3G speed. The code is fairly standard. I've updated the issue with some code examples, though I'm not sure if they will be helpful. The only changes to this code were syntax updates when we migrated from Amplify v4 to v6.

schutzelaars commented 5 months ago

@cwomack Can I do anything to speed things up? Alternatively, could you direct me towards potential solutions or methods to more accurately identify the problem's cause? It's currently affecting a number of our customers.

chrisbonifacio commented 5 months ago

Hi @schutzelaars Thank you for providing repro steps. We will try to reproduce the issue internally and get back to you with an update once we identify the root cause and/or have a solution.

schutzelaars commented 5 months ago

@chrisbonifacio Today, I was able to reproduce it with an "update" mutation; the same problem/error occurs.

schutzelaars commented 5 months ago

@chrisbonifacio @cwomack I forgot to mention that you can extend the error window by throttling the CPU speed (for example, a 6x slowdown). This makes it easier to reproduce Step 3. Hope this helps!

schutzelaars commented 4 months ago

Has anyone been able to reproduce the issue?