aws-amplify / amplify-js

A declarative JavaScript library for application development using cloud services.
https://docs.amplify.aws/lib/q/platform/js
Apache License 2.0

Extremely Slow Performance on DataStore (>30s sync) #11668

Closed PlusA2M closed 9 months ago

PlusA2M commented 1 year ago

JavaScript Framework

Not applicable

Amplify APIs

DataStore

Amplify Categories

api

Environment information

  System:
    OS: macOS 12.6
    CPU: (8) arm64 Apple M1
    Memory: 142.64 MB / 8.00 GB
    Shell: 5.8.1 - /bin/zsh
  Binaries:
    Node: 16.17.1 - ~/.nvm/versions/node/v16.17.1/bin/node
    Yarn: 1.22.19 - ~/.nvm/versions/node/v16.17.1/bin/yarn
    npm: 8.19.2 - ~/.nvm/versions/node/v16.17.1/bin/npm
  Browsers:
    Chrome: 115.0.5790.102
    Safari: 16.0
  npmPackages:
    @macfja/svelte-persistent-store: ^2.3.0 => 2.3.0
    @sveltejs/adapter-auto: ^2.1.0 => 2.1.0
    @sveltejs/kit: ^1.20.2 => 1.21.0 (1.8.3)
    @types/dom-screen-wake-lock: ^1.0.1 => 1.0.1
    @types/file-saver: ^2.0.5 => 2.0.5
    @types/google-libphonenumber: ^7.4.23 => 7.4.23
    @types/gtag.js: ^0.0.12 => 0.0.12
    @typescript-eslint/eslint-plugin: ^5.59.9 => 5.60.1
    @typescript-eslint/parser: ^5.59.9 => 5.60.1
    @vite-pwa/sveltekit: ^0.2.5 => 0.2.5
    autoprefixer: ^10.4.14 => 10.4.14
    aws-amplify: ^5.2.5 => 5.3.3
    chart.js: ^4.3.0 => 4.3.0
    chart.js-auto: undefined ()
    chart.js-helpers: undefined ()
    clipboard: ^2.0.11 => 2.0.11
    compressorjs: ^1.2.1 => 1.2.1
    dayjs: ^1.11.8 => 1.11.8
    eslint: ^8.42.0 => 8.43.0
    eslint-config-prettier: ^8.8.0 => 8.8.0
    eslint-plugin-svelte3: ^4.0.0 => 4.0.0
    exceljs: ^4.3.0 => 4.3.0
    file-saver: ^2.0.5 => 2.0.5
    google-libphonenumber: ^3.2.32 => 3.2.32
    ics: ^3.1.0 => 3.2.0
    jszip: ^3.10.1 => 3.10.1
    logrocket: ^4.0.3 => 4.0.3
    postcss: ^8.4.24 => 8.4.24
    postcss-load-config: ^4.0.1 => 4.0.1
    prettier: ^2.8.8 => 2.8.8
    prettier-plugin-svelte: ^2.10.1 => 2.10.1
    prettier-plugin-tailwindcss: ^0.3.0 => 0.3.0
    simple-svelte-autocomplete: ^2.5.2 => 2.5.2
    svelte: ^3.59.1 => 3.59.2 (3.55.1)
    svelte-check: ^3.4.3 => 3.4.4
    svelte-preprocess: ^5.0.4 => 5.0.4
    sveltekit-adapter-aws: ^4.5.1 => 4.5.1
    tailwindcss: ^3.3.2 => 3.3.2
    tslib: ^2.5.3 => 2.6.0 (1.14.1)
    typescript: ^5.1.3 => 5.1.6
    vite: ^4.3.9 => 4.3.9 (4.1.4)
    vite-plugin-mkcert: ^1.15.0 => 1.16.0
    workbox-window: ^7.0.0 => 7.0.0
  npmGlobalPackages:
    @aws-amplify/cli: 10.7.1
    aws-cdk: 2.70.0
    corepack: 0.12.1
    npm: 8.19.2
    yarn: 1.22.19

Describe the bug

Hi all! We've been using the amplify-js library for over half a year, and we're experiencing very slow performance with a 5-model schema. We already use selective sync to optimize performance as needed. The performance issue appears on iOS Safari, iOS PWA, macOS Chrome, and macOS Safari. I suspect it appears on all platforms.

How slow is it? The records below were measured in Chrome on macOS 13.4.1 on a 2019 MacBook Pro.

Duration for syncQueriesStarted: ~10 seconds, for both base and delta sync
Largest model: Booking, ~3,600 records w/ 40 attributes to sync, ~30 seconds on base sync, ~3 seconds on delta sync
Other models: ~100 records, ~3 seconds on base sync, ~1 second on delta sync
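One way to collect per-event timings like these is to timestamp DataStore's Hub events as they arrive. The following is a minimal sketch, not part of the original report: `createSyncTimer` is a hypothetical helper, and the Hub wiring shown in the trailing comment assumes the aws-amplify v5 `Hub.listen('datastore', ...)` channel and that `modelSynced` payloads carry the model under `data.model`.

```typescript
// Hypothetical helper: records how long after start each sync event arrived.
type SyncPayload = { event: string; data?: { model?: { name: string } } };
type SyncMark = { label: string; elapsedMs: number };

function createSyncTimer(now: () => number = () => Date.now()) {
  const startedAt = now();
  const marks: SyncMark[] = [];

  return {
    // Call this with every payload received on the DataStore Hub channel.
    handle(payload: SyncPayload): void {
      const model = payload.data?.model?.name;
      const label = model ? `${payload.event}:${model}` : payload.event;
      marks.push({ label, elapsedMs: now() - startedAt });
    },
    // Returns a copy of the recorded marks, in arrival order.
    report(): SyncMark[] {
      return [...marks];
    },
  };
}

// Possible wiring (assumption, not executed here):
//   import { Hub } from 'aws-amplify';
//   const timer = createSyncTimer();
//   Hub.listen('datastore', (capsule) => timer.handle(capsule.payload));
//   // once the 'ready' event has fired: console.table(timer.report());
```

Comparing the gaps between `syncQueriesStarted`, each `modelSynced`, and `ready` against the network request timeline can help show whether the time goes to the requests themselves or to local processing.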

Level of impact: The Booking model has a heavy impact on user experience, since our app mainly serves people through a booking service.

What's the question? We'd like to know:

  1. Is this expected performance?
  2. If not, how can we improve it? Is there anything we need to change in our DynamoDB schema design?

I've tried implementing sync with only the AppSync API and IndexedDB, and the performance was pretty fast, so I have no idea why DataStore performs like this.

Expected behavior

Much faster base sync/delta sync.

Reproduction steps

  1. Have 5 models schema
  2. One of them with 40 attributes, > 8,000 records
  3. Others within ~100 records

svidgen commented 1 year ago

@PlusA2M Some more information would be helpful in trying to understand what's happening here:

  1. The full schema
  2. How many records in each table? (Including _deleted: true records)
  3. How many requests and records do you see come over the wire for each table? (Including deleted records)
  4. What's the latency like for each of those requests?
  5. What do your Amplify/DataStore configuration steps look like? (Including sync expressions)

If you'd prefer, you can share the schema privately by running this and posting the returned identifier here:

amplify diagnose --send-report

"I've tried implementing sync with only the AppSync API and IndexedDB, and the performance was pretty fast, so I have no idea why DataStore performs like this."

Are you making exactly the same calls over the network as you see DataStore make? If not, can you show what you did here? Was there any notable latency difference in the network requests?

PlusA2M commented 1 year ago

Hi @svidgen !

  1. Please check the project identifier: 42790c84c91edfe8ca7ee655c9913e2a
  2. Largest model: Booking ~8,000 records w/ 40 attributes, Other models: ~100 records
  3. The requests are performed normally
  4. Same as above. A full sync might need 6~7 requests, and all responses are very fast, ~500ms
  5. Please see below:
    // IDs of the merchants whose bookings should be synced (list elided).
    const ids = ['ID1', 'ID2', 'ID3' /* , ... */]
    // Placeholders: the actual values were not given in the original post.
    declare const from: number // Unix timestamp of the start of this year
    declare const to: number   // Unix timestamp of later this year

    // Sync only bookings in the date range that belong to one of the listed merchants.
    function getBookingSyncExpression() {
        return syncExpression(Booking, (b) => {
            return b?.and((b2) => [
                b2.startsAt.between(from, to),
                b2.or((b3) => ids.map((id) => b3.merchantID.eq(id)))
            ])
        })
    }

    const syncExpressions = [
        getBookingSyncExpression(),
        // Presumably intended to match no records, so Delivery is not synced.
        syncExpression(Delivery, (d) => d?.id.eq('NEVER'))
    ]

    if (!isConfigured) {
        Amplify.configure({
            ...awsconfig,
            DataStore: {
                authModeStrategyType: AuthModeStrategyType.MULTI_AUTH,
                syncExpressions,
                errorHandler: function (error: unknown) {
                    console.error("Unrecoverable error", error)
                }
            }
        })
        Auth.configure({
            ssr: true,
            oauth: {
                domain: '~~~.amazoncognito.com',
                redirectSignIn: url.origin + '~~~',
                redirectSignOut: url.origin + '~~~',
                scope: ['openid', 'aws.cognito.signin.user.admin'],
                responseType: 'code'
            }
        })
        isConfigured = true
    }

svidgen commented 1 year ago

Can you clarify points 3 and 4? Are you seeing ~6 requests per model? Or in total? Are you seeing long delays between requests? (Does it seem like DataStore is taking too long processing a result set before fetching the next page?)

PlusA2M commented 1 year ago

The requests seem to be normal and quick. You can check the screenshots below. Thanks for your help! (two screenshots attached)

svidgen commented 1 year ago

I did a little testing myself, and these numbers don't seem too out of the ordinary for 8k records that are this wide. If you're able to add selective sync expressions to limit what you pull onto device, that could help. I'm curious about this statement though:

"I've tried implementing sync with only the AppSync API and IndexedDB, and the performance was pretty fast, so I have no idea why DataStore performs like this."

I must be overlooking something simple at the moment. But, since a good portion of the time seems to be spent on the actual network requests, in a combination of pulling the data and waiting for transfer, I'm pretty interested to see where you saved time doing this outside DataStore. Can you show me?

PlusA2M commented 1 year ago

Hi! The AppSync sync queries finish fairly fast, which is acceptable compared to the DataStore processing duration, but DataStore only triggers the 'ready' event after ~15s or more. So I think those ~15s are spent by DataStore storing data in IndexedDB. I've tried saving all the records under one key in IndexedDB. I know that might give up the benefits of a document-based DB, but it is way faster in my case... So I suspect that either syncing all the indexes of all records for all models is making it slow, or saving the records as key-value pairs one by one is making it slow...?

Also, please let me know if those AppSync sync queries perform as they should...? (tested from Hong Kong) Is there anything I can do to optimize them as well?

Thanks again for your help!

svidgen commented 1 year ago

There will indeed be some overhead involved with constructing, validating, and storing the records in IndexedDB. However, based on the request timeline you shared, the bulk of that 15s seems to be network + query time. I.e., each of the requests you showed us takes around 1 to 2 seconds. And, since the requests for the large table are necessarily executed sequentially, the absolute lower limit for sync time on that one table alone will be "1 to 2 seconds" times "the number of pages" needed to sync the table.

In your case, fully syncing 8k records in pages of 1k at a time will take 8 to 16 seconds in query time alone. This would be in addition to the sync time on your other tables.
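The lower bound described above can be worked through as a small calculation. This is a minimal sketch: the function name `estimateSyncQueryTime` is hypothetical, and it assumes requests run strictly one after another and that every page returns a full `pageSize` of records (in practice, response size limits may produce more, smaller pages).

```typescript
// Lower-bound estimate for sequential, paginated sync of a single table.
function estimateSyncQueryTime(
  records: number,
  pageSize: number,
  minSecondsPerRequest: number,
  maxSecondsPerRequest: number,
): { pages: number; minSeconds: number; maxSeconds: number } {
  // Pages are fetched one after another, so the per-request time adds up.
  const pages = Math.ceil(records / pageSize);
  return {
    pages,
    minSeconds: pages * minSecondsPerRequest,
    maxSeconds: pages * maxSecondsPerRequest,
  };
}

// 8k records, 1k per page, 1 to 2 seconds per request:
const estimate = estimateSyncQueryTime(8000, 1000, 1, 2);
console.log(estimate); // { pages: 8, minSeconds: 8, maxSeconds: 16 }
```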

I'm interested to know if you're able to pull all 8k records from that large table faster than this. If so, can you share details? The specific graphql requests, response timings, invoking code, and anything else you think is relevant.

A few things to explore in the meantime:

Selective sync

If your user doesn't need all 8k records from that table during a typical session, only sync the records down that are needed. E.g.,

import { DataStore, syncExpression } from 'aws-amplify';

// Perhaps some models are only of interest if they explicitly
// designate a topic the user cares about.
const TOPICS = ['cars', 'coffee', 'camping' /* , ... */];

// Perhaps other models are only relevant if they've been created
// or updated "recently". Returned as an ISO 8601 string so it can be
// compared against the string-typed updatedAt field.
const RECENT = (() => {
    const d = new Date();
    d.setDate(d.getDate() - 7);
    return d.toISOString();
})();

// We can plug those conditions into DataStore's configuration to have
// it query for those specific records. This is more efficient and cost
// effective if we can leverage an indexed field; but it's still advantageous
// if we can just limit the number of records sent over the wire.
DataStore.configure({
  syncExpressions: [
    syncExpression(SomeModel, () => {
      return m => m.or(m => TOPICS.map(topic => m.topics.contains(topic)));
    }),
    syncExpression(OtherModel, () => {
      return m => m.updatedAt.gt(RECENT);
    })
  ]
});

See: https://docs.amplify.aws/lib/datastore/sync/q/platform/js/#selectively-syncing-a-subset-of-your-data

Sync page size

It's worth noting you can also configure the number of records returned per page.

DataStore.configure({
  syncPageSize: 2000    // default is 1k.
});

Given that some of your requests are already at the 1MB mark, this seems unlikely to help in this case. But, it's worth keeping in mind if you find your "narrower" tables become sluggish.

If you're dealing with high network latencies, larger pages may sync more quickly overall. If you tinker with this, just beware of hitting the

observeQuery

For applications with longer sync times, it can be beneficial to render results as they arrive over the wire.

const sub = DataStore
  .observeQuery(SomeModel, m => m.field.eq('value'))
  .subscribe(snapshot => {
    const { items, isSynced } = snapshot;

    // update the UI to show the items
    setResults(items);

    // and maybe an indicator to let your users know if
    // all the results are in yet.
    setIsFullySynced(isSynced);
  });

See: https://docs.amplify.aws/lib/datastore/real-time/q/platform/js/#observe-query-results-in-real-time

PlusA2M commented 1 year ago

Thanks for your help again! @svidgen

The items mentioned above have already been applied in our system, so we keep syncing only the most recent records. However, what I also see is that, whether it's a base sync or a delta sync, the total duration is around ~30s, including requests and DataStore processing. Since we're developing a PWA using amplify-js, we don't think this is acceptable performance for an app (Instagram, X, GitHub, etc. load almost instantly, no matter base or delta sync).

Besides, we're also facing other issues. DataStore sometimes only processes the outbox while the data itself doesn't sync at all when the user sleeps/wakes the PWA (caused by switching between apps). DataStore sometimes breaks and just doesn't sync, while no error is caught in either the global error handler or the error handler in observeQuery, etc. Those issues make it hard for us to develop a PWA with a normal experience, even compared to a normal website with SSR calling GraphQL queries and mutations.

All of the issues mentioned make our PWA's data inconsistent (DataStore sometimes breaks partially or entirely), its performance extremely poor (acceptable for base sync but not for delta sync), and its error handling unreliable in some cases (we observe this behavior via services like LogRocket). Those issues just make our app seem stupid...😨

Any further suggestions are very welcome; we're eager to try them out. In the meantime, we might need to move on to other approaches for our project. 🥲

chrisbonifacio commented 11 months ago

Hi @PlusA2M 👋 Unfortunately, I don't have a solution for the sync performance issue at this time. As @svidgen mentioned in his comments, the performance doesn't seem abnormal but there are several tools to reduce the amount of time it takes to sync data. Naturally, network latency will cause it to vary, but generally the most consistent approach is to only sync the data the user needs.

NOTE

It is also entirely possible that DataStore may not be the right solution for your application's requirements. Unless offline capability is specifically one of those requirements, we recommend migrating to the Amplify API library instead for the best balance of performance and reliability.
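For anyone weighing that migration, fetching a full table through the API library generally means following `nextToken` pagination by hand. The following is a minimal sketch under stated assumptions: `fetchAllPages` and `Page` are hypothetical names, the page fetcher is injected so the loop stays independent of any library, and the wiring in the trailing comment assumes aws-amplify v5's `API.graphql` together with a generated `listBookings` query that is not shown in this thread.

```typescript
// Shape of one page of results; mirrors the usual AppSync list/sync response.
type Page<T> = { items: T[]; nextToken?: string | null };

// Follows nextToken until the backend reports no further pages.
async function fetchAllPages<T>(
  fetchPage: (nextToken?: string) => Promise<Page<T>>,
  maxPages = 100, // safety cap so a misbehaving token cannot loop forever
): Promise<T[]> {
  const all: T[] = [];
  let token: string | undefined;
  for (let i = 0; i < maxPages; i++) {
    const page = await fetchPage(token);
    all.push(...page.items);
    if (!page.nextToken) break;
    token = page.nextToken;
  }
  return all;
}

// Possible wiring (assumption, not executed here):
//   import { API } from 'aws-amplify';
//   import { listBookings } from './graphql/queries';
//   const bookings = await fetchAllPages(async (nextToken) => {
//     const res: any = await API.graphql({
//       query: listBookings,
//       variables: { limit: 1000, nextToken },
//     });
//     return res.data.listBookings;
//   });
```

Note that the pages are still requested one after another, so the per-request latency discussed earlier in the thread applies here as well.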

Besides the tools already mentioned, here are some others available to help mitigate sync times:

Adjusting the Delta Table TTL

Delta tables store changes (or deltas) for a certain period. Over time, as more changes accumulate, the delta table grows in size. By adjusting the TTL, you can ensure that old deltas, which are no longer relevant, get automatically deleted. This keeps the delta table size manageable.

To adjust the TTL for the Base and Delta tables, you can use the Amplify CLI override feature.

For more info: https://docs.amplify.aws/cli/graphql/override/#customize-amplify-generated-resources-for-model-directive

Run amplify override api and this will generate an override.ts file in the amplify/backend/api/<apiName> folder.

In that file, set the value for the Delta and/or Base table TTL:

import { AmplifyApiGraphQlResourceStackTemplate } from "@aws-amplify/cli-extensibility-helper";

export function override(resources: AmplifyApiGraphQlResourceStackTemplate) {
  // Base table TTL for the "Todo" model's data source.
  resources.models["Todo"].modelDatasource.dynamoDbConfig["deltaSyncConfig"][
    "baseTableTtl"
  ] = "43200"; // TTL in minutes (default is 30 days)

  // Delta table TTL: how long change records are kept for delta syncs.
  resources.models["Todo"].modelDatasource.dynamoDbConfig["deltaSyncConfig"][
    "deltaSyncTableTtl"
  ] = "30"; // TTL in minutes
}

Lastly, run amplify push to deploy the changes.
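When picking a Delta table TTL, it can help to reason about which clients would still qualify for a delta sync. The following is a rough model only, based on my understanding that a sync request whose last sync time is older than the Delta table TTL is served from the Base table instead. `expectedSyncSource` is a hypothetical name, and the client-side rules DataStore applies (such as its own full sync interval) are not modeled here.

```typescript
const MINUTE_MS = 60 * 1000;

// Rough model: which table would a sync request be served from?
function expectedSyncSource(
  lastSyncMs: number | undefined, // time of the client's last sync, if any
  nowMs: number,
  deltaSyncTableTtlMinutes: number,
): 'base' | 'delta' {
  // A client that has never synced needs the full data set.
  if (lastSyncMs === undefined) return 'base';
  // Changes older than the TTL are assumed to have expired from the delta table.
  const oldestDeltaMs = nowMs - deltaSyncTableTtlMinutes * MINUTE_MS;
  return lastSyncMs < oldestDeltaMs ? 'base' : 'delta';
}

// With a 30-minute TTL, a client that last synced 20 minutes ago still gets a
// delta sync, while one that last synced 40 minutes ago falls back to base.
const now = Date.now();
console.log(expectedSyncSource(now - 20 * MINUTE_MS, now, 30)); // delta
console.log(expectedSyncSource(now - 40 * MINUTE_MS, now, 30)); // base
```

Under this model, a shorter TTL keeps the delta table small but pushes more returning clients back to a full base sync.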

Owner Auth

If applicable, adding owner auth rules to the models makes DataStore automatically sync down only the data relevant to and owned by the user.

type Profile @model @auth(rules: [{ allow: owner }]) {
   id: ID!
   name: String!
   address: String!
   email: String!
}

For info on how to configure owner auth: https://docs.amplify.aws/cli/graphql/authorization-rules/#per-user--owner-based-data-access

cwomack commented 9 months ago

Closing this issue as we have not heard back from you. If you are still experiencing this, please feel free to reply back and provide any information previously requested and we'd be happy to re-open the issue.

Thank you!