Closed PlusA2M closed 9 months ago
@PlusA2M Some more information would be helpful in trying to understand what's happening here:
- The full schema
- How many records in each table? (Including _deleted: true records)
- How many requests and records do you see come over the wire for each table? (Including deleted records)
- What's the latency like for each of those requests?
- What do your Amplify/DataStore configure steps look like? (Including sync expressions)
If you'd prefer, you can share the schema privately by running this and posting the returned identifier here:
amplify diagnose --send-report
I've tried implementing sync with only the AppSync API and IndexedDB, and the performance was pretty fast, so I have no idea why DataStore performs like that.
Are you making exactly the same calls over the network as you see DataStore make? If not, can you show what you did here? Was there any notable latency difference in the network requests?
Hi @svidgen !
const syncExpressions = [
  getBookingSyncExpression(),
  syncExpression(Delivery, (d) => d?.id.eq('NEVER'))
]
if (!isConfigured) {
  Amplify.configure({
    ...awsconfig,
    DataStore: {
      authModeStrategyType: AuthModeStrategyType.MULTI_AUTH,
      syncExpressions,
      errorHandler: function (error: unknown) {
        console.error("Unrecoverable error", error)
      }
    }
  })
  Auth.configure({
    ssr: true,
    oauth: {
      domain: '~~~.amazoncognito.com',
      redirectSignIn: url.origin + '~~~',
      redirectSignOut: url.origin + '~~~',
      scope: ['openid', 'aws.cognito.signin.user.admin'],
      responseType: 'code'
    }
  })
  isConfigured = true
}
const ids = ['ID1', 'ID2', 'ID3', ...]
const from = unix of start of this year
const to = unix of later this year
function getBookingSyncExpression() {
  return syncExpression(Booking, (b) => {
    return b?.and((b2) => [
      b2.startsAt.between(from, to),
      b2.or((b3) => {
        const filter = []
        for (const id of ids) {
          filter.push(b3.merchantID.eq(id))
        }
        return filter
      })
    ])
  })
}
Can you clarify points 3 and 4? Are you seeing ~6 requests per model? Or in total? Are you seeing long delays between requests? (Does it seem like DataStore is taking too long processing a result set before fetching the next page?)
The requests seem to be normal and quick. You can check the screenshots below. Thanks for your help!
I did a little testing myself, and these numbers don't seem too out of the ordinary for 8k records that are this wide. If you're able to add selective sync expressions to limit what you pull onto device, that could help. I'm curious about this statement though:
I've tried implementing sync with only the AppSync API and IndexedDB, and the performance was pretty fast, so I have no idea why DataStore performs like that.
I must be overlooking something simple at the moment. But, since a good portion of the time seems to be spent on the actual network requests, in a combination of pulling the data and waiting for transfer, I'm pretty interested to see where you saved time doing this outside DataStore. Can you show me?
Hi! The AppSync sync queries finish fairly fast, which is acceptable (compared to the DataStore processing duration), but DataStore only triggers the 'ready' event after ~15s or more, so I think those ~15s are taken by DataStore storing data to IndexedDB. I've tried saving all the records under one key in IndexedDB. I know that might give up the performance benefits of a document-based DB, but it is way faster in my case... Therefore I suspect that either syncing all the indexes of all records for every schema is making it slow, or saving the records as key-value pairs one by one is making it slow...?
Also, please let me know whether those AppSync sync queries performed as they should...? (tested from Hong Kong) Is there anything I can do to optimize them as well?
Much appreciated for your help again!
There will indeed be some overhead involved with constructing, validating, and storing the records in IndexedDB. However, based on the request timeline you shared, the bulk of that 15s seems like network + query time. I.e., each of the requests you showed us is around 1 to 2 seconds. And, since the requests for the large table are necessarily executed sequentially, the absolute lower limit for sync time on that one table alone will be "1 to 2 seconds" times "the number of pages" needed to sync the table.
In your case, fully syncing 8k records in pages of 1k at a time will be 8 to 16 seconds in query time alone. This would be in addition to the sync time on your other tables.
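As a rough sketch of that arithmetic, assuming each page has to wait for the nextToken returned by the previous page: the names Page, fetchAllPages, and minSyncSeconds below are hypothetical and only illustrate the reasoning. They are not part of DataStore's API or its internal implementation.

```typescript
// Hypothetical shape of one page of sync results (an assumption for illustration).
interface Page<T> {
  items: T[];
  nextToken: string | null;
}

// Each request needs the token from the previous response,
// so the pages of a single table are fetched one after another.
async function fetchAllPages<T>(
  fetchPage: (nextToken: string | null) => Promise<Page<T>>
): Promise<T[]> {
  const all: T[] = [];
  let nextToken: string | null = null;
  do {
    const page: Page<T> = await fetchPage(nextToken);
    all.push(...page.items);
    nextToken = page.nextToken;
  } while (nextToken !== null);
  return all;
}

// Lower bound on query time for one table:
// number of pages times the latency of a single request.
function minSyncSeconds(
  records: number,
  pageSize: number,
  secondsPerRequest: number
): number {
  return Math.ceil(records / pageSize) * secondsPerRequest;
}

console.log(minSyncSeconds(8000, 1000, 1)); // 8
console.log(minSyncSeconds(8000, 1000, 2)); // 16
```

Because the loop awaits every page before asking for the next one, the per-request latency adds up linearly with the number of pages, whatever happens locally afterwards.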
I'm interested to know if you're able to pull all 8k records from that large table faster than this. If so, can you share details? The specific graphql requests, response timings, invoking code, and anything else you think is relevant.
A few things to explore in the meantime:
If your user doesn't need all 8k records from that table during a typical session, only sync the records down that are needed. E.g.,
import { DataStore, syncExpression } from 'aws-amplify';
// Assumed path to the generated models; adjust for your project.
import { SomeModel, OtherModel } from './models';

// Perhaps some models are only of interest if they explicitly
// designate a topic the user cares about.
const TOPICS = ['cars', 'coffee', 'camping' /* ... */];

// Perhaps other models are only relevant if they've been created
// or updated "recently".
const RECENT = (() => {
  const d = new Date();
  d.setDate(d.getDate() - 7);
  // updatedAt is stored as an ISO 8601 string, so compare against one.
  return d.toISOString();
})();

// We can plug those conditions into DataStore's configuration to have
// it query for those specific records. This is more efficient and cost
// effective if we can leverage an indexed field; but it's still advantageous
// if we can just limit the number of records sent over the wire.
DataStore.configure({
  syncExpressions: [
    syncExpression(SomeModel, () => {
      return m => m.or(m => TOPICS.map(topic => m.topics.contains(topic)));
    }),
    syncExpression(OtherModel, () => {
      return m => m.updatedAt.gt(RECENT);
    })
  ]
});
See: https://docs.amplify.aws/lib/datastore/sync/q/platform/js/#selectively-syncing-a-subset-of-your-data
It's worth noting you can also configure the number of records returned per page.
DataStore.configure({
  syncPageSize: 2000 // default is 1k.
});
Given that some of your requests are already at the 1MB mark, this seems unlikely to help in this case. But, it's worth keeping in mind if you find your "narrower" tables become sluggish.
If you're dealing with high network latencies, larger pages may sync more quickly over all. If you tinker with this, just beware of hitting the
observeQuery
For applications with longer sync times, it can be beneficial to render results as they arrive over the wire.
const sub = DataStore
  .observeQuery(SomeModel, m => m.field.eq('value'))
  .subscribe(snapshot => {
    const { items, isSynced } = snapshot;
    // update the UI to show the items
    setResults(items);
    // and maybe an indicator to let your users know if
    // all the results are in yet.
    setIsFullySynced(isSynced);
  });
See: https://docs.amplify.aws/lib/datastore/real-time/q/platform/js/#observe-query-results-in-real-time
Thanks for your help again! @svidgen
The suggestions above have already been applied in our system, so we only keep syncing the most recent records. However, what I also see is that, whether it's a base sync or a delta sync, the total duration is around ~30s, including the requests and DataStore processing. Since we're developing a PWA using amplify-js, we don't think this is acceptable performance for an app (Instagram, X, GitHub, etc. load almost instantly, whether on base or delta sync). Besides, we're also facing other issues: DataStore sometimes only processes the outbox while the data itself doesn't sync at all after the user sleeps/wakes the PWA (caused by switching between apps), and DataStore sometimes breaks and just doesn't sync, with no error caught in either the global error handler or the error handler in observeQuery. These issues make it hard for us to build a PWA with a normal experience, even compared to a normal website with SSR calling GraphQL queries and mutations. Altogether, they make our PWA's data inconsistent (DataStore sometimes breaks partially or entirely), performance extremely poor (acceptable for base sync but not for delta sync), and error handling unreliable in some cases (we observe this behavior via services like LogRocket). These issues just make our app seem stupid...😨
Any further suggestions are greatly welcomed, and we're eager to try them out. In the meantime, we might need to move on to other approaches for our project. 🥲
Hi @PlusA2M 👋 Unfortunately, I don't have a solution for the sync performance issue at this time. As @svidgen mentioned in his comments, the performance doesn't seem abnormal but there are several tools to reduce the amount of time it takes to sync data. Naturally, network latency will cause it to vary, but generally the most consistent approach is to only sync the data the user needs.
It is also entirely possible that DataStore may not be the right solution for your application's requirements. Unless one of those requirements is specifically offline capabilities, we recommend migrating to the Amplify API library instead for the best balance of performance and reliability.
Besides the tools already mentioned, here are some others available to help mitigate sync times:
Delta tables store changes (or deltas) for a certain period. Over time, as more changes accumulate, the delta table grows in size. By adjusting the TTL, you can ensure that old deltas, which are no longer relevant, get automatically deleted. This keeps the delta table size manageable.
To adjust the TTL for the Base and Delta tables, you can use the Amplify CLI Override feature
For more info: https://docs.amplify.aws/cli/graphql/override/#customize-amplify-generated-resources-for-model-directive
Run amplify override api and this will generate an override.ts file in the amplify/backend/api/<apiName> folder.
In that file, set the value for the Delta and/or Base table TTL:
import { AmplifyApiGraphQlResourceStackTemplate } from "@aws-amplify/cli-extensibility-helper";

export function override(resources: AmplifyApiGraphQlResourceStackTemplate) {
  // Key casing made consistent with deltaSyncTableTtl below.
  resources.models["Todo"].modelDatasource.dynamoDbConfig["deltaSyncConfig"][
    "baseTableTtl"
  ] = "43200"; // TTL in minutes (default is 30 days)
  resources.models["Todo"].modelDatasource.dynamoDbConfig["deltaSyncConfig"][
    "deltaSyncTableTtl"
  ] = "30"; // TTL in minutes
}
Lastly, run amplify push to deploy the changes.
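For reference, the TTL values in the override are strings holding a number of minutes, so 30 days works out to 30 × 24 × 60 = 43200. A minimal sketch of that conversion follows. daysToTtlMinutes is a hypothetical helper, not something Amplify provides.

```typescript
// Hypothetical helper: convert a retention period in days into the
// minutes-as-string form used for the TTL values in override.ts.
function daysToTtlMinutes(days: number): string {
  return String(Math.round(days * 24 * 60));
}

console.log(daysToTtlMinutes(30)); // "43200"
console.log(daysToTtlMinutes(1));  // "1440"
```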
If applicable, adding owner auth rules to the models makes it so DataStore automatically only syncs down the data relevant to and owned by the user:
type Profile @model @auth(rules: [{ allow: owner }]) {
  id: ID!
  name: String!
  address: String!
  email: String!
}
For info on how to configure owner auth: https://docs.amplify.aws/cli/graphql/authorization-rules/#per-user--owner-based-data-access
Closing this issue as we have not heard back from you. If you are still experiencing this, please feel free to reply back and provide any information previously requested and we'd be happy to re-open the issue.
Thank you!
JavaScript Framework
Not applicable
Amplify APIs
DataStore
Amplify Categories
api
Environment information
Describe the bug
Hi all! We've been using the amplify-js library for over half a year, and we're experiencing very slow performance with a 5-model schema. We already use selective sync to optimize performance as needed. The performance issue appears on iOS Safari, iOS PWA, macOS Chrome, and macOS Safari. I suspect it appears on all platforms.
How slow is it?
The numbers below were measured in Chrome on macOS 13.4.1 on a 2019 MacBook Pro.
- Duration for syncQueriesStarted: ~10 seconds, whether base or delta sync
- Largest model: Booking, ~3,600 records w/ 40 attributes to sync, ~30 seconds on base sync, ~3 seconds on delta sync
- Other models: ~100 records, ~3 seconds on base sync, ~1 second on delta sync
Level of impact
The Booking model has a heavy impact on user experience, since our app mainly serves people with a booking service.
What's the question?
We do want to know:
I've tried implementing sync with only the AppSync API and IndexedDB, and the performance was pretty fast, so I have no idea why DataStore performs like that.
Expected behavior
Much faster base sync/delta sync.