Open guy-a opened 2 years ago
Also, in the Amplify DataStore best practices, under "Sync configuration with base and delta queries" a "delta query" is being mentioned. I could not find any more info regarding that matter. What is the meaning of this "delta query"?
Hi @guy-a 👋 thanks for raising this issue. I will reach out to the data team and see if this is expected behavior and follow up with some feedback.
By "delta query", we mean the graphql query DataStore performs to pull any changes that were missed from a Delta Table since the last sync.
From the docs:
The Sync Engine will run a GraphQL query on first start that hydrates the Storage Engine from the network using a Base Query. This defaults to a limit of 100 items at a time and will paginate through up to 1000 items. It will then store a Last Sync Time and each time the device goes from an offline to online state, it will use this as an argument in a Delta Query. When AppSync receives this Last Sync Time in its argument list it will only returned the changes that have been missed by pulling items in a Delta Table.
Hi @chrisbonifacio, thanks for your reply. I'm looking forward to your feedback.
Which part of my bug is a feature request? Is it not possible for DataStore to use a query? Because that's not what I've read else were i.e.
I've just did some more tests on different regions and the result is the same a query and a scan are being used together.
Btw, I did multiple tests, and read all the relevant issues before opening the bug.
It’s possible that this is partially the result of a regression bug from when you switched to GraphQL Transformer V2. I have an older amplify app in production which is still using the old transformers and it’s doing a query for the base sync. All other issues are the same.
Assuming I want to continue to use DataStore, but without any scans on the DDB, are pipeline resolvers the way to go?
Saw your update to the generated VTL resolvers in the new amplify cli v 10.0.0, but I don’t think it’s good enough for this specific issue. The lastSync in the VTL is indeed one of the problematic areas, but this fix won’t solve the problem IMHO. (Also, hardcoding deltaTTL to 30 minutes might be problematic) In all of my generated VTL resolvers, there are always 2 sync resolvers that are fetching from the DB, 2 for the same model: Query.syncUsers.preAuth.1.req.vtl - will sometimes do a “query” and sometimes a “scan” Query.syncUsers.req.vtl - will always do a “scan”, irrespectively of what happened before it in the pipeline
That’s why I am seeing query+scan or scan+scan.
JavaScript Framework
React
Amplify APIs
DataStore
Amplify Categories
api
Environment information
System: OS: macOS 12.5 Shell: 5.8.1 - /bin/zsh Binaries: Node: 14.19.1 - /usr/local/bin/node npm: 6.14.16 - /usr/local/bin/npm Browsers: Chrome: 103.0.5060.134 Safari: 15.6 npmPackages: @aws-amplify/ui-react: ^3.3.0 => 3.3.0 @aws-amplify/ui-react-internal: undefined () @aws-amplify/ui-react-legacy: undefined () @testing-library/jest-dom: ^5.16.4 => 5.16.4 @testing-library/react: ^13.3.0 => 13.3.0 @testing-library/user-event: ^13.5.0 => 13.5.0 aws-amplify: ^4.3.30 => 4.3.30 react: ^18.2.0 => 18.2.0 react-dom: ^18.2.0 => 18.2.0 react-scripts: 5.0.1 => 5.0.1 web-vitals: ^2.1.4 => 2.1.4 npmGlobalPackages: @aws-amplify/cli: 8.4.0 corepack: 0.10.0 n: 8.2.0
Describe the bug
DataStore uses scans even though syncExpression is defined against the GSI. I have created a new amplify project with minimal model exactly as defined in the docs
SyncExpression which is not against the GSI is doing a scan as expected
SyncExpression against the GSI is doing a scan and a query together. Denying scans on the generated role will eliminate the scans, but no data will arrive to the client on the base scan, and with fullSyncInterval the following error will be emitted in DataStore subscription.ts “APPSYNC_ASSUME_ROLE is not authorized to perform: dynamodb:Scan”
Refreshing the page after 30 minutes, meaning DeltaSyncTTL has passed (default 30 minutes) and DataStore is being reconfigured - results in a scan
This is the syncExpression being used, there are 9 User(s) with lastName of 'smith' in the database.
Too many full re-syncs
fullSyncInterval and DeltaSyncTTL are reloading all of the models with all of their data to the client, using scan&query and only scan respectively. Beside the problem of using scans, I do not need every client to reload all of its data with every fullSyncInterval (default to 24 hours), I intend to set fullSyncInterval to a full year.
Same as above, I do not want a full re-sync of all the data, every time the user refresh the browser and DeltaSyncTTL has passed (default to 30 minutes)
Expected behavior
Reproduction steps
Code Snippet
Log output
No response
aws-exports.js