aws-amplify / amplify-js

A declarative JavaScript library for application development using cloud services.
https://docs.amplify.aws/lib/q/platform/js
Apache License 2.0
9.42k stars 2.12k forks source link

DataStore sync processor rety error - about every 24 hours #12954

Open alex-breen opened 7 months ago

alex-breen commented 7 months ago

Before opening, please confirm:

JavaScript Framework

React

Amplify APIs

Authentication, DataStore

Amplify Version

v6

Amplify Categories

auth, api

Backend

None

Environment information

``` # Put output below this line System: OS: Windows 11 10.0.22621 CPU: (6) x64 Intel(R) Core(TM) i5-9600K CPU @ 3.70GHz Memory: 2.05 GB / 15.94 GB Binaries: Node: 18.18.0 - C:\Program Files\nodejs\node.EXE npm: 10.1.0 - C:\Program Files\nodejs\npm.CMD Browsers: Chrome: 121.0.6167.140 Edge: Chromium (121.0.2277.98) Internet Explorer: 11.0.22621.1 npmPackages: @aws-amplify/ui-react: ^6.1.2 => 6.1.2 @aws-amplify/ui-react-internal: undefined () @aws-amplify/ui-react-storage: ^3.0.11 => 3.0.11 @babel/cli: ^7.21.0 => 7.23.4 @babel/core: ^7.21.4 => 7.23.7 @babel/plugin-proposal-private-property-in-object: ^7.21.11 => 7.21.11 (7.21.0-placeholder-for-preset-env.2) @babel/preset-env: ^7.21.4 => 7.23.8 @emotion/react: ^11.10.6 => 11.11.3 @emotion/styled: ^11.10.6 => 11.11.0 @hello-pangea/dnd: ^16.5.0 => 16.5.0 @mui/base: ^5.0.0-alpha.124 => 5.0.0-beta.31 @mui/icons-material: ^5.11.16 => 5.15.4 @mui/material: ^5.11.16 => 5.15.4 @prezly/slate-lists: ^0.101.0 => 0.101.0 @testing-library/jest-dom: ^5.16.5 => 5.17.0 @testing-library/react: ^13.4.0 => 13.4.0 @testing-library/user-event: ^13.5.0 => 13.5.0 @types/react: ^18.2.47 => 18.2.47 @types/react-beautiful-dnd: ^13.1.8 => 13.1.8 @types/react-dom: ^18.2.17 => 18.2.18 @xenova/transformers: ./private_modules/@xenova/transformers => 2.6.0 aws-amplify: ^6.0.13 => 6.0.13 aws-amplify/adapter-core: undefined () aws-amplify/analytics: undefined () aws-amplify/analytics/kinesis: undefined () aws-amplify/analytics/kinesis-firehose: undefined () aws-amplify/analytics/personalize: undefined () aws-amplify/analytics/pinpoint: undefined () aws-amplify/api: undefined () aws-amplify/api/server: undefined () aws-amplify/auth: undefined () aws-amplify/auth/cognito: undefined () aws-amplify/auth/cognito/server: undefined () aws-amplify/auth/enable-oauth-listener: undefined () aws-amplify/auth/server: undefined () aws-amplify/datastore: undefined () aws-amplify/in-app-messaging: undefined () aws-amplify/in-app-messaging/pinpoint: undefined () aws-amplify/push-notifications: undefined () aws-amplify/push-notifications/pinpoint: undefined () aws-amplify/storage: undefined () aws-amplify/storage/s3: undefined () aws-amplify/storage/s3/server: undefined () aws-amplify/storage/server: undefined () aws-amplify/utils: undefined () compromise: ^14.10.0 => 14.11.0 eslint: ^8.55.0 => 8.56.0 eslint-plugin-react: ^7.33.2 => 7.33.2 eslint-plugin-react-hooks: ^4.6.0 => 4.6.0 eslint-plugin-react-refresh: ^0.4.5 => 0.4.5 is-hotkey: ^0.2.0 => 0.2.0 lodash: ^4.17.21 => 4.17.21 nlcst-to-string: ^3.1.1 => 3.1.1 node-window-polyfill: ^1.0.2 => 1.0.2 react: ^18.2.0 => 18.2.0 react-avatar-editor: ^13.0.0 => 13.0.2 react-color: ^2.19.3 => 2.19.3 react-dom: ^18.2.0 => 18.2.0 react-dropzone: ^14.2.3 => 14.2.3 react-scripts: 5.0.1 => 5.0.1 retext: ^8.1.0 => 8.1.0 retext-keywords: ^7.2.0 => 7.2.0 retext-pos: ^4.1.0 => 4.1.0 slate-history: ^0.100.0 => 0.100.0 slate-hyperscript: ^0.100.0 => 0.100.0 slate-react: ^0.101.5 => 0.101.5 svg-gauge: ^1.0.7 => 1.0.7 uuid: ^9.0.0 => 9.0.1 (8.3.2) web-vitals: ^2.1.4 => 2.1.4 npmGlobalPackages: @forge/cli: 4.1.1 atlas-connect: 0.8.1 firebase-tools: 10.7.0 mongodb-realm-cli: 2.4.2 npm-check-updates: 12.5.9 npm: 10.1.0 serve: 14.2.1 ```

Describe the bug

When waking up computer from sleep state, website throws a series of errors (uncaught, over website screen).

subscription.ts:84 Uncaught (in promise) TypeError: Cannot read properties of undefined (reading 'cognito:username')
    at subscription.ts:84:1
console.js:213 [ERROR] 33:31.645 DataStore - Sync processor retry error: Error: No current user
    at InternalGraphQLAPIClass._headerBasedAuth (InternalGraphQLAPI.ts:68:1)
    at async InternalGraphQLAPIClass._graphql (InternalGraphQLAPI.ts:168:1)
    at async sync.ts:119:1
    at async retry.ts:33:1

Everything resolves by itself but the user sees the browser error modal and has to close it. I don't have a clear way to reproduce this, but my hunch is that it occurs after the 24 hour "base query" sync from Datastore. I could be seeing a pattern that is not real, but it does seem to occur about once a day, after there has been some idle time (like computer at sleep). I pretty sure this started appearing after I upgraded to v6.

Here is the screenshot of from the browser's console logs. image

Here is the screen shot from the browser's network logs: image

Browser window screenshot with errors: image

Expected behavior

no errors

Reproduction steps

Wake up computer from sleep.

Code Snippet

// Put your code below this line.

Log output

``` // Put your logs below this line ```

aws-exports.js

No response

Manual configuration

No response

Additional configuration

No response

Mobile Device

No response

Mobile Operating System

No response

Mobile Browser

No response

Mobile Browser Version

No response

Additional information and screenshots

No response

cwomack commented 7 months ago

Hello, @alex-breen 👋 and sorry to hear you're running into this. It sounds like this could possibly be related to when DataStore is trying to re-establish the connection, but the tokens for a given user are potentially expired. Can you look into what you've configured for the expiration timeframe for your refresh and access tokens to? Does this happen in a production version of the app as well or just locally when in dev mode?

alex-breen commented 7 months ago

Hi @cwomack -

My cognito settings are:

Refresh token expiration: 30 day(s) Access token expiration: 1 hour(s) ID token expiration: 1 hour(s)

Does this happen in prod? Yes, I'm pretty sure at least. I'm sure I've seen the browser log errors, and pretty sure I've seen the red error overlay over the screen, but I'll watch for that to be sure.

<<Update: Happens in prod, but overlay with red messages doesn't appear on browser in prod, only console errors. In dev, overlay with red errors appears.>>

alex-breen commented 7 months ago

Update: On production, I've encountered the issue twice (about every 24 hours). The console logs show the errors:

Uncaught (in promise) TypeError: Cannot read properties of undefined (reading 'cognito:username')
ConsoleLogger.ts:85 [ERROR] 00:45.178 DataStore - Sync processor retry error: Error: No current user

But the browser screen doesn't get the overlay with red error messages. On the dev build, it does.

alex-breen commented 7 months ago

Another update (reproduction is slow because it seems to occur after several hours of sleep or inactivity):

@cwomack is there something I can do to catch this when it happens and restart Datastore?

alex-breen commented 7 months ago

I have Hub logs with today's failed sync. I hope this is helpful. .Steps to repro: wake up from sleep. What I see is that the Hub says the model is sync'd. However, it is not processing inbound updates from the backend. However, it is able to process outbound mutations.

image

alex-breen commented 7 months ago

Do you recommend I call Datastore.start() after the Hub event "ready"? My aim is to reset the sync process for cases in which it thinks it is syncing, but it is not. Will this kickstart it up again?

Note, I can't call Datastore.stop() and then Datastore.start() as that often results in an uncaught error because there are other observe queries in flight.

svidgen commented 6 months ago

@alex-breen As a potential path to a workaround while the repro of this is pending, do any of the auth events indicate the need for a Datastore restart?

https://docs.amplify.aws/react/build-a-backend/auth/auth-events/#listen-to-and-log-auth-events

alex-breen commented 6 months ago

@svidgen - I don't think so, as no error events are passed. Datastore events indicate everything is working (syncQueriesReady) and Auth is fine.

alex-breen commented 5 months ago

@cwomack - I've tried cycling a stop and start of Datastore every time there is a Hub networkStatus event. But the sync doesn't get restablished. The websocket connection (eg: wss://abc.appsync-realtime-api.us-east-1.amazonaws.com/graphql?header...) stays unresponsive. The Hub events imply success (proceeds to ready) and outbound mutations work, but inbound sync updates are not received.

alex-breen commented 5 months ago

To reproduce (artificially):

chrisbonifacio commented 5 months ago

Hi @alex-breen, thank you for the repro steps. Unfortunately, I have not been able to reproduce this issue with the provided steps.

These were the events logged from the Hub during my reproduction. In my case I might've run into a different issue - I never got another ready event after the first one.

Screenshot 2024-04-19 at 10 43 43 AM

I also observed that updating a record from another browser instance was reflected.

The highlighted record was updated from a second, incognito window. image

I am only able to get the Sync process retry error if I force it by attempting to start the sync process with an invalid access token (calling DataStore.start())

image

Are you sure that those steps are in the correct order?

alex-breen commented 5 months ago

Hi @chrisbonifacio - I've jotted down a few things that might help shed some light:

alex-breen commented 3 months ago

@chrisbonifacio - by any chance is there an update on this?

Even with my hacky workaround, I've seen Safari fail (silently), leading to data loss on an add request (amplify js doesn't throw an error, but sync fails to write, but app doesn't know this, so data is lost). At the very least, just a way to listen for when sync is broken so that I can force a reload would help.

alex-breen commented 2 months ago

@chrisbonifacio @cwomack - I've reproduced this with a very simple app that mostly uses the example code in Amplify documentation.

Behavior: Wake up computer after sleep. The last refresh of the webpage should be over 12 hours ago. Update data from a separate computer (or incognito browser). Result - The update does not appear on the primary browser. Data is now out of sync. Hub events report "syncQueriesReady" but websocket messages are not received.

I've attached screenshots of the console and network errors that appear right after wake up. I've also attached the relevant code. It's important that authentication is enabled and model schema has @auth(rules: [{allow: owner}]). I tested this running in build/vite-preview mode, locally (but I've reproduced it many, many times in production and in dev mode.) I don't have a direct way to trigger the error, other than load the page in the morning, put your computer to sleep at night, and wake up your computer the next morning. Also, if you refresh the page, the updated data syncs as the websocket is restored. What makes this issue particularly difficult is the Hub events indicate everything is working and sync'd, when it is not..

image image package.json schema.txt appDS.txt