aws-amplify / amplify-js

A declarative JavaScript library for application development using cloud services.
https://docs.amplify.aws/lib/q/platform/js
Apache License 2.0
9.44k stars 2.13k forks source link

User Sessions Discarded from AsyncStorage in Android #13930

Closed brianlenz closed 1 week ago

brianlenz commented 1 month ago

Before opening, please confirm:

JavaScript Framework

React Native

Amplify APIs

Authentication, GraphQL API

Amplify Version

v6

Amplify Categories

auth, api

Backend

Amplify CLI

Environment information

``` # Put output below this line System: OS: macOS 15.0.1 CPU: (10) arm64 Apple M1 Max Memory: 338.55 MB / 64.00 GB Shell: 5.2.37 - /opt/homebrew/bin/bash Binaries: Node: 22.9.0 - /opt/homebrew/bin/node Yarn: 1.22.19 - /opt/homebrew/bin/yarn npm: 10.9.0 - /opt/homebrew/bin/npm pnpm: 7.27.1 - /opt/homebrew/bin/pnpm Watchman: 2024.10.07.00 - /opt/homebrew/bin/watchman Browsers: Brave Browser: 129.1.70.126 Chrome: 129.0.6668.103 Safari: 18.0.1 npmGlobalPackages: @aws-amplify/cli: 12.8.2 gatsby-cli: 5.13.3 npm: 10.9.0 serverless: 3.38.0 ```

Describe the bug

Similar to #13830 (which was for Amplify v5). I had reported what appeared to be the same issue in Amplify v6, but @HuiSF corrected me that it is actually unrelated, so I gathered more details to file this bug.

We are using React Native 0.75.4 with aws-amplify@npm:6.6.2 and @aws-amplify/auth@npm:6.4.2. The issue appears to have started happening since we upgraded from aws-amplify@npm:6.5.0 and @aws-amplify/auth@npm:6.3.13. Note that as part of this upgrade, we also upgraded @react-native-async-storage/async-storage from 1.23.1 to the latest 2.0.0.

The heart of the issue is that users in Android seem to get signed out of the app, typically after about a day. We have refresh tokens set to 3,650 days, so users should never be signed out automatically. To test, I set ID and access tokens to have a 5 minute expiry. I've tested restarting the app after the 5 minute expiry numerous times within a 5-60 minute window, and I've not been able to reproduce the issue a single time that way. I have, however, reproduced the issue 3 times by simply waiting overnight and trying again in the morning. I don't know if there is any meaning behind that or if I've just been "lucky" to trigger whatever the root cause is.

When the user remains signed in when launching the app, we can see all of the data as expected in AsyncStorage, e.g. with these keys:

@MemoryStorage:CognitoIdentityServiceProvider.5m6h72v17i1red9rigt4ll8mo4.215625b1-66fa-4bd7-9697-43f2def478cc.accessToken
@MemoryStorage:CognitoIdentityServiceProvider.5m6h72v17i1red9rigt4ll8mo4.215625b1-66fa-4bd7-9697-43f2def478cc.clockDrift
@MemoryStorage:CognitoIdentityServiceProvider.5m6h72v17i1red9rigt4ll8mo4.215625b1-66fa-4bd7-9697-43f2def478cc.idToken
@MemoryStorage:CognitoIdentityServiceProvider.5m6h72v17i1red9rigt4ll8mo4.215625b1-66fa-4bd7-9697-43f2def478cc.refreshToken
@MemoryStorage:CognitoIdentityServiceProvider.5m6h72v17i1red9rigt4ll8mo4.215625b1-66fa-4bd7-9697-43f2def478cc.signInDetails
@MemoryStorage:CognitoIdentityServiceProvider.5m6h72v17i1red9rigt4ll8mo4.LastAuthUser

When the bug here occurs, we only see one Congito entry in AsyncStorage:

@MemoryStorage:com.amplify.Cognito.us-west-2:98089982-36b6-4356-b5b8-bd43222e5334.identityId

Of course, since there are no tokens, the user is signed out.

We call getCurrentUser() at app launch, which is throwing the UserUnAuthenticatedException since the tokens aren't found in AsyncStorage.

I've turned on debug logging with ConsoleLogger.LOG_LEVEL = 'DEBUG';, but I don't see any logs from Amplify prior to seeing the data in AsyncStorage.

There are no network errors or issues affecting these tests.

Is this a bug in Amplify? Is it possibly a bug in @react-native-async-storage/async-storage?

Expected behavior

Tokens should remain in AsyncStorage, and users should remain signed in.

Reproduction steps

It's difficult to reproduce unfortunately. I've reproduced it 3 times out of about 100+ tests, most often after a 12+ hour delay in launching the app.

Code Snippet

// Put your code below this line.

Log output

``` // Put your logs below this line ```

aws-exports.js

No response

Manual configuration

No response

Additional configuration

No response

Mobile Device

No response

Mobile Operating System

No response

Mobile Browser

No response

Mobile Browser Version

No response

Additional information and screenshots

No response

HuiSF commented 1 month ago

Hi @brianlenz thanks for opening this issue. Could you help clarify:

I have, however, reproduced the issue 3 times by simply waiting overnight and trying again in the morning. I don't know if there is any meaning behind that or if I've just been "lucky" to trigger whatever the root cause is.

When you were able to reproduce issue, did your app invoke only the getCurrentUser() API?

Does your app automatically invoke any other Amplify APIs on resume to the foreground? From you description, when the AS contained only @MemoryStorage:com.amplify.Cognito.us-west-2:98089982-36b6-4356-b5b8-bd43222e5334.identityId, a "signing out" operation must've been performed somewhere else.

In addition, do you happen to have logs of the outgoing network requests around when the error occurred?

Note that as part of this upgrade, we also upgraded @react-native-async-storage/async-storage from 1.23.1 to the latest 2.0.0.

When you were able to reproduce the error, did you sign in the end user with version 1.23.1 and reproduced with 2.0.0? Or the sign in also performed after upgrading to 2.0.0?

brianlenz commented 1 month ago

@HuiSF thanks so much for your attention and help šŸ™

When you were able to reproduce issue, did your app invoke only the getCurrentUser() API?

The log of the AsyncStorage happened before invoking getCurrentUser(). One of the steps the app does at startup is check internal Redux state (for signed in user) against the Amplify state via getCurrentUser() to ensure they are consistent. In this case, the user's data was all in Redux state (persisted to AsyncStorage), but the Amplify token state didn't appear there.

Does your app automatically invoke any other Amplify APIs on resume to the foreground? From you description, when the AS contained only @MemoryStorage:com.amplify.Cognito.us-west-2:98089982-36b6-4356-b5b8-bd43222e5334.identityId, a "signing out" operation must've been performed somewhere else.

No, we do not. I audited the full startup process, and this component is essentially the first component rendered after we call Amplify.configure(). I agree that it looks like a sign out must have occurred, but there is no way I'm aware of that it could have been triggered by our code šŸ¤”

In addition, do you happen to have logs of the outgoing network requests around when the error occurred?

Unfortunately, I do not. Do you have any recommendations on the best way to capture these in RN?

When you were able to reproduce the error, did you sign in the end user with version 1.23.1 and reproduced with 2.0.0? Or the sign in also performed after upgrading to 2.0.0?

These users were signed in with 2.0.0, so 1.23.1 isn't involved at all in the persistence.

If the issue is with @react-native-async-storage/async-storage, I might expect other storage to be wiped out, but our Redux state seems to have no issues. It's being persisted and read consistently.

I'm going to keep adding more logs and trying to reproduce and debug to provide more insight šŸ¤ž

HuiSF commented 1 month ago

Thanks for the details @brianlenz

If the issue is with @react-native-async-storage/async-storage, I might expect other storage to be wiped out, but our Redux state seems to have no issues. It's being persisted and read consistently.

It sounds like you are rehydrate Redux state by reading data from AS, what AS APIs are you using for reading/updating the AS with your Redux store?

brianlenz commented 1 month ago

We are using react-redux@npm:8.1.3 and redux-persist@npm:6.0.0 (via PersistGate). It only serializes/deserializes the key persist:root from AS. We don't directly call any AS APIs, nor have the versions of react-redux and redux-persist that we are using changed recently (so it feels unlikely that those would affect the issue we're seeing here)...

brianlenz commented 1 month ago

@HuiSF adding some additional context I forgot to mention originally. One other change that we made in this release is we switched from using Apollo Client (w/ aws-appsync-auth-link) to the built-in Amplify client via generateClient(). When signed in, we are always using generateClient({ authMode: 'userPool' }), and for guests, we use generateClient({ authMode: 'iam' }). The only thing I can think of where generateClient() could play a role is if somehow generateClient() has a bug internally (similar to the race condition in #13830) that could prevent a user's access token from being refreshed. If an error occurred somehow in that process, perhaps it could result in the user being signed out in Amplify? The thing is, if that happened during user activity, I'd think we'd see users reporting issues of being signed out while using the app, but we've now had dozens of users (seems to be Android only, still) who report that "each day" they have to sign back in (most of our users use the app once per day, most often in the morning). We have no reports of the issue while the app is being actively used šŸ¤” It seems to be an issue that occurs only after the access token expires (and we have ID/access token expiry set to 60 minutes on production).

We're still continuing to investigate, but thought I'd post this in case it triggers any ideas or thoughts on your end šŸ™

HuiSF commented 1 month ago

Thanks for the additional information @brianlenz I was able to reproduce this unexpected token clearing. Apologies for my misunderstanding, the getCurrentUser() does trigger a service call when the accessToken is expired. I will investigate further and isolate the root cause.

brianlenz commented 1 month ago

Great, thanks, @HuiSF! Was it happening on Android for you, too? We've not had a single complaint from iOS users.

Note that in the most recent case where I reproduced the issue, the data was cleared in AS before getCurrentUser() was called šŸ¤” That's one thing that definitely confused me (as I logged the contents of AsyncStorage first, which showed the tokens were cleared, and then called getCurrentUser()).

Please do let me know if you identify reproduction steps, a root cause, or any potential workarounds (even if a patch or package downgrade). At this point, it's a particular pain point for our users, so I'm hoping to address is sooner than later.

Thanks again!

brianlenz commented 1 month ago

@HuiSF one more potentially relevant detail. We have a background handler for processing push notifications. This could explain why we're noticing a difference between Android and iOS, as they handle push notification processing differently. I noticed today in Android that one test emulator received a couple of "Unauthorized" errors when processing a push notification message (which was attempting to load data via GraphQL). What's interesting is that in spite of those errors, the user did remain signed in.

cwomack commented 1 month ago

@brianlenz, appreciate the follow up and additional context. We're looking into a fix and will follow up with any updates/progress as soon as we can!

HuiSF commented 3 weeks ago

Hi @brianlenz we release v6.6.7 with should fix this unexpected token clearing behavior. Details please see the linked PR description. Please upgrade and test again, thanks!

brianlenz commented 3 weeks ago

Thank you so much, @HuiSF! We will work on getting that tested and deployed to see if it fixes the issue.

Did you have any good ideas of how to reliably reproduce the issue in Android so that we could confirm the issue exists on the current version and that it's fixed in 6.6.7?

HuiSF commented 3 weeks ago

Hi @brianlenz I was able to reproduce with the following conditions:

  1. The persisted access token is expired
  2. The device has spotty network connection

The expected behaviors is:

  1. If the refresh token request failed due to network error, it should retry
  2. If all thr requests failed due to network error, it should not clear the tokens stored in the AsyncStorage
brianlenz commented 1 week ago

@HuiSF thanks so much for the fix here! šŸ™ I can confirm that the issue has been resolved, so I'm going to close this šŸš€

cwomack commented 1 week ago

Thanks for the confirmation and creation of this issue, @brianlenz.