okta / okta-react-native

OIDC enablement for React Native applications
https://github.com/okta/okta-react-native

[iOS] - User not authenticated while tokens are not supposed to be expired #403

Closed Aeners closed 7 months ago

Aeners commented 10 months ago

Describe the bug?

Hello, my issue occurs only on iOS and on a real device (everything works well on the simulator).

Context: We have a GraphQL implementation working together with the okta/react-native library. Our app uses a bearer token authentication strategy alongside a refresh token strategy. Before each GraphQL request, the app retrieves the user's access_token by calling the getAccessToken method. Our token lifetime is set to 1 hour, so if getAccessToken doesn't return the token (or throws an error), the app calls the refreshToken method to get a new access_token. This token is then passed in the request headers. If the refreshToken method fails, we kick the user out of the authenticated part of the app.
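
A minimal sketch of the flow described above, for illustration only. The names `TokenSdk`, `getBearerToken` and `onSessionLost` are hypothetical, and the `{ access_token }` result shape is an assumption about what the SDK methods resolve to. The SDK functions are injected so the sketch stays self-contained.

```typescript
// Hypothetical shape of the SDK calls used in the flow described above.
// The `{ access_token }` result shape is an assumption for illustration.
interface TokenSdk {
  getAccessToken(): Promise<{ access_token?: string }>;
  refreshTokens(): Promise<{ access_token?: string }>;
}

// Returns a bearer token for the next request, or null when the session
// has to be terminated (both the lookup and the refresh failed).
async function getBearerToken(
  sdk: TokenSdk,
  onSessionLost: (reason: unknown) => void,
): Promise<string | null> {
  try {
    const current = await sdk.getAccessToken();
    if (current.access_token) {
      return current.access_token;
    }
  } catch {
    // Fall through to the refresh attempt below.
  }

  try {
    const refreshed = await sdk.refreshTokens();
    if (refreshed.access_token) {
      return refreshed.access_token;
    }
    onSessionLost(new Error("refreshTokens returned no access token"));
  } catch (error) {
    onSessionLost(error);
  }
  return null;
}
```

With this shape, a null result is the only path that ends the session, so any local error thrown by both calls terminates it even when the token itself is still valid on the server.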

Issue: The issue we are encountering occurs with the getAccessToken and refreshToken methods: after some time, they no longer seem able to retrieve data from the secure storage. This leads to sessions being terminated.

Hints?: We updated okta/react-native to 2.10.0 in November, and the issue was first reported around that time. However, we tried downgrading the package to 2.8.0 (our previous version) and the issue was still there.

What is expected to happen?

getAccessToken shouldn't fail to return the token (or throw an error) when called within the token's lifetime. The same goes for refreshToken.

What is the actual behavior?

After some time (quite random, but always less than one hour), both methods, getAccessToken and refreshToken, start to throw an error even though the token has not expired.

Reproduction Steps?

I tried to be precise in the bug description, here are some other technical details:

I don't have the time right now to create a reproduction repo, but if the issue is not clear enough I could try to do one

Additional Information?

The error thrown is: [Error: User is not authenticated, cannot perform the specific action]

SDK Version

2.10.0

Build Information

No response

mikenachbaur-okta commented 10 months ago

Hi @Aeners, thank you for the clear description. I understand how difficult it can be to create a reproduction sample, so if you don't mind helping me narrow the problem down, that would be great.

Aeners commented 10 months ago

Hi @mikenachbaur-okta, thanks for your answer. First, I should have added the configuration we use to create the client; here it is:

(image: screenshot of the client configuration)

Here are some answers for your questions:

Differing on device/server local times

About the local time of the devices: what do you mean by the local time on the physical device being correct? How can we check/debug that? Are we talking about the timezone bound to the Okta user versus the timezone in the device's settings? To add a bit more context, the disconnections were reproduced by one of our stakeholders, who actually uses the app in production.

More debugging

I have added a lot of debugging around our getAccessToken and refreshTokens usage, in a screen that polls a request every 5 minutes. I removed the part of the code that kicks the user out of the authenticated side of the app, so that the request keeps polling even though the user is supposed to be disconnected. Also, as suggested, if the getAccessToken and refreshTokens methods fail, the code calls getUser and introspectAccessToken. This resulted in:

  1. Log in at 3:15pm; everything is working here. Our access token lifetime is set to 1h, so after each hour getAccessToken failed as expected, and the refreshTokens method returned a new token to be used for the next hour.
  2. No issues until 7:32pm. Note that the token was refreshed at 7:17pm and was therefore supposed to expire at 8:17pm.
  3. The requests at 7:32pm, 7:37pm, 7:42pm, 7:47pm and 7:53pm failed one after the other.
     a. For each of these, getAccessToken, refreshTokens and introspectAccessToken threw the error [Error: User is not authenticated, cannot perform the specific action]
     b. getUser didn't throw any error but returned a "dumb" response:
        {
          "_h": 0,
          "_i": 3,
          "_j": {
            "_h": 0,
            "_i": 0,
            "_j": null,
            "_k": null
          },
          "_k": null
        }
  4. And surprise, at 7:58pm it started working again 🫢, with getAccessToken retrieving the token that was "refreshed" at 7:17pm. I guess this confirms that the token is still "valid".
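
The diagnostic pass described above can be sketched as below. This is a hedged illustration: `probeAuthState` and `ProbeResult` are hypothetical names, and the SDK calls are passed in as plain functions. One aside: the `_h`/`_i`/`_j`/`_k` object in step 3b resembles the internal fields of a Promise object as React Native prints it, which would suggest that the getUser result was logged before being awaited. That is an assumption, not something confirmed in this thread. Awaiting every call before recording it avoids that ambiguity.

```typescript
// Outcome of one awaited SDK call: either the resolved value or the error text.
type ProbeResult =
  | { name: string; ok: true; value: unknown }
  | { name: string; ok: false; error: string };

// Awaits each call in turn and never throws, so a failing call does not
// hide the results of the calls that come after it.
async function probeAuthState(
  calls: Record<string, () => Promise<unknown>>,
): Promise<ProbeResult[]> {
  const results: ProbeResult[] = [];
  for (const name of Object.keys(calls)) {
    try {
      // Awaiting here is what distinguishes a resolved value from a
      // pending Promise object in the logs.
      const value = await calls[name]();
      results.push({ name, ok: true, value });
    } catch (error) {
      const message = error instanceof Error ? error.message : String(error);
      results.push({ name, ok: false, error: message });
    }
  }
  return results;
}
```

A polling screen could call `probeAuthState({ getAccessToken, refreshTokens, introspectAccessToken, getUser })` on each tick and log the returned array with a timestamp.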

What is behind this error, [Error: User is not authenticated, cannot perform the specific action]? When it is raised, has the library already tried to retrieve the local tokens? I will try to reproduce it again to check whether I can sniff a request to Okta that the library would have sent.

If you need anything else, feel free to ask. Thanks.

mikenachbaur-okta commented 10 months ago

@Aeners Thank you for your detailed response.

Regarding device/server times, what I mean is the system time of the device being tested. Many people disable NTP / time synchronization on their devices, and often adjust the time by large amounts. OIDC token validation uses the device's time to determine whether or not a token has expired, so if that offset is too great it can result in tokens being considered expired prematurely. The new Swift SDK includes support for automatic time synchronization to address this problem, but the legacy OIDC SDKs (and by extension the React Native SDK) do not.

In my mind this is the most likely cause of these errors, particularly since it's not reproducible on a simulator (which uses your macOS system time).
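
To illustrate the clock-offset effect described above, here is a small worked example. It is a sketch under stated assumptions, not the SDK's actual validation code: `looksExpired` is a hypothetical helper, and the timestamps are made up. It compares a token's `exp` claim against the device clock, once with an accurate clock and once with a clock running 50 minutes fast.

```typescript
// Decides whether a token looks expired from the client's point of view.
// `expSeconds` is the token's `exp` claim (seconds since the Unix epoch);
// `deviceNowMs` is the device clock in milliseconds.
function looksExpired(expSeconds: number, deviceNowMs: number): boolean {
  return deviceNowMs >= expSeconds * 1000;
}

// Hypothetical values: token issued at 19:17 UTC with a 1-hour lifetime,
// checked 15 minutes after it was issued.
const issuedAtMs = Date.UTC(2024, 0, 1, 19, 17, 0);
const expSeconds = issuedAtMs / 1000 + 3600;
const trueNowMs = issuedAtMs + 15 * 60 * 1000;

// Accurate device clock: 45 minutes of lifetime remain, so not expired.
const accurateClockResult = looksExpired(expSeconds, trueNowMs);

// Device clock 50 minutes fast: the same token already looks expired.
const fastClockResult = looksExpired(expSeconds, trueNowMs + 50 * 60 * 1000);
```

The server would still accept the token in the second case, since only the client-side comparison is affected by the device clock.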

There are multiple debugging steps that we can take from here to narrow it down. The legacy OIDC SDK has some peculiarities in how it retrieves and returns tokens, so it would be useful to figure out what's happening.

  1. When you get these errors, what result do you get from getAccessToken? Is it throwing an error, is it returning null, or is it returning a valid token string?
  2. If you are getting a token string, what happens if you try to use it directly? I'm trying to determine whether the token is actually valid and you're getting false negatives, or whether the server considers it invalid. Since the getUser and introspect* functions go through the SDK, if it's giving you errors, let's see whether making a network request using the token directly works. Can you use the accessToken string and formulate a request to one of Okta's APIs to see if it works? For example, you can either call the userInfo endpoint, or you can use the introspect API directly to get information about the access token.
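
A minimal sketch of the direct check suggested in step 2, bypassing the SDK. The helper names (`HttpGet`, `userInfoUrl`, `serverAcceptsToken`) are hypothetical, and the URL construction assumes a custom authorization server whose issuer looks like `https://{yourOktaDomain}/oauth2/{serverId}`; adjust the path if your issuer is different.

```typescript
// Minimal HTTP abstraction so the sketch does not depend on a specific runtime;
// in a React Native app this could be a thin wrapper around `fetch`.
type HttpGet = (
  url: string,
  headers: Record<string, string>,
) => Promise<{ status: number }>;

// Builds the userinfo URL from the issuer. Assumes a custom authorization
// server, whose issuer looks like https://{yourOktaDomain}/oauth2/{serverId}.
function userInfoUrl(issuer: string): string {
  return `${issuer.replace(/\/+$/, "")}/v1/userinfo`;
}

// Sends the saved access token straight to the server, bypassing the SDK.
// Returns true when the server accepts the token (HTTP 200).
async function serverAcceptsToken(
  httpGet: HttpGet,
  issuer: string,
  accessToken: string,
): Promise<boolean> {
  const response = await httpGet(userInfoUrl(issuer), {
    Authorization: `Bearer ${accessToken}`,
  });
  return response.status === 200;
}
```

If this returns true while the SDK reports the user as not authenticated, the token is valid on the server and the failure is local to the client.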

In summary, I see there as being three likely culprits:

  1. Time synchronization between the client and the server,
  2. Some problem within the Okta React Native SDK, or your application's lifecycle, that is causing the SDK to forget about its current tokens and think it's not signed in, or
  3. There's some other problem or configuration issue that's resulting in the token being invalid.

Thank you for your patience in trying to debug this issue.

Aeners commented 9 months ago

@mikenachbaur-okta Thanks for your response and for the explanation about device/server times.

I understand the idea of a possible drift between the device time and the server time. But isn't it strange, then, that a fresh token works correctly at first, then stops working for some time, and finally works again before expiring? If it helps you investigate this idea, I can check the settings of the device I used for debugging. (In the iOS Date & Time settings, I have the automatic setting toggled on, with the Paris timezone set.)

About the getAccessToken method (as I mentioned, refreshTokens and introspectAccessToken behave the same way):

In our case, when getAccessToken doesn't retrieve the token, it throws an error with this specific message: [Error: User is not authenticated, cannot perform the specific action]. However, I had saved the token, and when getAccessToken started failing I called the Okta API using both the userInfo endpoint and the introspect endpoint; both returned a successful response.

I tried to sniff the HTTP requests fired from my device while debugging, and it's probably worth noting that getAccessToken doesn't seem to trigger an HTTP request while the token is within its lifetime, even though we call it every 5 minutes. This is also true when the method throws the error mentioned above. So this error comes from logic that runs locally on the device.

Thanks again for your help.

Aeners commented 9 months ago

@mikenachbaur-okta, sorry to be pushy on this one, but our users are getting disconnected and we can't identify any other culprit. Do you have any time to look into it, and how could I assist you here? We are about to stop using the SDK in our app and implement our own auth service. This would be a high-effort implementation and we would prefer to avoid it.

As a quick fix, we now store the access token ourselves and let our back end test its validity. This prevents most of the disconnections we had, but after some time we still get some. My guess is that they are related to the "bad luck" of hitting this "SDK lock" at the moment the app needs to refresh the token: the app calls refreshTokens, receives the error mentioned before, and the authenticated session gets terminated.
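
The quick fix described above could look roughly like the sketch below. `TokenCache` and `tokenForRequest` are hypothetical names, the `{ access_token }` result shape is an assumption, and the in-memory storage is only for illustration (a real app would have to decide where to keep the token). The idea is to fall back to the last known token when the SDK call fails and let the back end reject it if it is no longer valid.

```typescript
// Hypothetical in-memory cache of the last access token handed out by the SDK.
class TokenCache {
  private lastToken: string | null = null;

  // Tries the SDK first; on failure falls back to the last known token and
  // leaves it to the back end to reject it (e.g. with a 401) if it is invalid.
  async tokenForRequest(
    getAccessToken: () => Promise<{ access_token?: string }>,
  ): Promise<string | null> {
    try {
      const result = await getAccessToken();
      if (result.access_token) {
        this.lastToken = result.access_token;
      }
    } catch {
      // Keep the previously cached token; the SDK error is treated as transient.
    }
    return this.lastToken;
  }
}
```

This does not cover the remaining case mentioned above, where the cached token has genuinely expired and the refresh call itself hits the local error.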

Aeners commented 7 months ago

After making some progress in debugging our issue, I created a new GitHub issue (#418) to remove the possible "noise" of this one. I'm therefore closing this one myself.