bug: access token expiration mismatch

ikupenov commented 1 year ago

Describe the bug

The getAccessToken() function sometimes returns access tokens that have already expired. I found two issues with that function:

Getting an access token by using fetchTokenByRefreshToken() seems to sometimes return access tokens that are already expired. E.g. the exp token inside the JWT is less than the current time. I'm not sure how that happens but the issue is most probably on the server side.
The second issue is using an arbitrary value for the expiresAt property inside the accessTokenMap. The way it's done right now means that there'll always be a mismatch between the real JWT expiry and the one present in the accessTokenMap. The expiresAt property should either be returned from the server, by using the same expiry that's used in the JWT, or by decoding the JWT on the client side and calculating the expiresAt value by using the JWT's exp value.

These two issues combined make it impossible to refresh the access token properly. The accessTokenMap.expireAt value is being used to determine whether the access token has expired and whether we need to refresh it using the refresh token. And in some cases that value is valid, meaning that the access token hasn't expired, but the JWT's exp value says otherwise and is being refused by the server, leading to 401.

This is a critical issue and needs to be resolved ASAP IMO. We're currently migrating from a different provider and this is a huge blocker for us and I suppose everyone else.

Expected behavior

accessTokenMap.expireAt should always be derived from the JWT's exp value.

How to reproduce?

Reduce the access token lifetime so you can force access token refresh frequently.

Context

[ ] Logto Cloud
[X ] Self-hosted, Logto version = 1.5, @logto/browser = 2.1.0
- [ X] Container (Docker image)
- [ ] Raw Node.js

Screenshots

CleanShot 2023-07-04 at 23 11 08@2x

ikupenov commented 1 year ago

I think I've also been hitting this issue when using the admin dashboard panel. Sometimes I'm getting 401s and have to wipe the app cache manually so I can forcefully logout and login again.

gao-sun commented 1 year ago

thanks for reporting, we're taking a look

charIeszhao commented 1 year ago

After internal discussion, we have come up some conclusions but there are still something we need to confirm with you.

For question 1: Getting an access token by using fetchTokenByRefreshToken() will unlikely end up with an expired token. Please confirm you did not meant getAccessToken() method, and also your system time is synced with the epoch time.

For question 2: This is indeed an potential issue in our SDK code. Previously after successfully fetched access token, we cached it in local storage, and calculated the expiredAt timestamp based on the time we acquired the token + token lifetime. However, this is not accurate as the token has already been issued some time earlier and the real expireAt should be and be consistent with exp property decoded from the JWT. Therefore in extreme cases (in a very short period of time window), when the token has expired, the cached timestamp might still be considered valid, which might further cause the 401 issue. In order to solve this, we did some research and found the RFC recommendation when handling cached ID token expiration time.

The iat Claim can be used to reject tokens that were issued too far away from the current time, limiting the amount of time that nonces need to be stored to prevent attacks. The acceptable range is Client specific.

We think this is something we can learn from when dealing with our cached access tokens, by specifying a shifted time window backwards. Or in other words, the expireAt should be Now + expiresIn - shiftedTime, and we decided to set the shiftedTime to 60s. This guarantees the calculated expireAt will always be smaller than the real exp timestamp, and the SDK will automatically issue a new one before it actually expires. Thus, no more 401 errors caused by this.

Let us know if you have any more inputs or any thoughts. Thanks again for the bug reporting.

ikupenov commented 1 year ago

Thank you for taking a look at this.

I meant fetchTokenByRefreshToken() although I'm not sure how to reproduce this. I ran into this accidentally as I was getting a lot of 401s on my app. After debugging, I noticed there was ~1hour difference between the exp claim and the expireAt property. My time is synced with the epoch time.
I think this partially solves the issue but I still don't see how this is the most optimal solution. In the same specification you sent it's mentioned that the client must verify that the current time must be before the time represented by the exp claim:
1. The current time MUST be before the time represented by the exp Claim.

Regarding the point you made about:

The iat Claim can be used to reject tokens that were issued too far away from the current time, limiting the amount of time that nonces need to be stored to prevent attacks. The acceptable range is Client specific.

I don't think the expiresAt value should be calculated by using Now + expiresIn - shiftedTime but rather issueAt (iat claim) + expiresIn (or any other client-specific value that's less than expiresAt) - shiftedTime (maybe not needed). Correct me if I don't understand this right.

To summarize, I don't think it's a good idea to use Date.now() when you can use either the exp or iat claim to get a much more consistent result across the client and the server.

charIeszhao commented 1 year ago

OK, we'll keep an eye on the fetchTokenByRefreshToken() issue, let us know if you manage to reproduce it.
I understand it would be perfect to keep the client side expiresAt consistent with the token exp or iat + expiresIn (they are the same). However, the consumer of the token is the auth server, not the client. So it's really unnecessary for the client to decode the JWT and know the exact expiration time of it. The purpose of keeping the expiresAt on the client is only to know when to request a new access token, so being that milisecond-perfect doesn't help that much.

But you are right, the previously suggested approach of involving a fixed shiftedTime is also not good. And after a second thought, I think we could make it simpler by just marking down the timestamp (requestedAt) before requesting the new access token. So the expiresAt would be calculated by requestedAt + expiresIn, and the difference with exp would only be the actual request time, which is ideally 100-300ms at most. How about this approach?

ikupenov commented 1 year ago

While I agree that this is a better solution than the previous one, it's still far from the preferred one IMO. In the OpenID Connect specification, it's clearly stated what steps the client must take in order to verify the ID token. One of them is verifying the exp claim, along with many others. In order to do that the client MUST decode the JWT (ID token). After taking a closer look at your client-side SDK code I can see that the ID token is not verified at all. I am not a security expert and can't tell you why all of that needs to be done, but I can tell you that it's present in the specification. It's also implemented that way in the Auth0 (acquired by Okta, a billion-dollar company) client-side SDK. Please check their source code, and more precisely these parts:

https://github.com/auth0/auth0-spa-js/blob/master/src/jwt.ts#L63C14-L63C20 https://github.com/auth0/auth0-spa-js/blob/master/src/Auth0Client.ts#L1111

~So not only is the current code bug-prone, it probably poses a security vulnerability as well.~

charIeszhao commented 1 year ago

While I agree that this is a better solution than the previous one, it's still far from the preferred one IMO. In the OpenID Connect specification, it's clearly stated what steps the client must take in order to verify the ID token. One of them is verifying the exp claim, along with many others. In order to do that the client MUST decode the JWT (ID token). After taking a closer look at your client-side SDK code I can see that the ID token is not verified at all. I am not a security expert and can't tell you why all of that needs to be done, but I can tell you that it's present in the specification. It's also implemented that way in the Auth0 (acquired by Okta, a billion-dollar company) client-side SDK. Please check their source code, and more precisely these parts:

https://github.com/auth0/auth0-spa-js/blob/master/src/jwt.ts#L63C14-L63C20 https://github.com/auth0/auth0-spa-js/blob/master/src/Auth0Client.ts#L1111

So not only is the current code bug-prone, it probably poses a security vulnerability as well.

Hey, I think you are dragging the topic to another one. ID token and access token are different, and thus the stragegies are different, too. And we DO verify the ID tokens, you can check the source code right there: https://github.com/logto-io/js/blob/master/packages/client/src/index.ts#L294

charIeszhao commented 1 year ago

The ID token works as a cache of user claims on the client side, requiring the client to verify its authenticity before storing it.

On the other hand, the access token is consumed by the auth server, making it the responsibility of the server to ensure its validity whenever a request occurs. The client simply needs to transmit the access token to the server, instead of verifying it on behalf.

ikupenov commented 1 year ago

There was miscommunication and misunderstanding on my part, apologies, and thank you for clarifying :). The suggested solution should work perfectly fine then.

charIeszhao commented 1 year ago

No problem, man. Will soon create a PR based on the updated approach.

Thanks again for the bug report, and we embrace all these kinds of discussions, too! Do not hesitate to let us know if you find any other security vulnerabilities.

logto-io / js