FusionAuth / fusionauth-issues

FusionAuth issue submission project
https://fusionauth.io
90 stars 12 forks source link

Exchange SSO refresh token for access token API call results in an HTTP 500 Error #2594

Closed tomzorz closed 5 months ago

tomzorz commented 6 months ago

Exchange SSO refresh token for access token API call results in an HTTP 500 Error

Description

I'm trying to call the "exchange refresh token for access token" API, and I'm getting an HTTP 500 error. This API has worked in the past. I know if I provide it a random string for refresh token, it will complain it wasn't a proper token - so the code does run partially. But otherwise I get no information back besides '{"fieldErrors":{},"generalErrors":[{"code":"[Exception]","message":"FusionAuth encountered an unexpected error. Please review the troubleshooting guide found in the documentation for assistance and the available support channels."}]}'. I don't even see the error in the logs of our instance.

Affects versions

1.46.0, 1.48.3.

Steps to reproduce

Original steps to reproduce

Invoke the API with a valid refresh token.

Expected behavior

We get an access token.

Screenshots

Nothing useful.

Platform

Community guidelines

Release notes

Improved validation and error messaging when calling the /oauth2/token endpoint with an incorrect refresh token.

robotdan commented 6 months ago

Can you please provide recreate steps? Be as specific as possible, if you can recreate with a curl command or something like that, this would be helpful.

tomzorz commented 6 months ago

Sorry @robotdan I totally missed your original response here. Here's the curl command:

curl -v -XPOST -H 'User-Agent: python-requests/2.28.1' -H 'Accept: */*' -H 'Connection: keep-alive' -H 'Content-Type: application/x-www-form-urlencoded' -d 'refresh_token=REDACTED&client_id=REDACTED&client_secret=REDACTED&grant_type=refresh_token' 'https://REDACTED/oauth2/token' --output -

I am happy to provide you with the secret details somewhere privately.

Response from cURL {"fieldErrors":{},"generalErrors":[{"code":"[Exception]","message":"FusionAuth encountered an unexpected error. Please review the troubleshooting guide found in the documentation for assistance and the available support channels."}]}(

tomzorz commented 6 months ago

I realize this is not much more info than I provided, but I also don't want to leak our secret keys or my refresh key. As I mentioned, if I put in a garbage string for refresh key, it will say The refresh_token is invalid, so the logic clearly grabs the token, just fails somewhere afterwards.

mooreds commented 5 months ago

@tomzorz Hmmm. sorry to hear this.

I have tried to replicate this on the sandbox.fusionauth.io environment, which is running 1.48.3, and was unable to. I did the following:

Additional questions that will help us troubleshoot:

Again, sorry this is an issue. We'll need some help from you to troubleshoot this further.

tomzorz commented 5 months ago

does this happen repeatedly or intermittently

Every single time with every request.

does making a second request resolve the error

No.

does this occur only in your prod environment or locally

We're using a deployment managed by FA, I guess it's a "prod" environment in that way.

does this occur on the latest version of FusionAuth (1.48.3)

Our deployed version is at 1.46.0, I haven't seen any related issues in the changelog since (although admittedly we should probably update)

how are you acquiring the refresh token (login api, oauth authorization code grant)

OAuth code grant.

are there any errors in the event log (System -> Event Log in the admin UI) or system log (System -> Logs) when this occurs

We see errors when other errors occur, there are no log events for this error though. I finally noticed this in the "raw" system logs (stacktrace shortened for brevity):

2024-01-18 06:29:40.743 AM ERROR io.fusionauth.app.primeframework.error.ExceptionExceptionHandler - An unhandled exception was thrown
java.lang.NullPointerException: Cannot invoke "java.util.UUID.equals(Object)" because "<parameter1>.refreshToken.applicationId" is null
    at io.fusionauth.api.service.oauth2.DefaultOAuthService.validateTokenEndpointRefreshTokenGrant(DefaultOAuthService.java:2167)
    at io.fusionauth.api.service.oauth2.DefaultOAuthService.validateTokenRequest(DefaultOAuthService.java:1252)
    at io.fusionauth.app.action.oauth2.TokenAction.post(TokenAction.java:125)
    at jdk.internal.reflect.GeneratedMethodAccessor91.invoke(Unknown Source)

are there any patterns (occurring at a certain time, for a certain user, for a certain application, when you make a large number of api calls)

It occurs every time, for every user as far as I can tell.

what are your refresh token settings (Tenants -> your tenant -> JWT -> refresh token settings in the admin UI)

Fixed expiration policy, 2 weeks (20160 minutes), one-time use, revocation on action preventing login & password change.

can you replicate this issue on sandbox.fusionauth.io (that environment is public and reset nightly, so sharing details about that in a GH issue is okay)

I can't. I also created a PoC replica of the calls with your python library we're using for the sandbox env, couldn't replicate it there either.

I made sure to match the sandbox environment as close as I could get it to ours. I am also using the exact same python calls.

mooreds commented 5 months ago

Hmmm.

I stood up a 1.46.0 local server and tried to replicate this. I was unable to do so.

I also created a PoC replica of the calls

Was this PoC using 1.46.0?

tomzorz commented 5 months ago

The PoC script used our logic to target our 1.46 env, and then I switched it out to the sandbox 1.48.3 one - essentially to test and make sure the issue isn't somewhere in the python library or something. But yeah, couldn't replicate in the sandbox, only in ours.

I guess I'll try updating to 1.48.3 and see if that helps. As far as I know, we didn't make any configuration or code changes... but also we haven't been reliably using the exchange refresh token feature in recent time so I am not sure when it might've broke. I remember initially when we implemented it, it worked. And then at some point in the past ~8 months it broke.

mooreds commented 5 months ago

Okay. Please let us know what happens with the 1.48.3 update and we can discuss next steps.

tomzorz commented 5 months ago

@mooreds I have upgraded to 1.48.3 and the issue still persists the same way. Stacktrace looks similar:

2024-01-23 05:50:12.852 AM ERROR io.fusionauth.app.primeframework.error.ExceptionExceptionHandler - An unhandled exception was thrown
java.lang.NullPointerException: Cannot invoke "java.util.UUID.equals(Object)" because "<parameter1>.refreshToken.applicationId" is null
    at io.fusionauth.api.service.oauth2.DefaultOAuthService.validateTokenEndpointRefreshTokenGrant(DefaultOAuthService.java:2185)
    at io.fusionauth.api.service.oauth2.DefaultOAuthService.validateTokenRequest(DefaultOAuthService.java:1270)
    at io.fusionauth.app.action.oauth2.TokenAction.post(TokenAction.java:125)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)

I have also expanded my test script a little to verify that we do properly save and restore the refresh token from the client - the retrieve refresh tokens endpoint gives me back the same one from the session:

image

I also tried out the other "exchange refresh token for jwt" call, that just returns a completely empty 404 and there isn't even anything in the regular log or the event log.

mooreds commented 5 months ago

Okay, I have replicated the issue.

We need to fix this, but for now I'd recommend filtering any refresh tokens you get and removing any that don't have an application Id. Refresh tokens that don't have an application Id are for internal use only (they represent our SSO session) and shouldn't be presented to the /oauth2/token endpoint.

I updated the documentation ( https://github.com/FusionAuth/fusionauth-site/pull/2836 ).

Please try that and let me know if you still see the issue.

tomzorz commented 5 months ago

Hmm, okay I'll try and see what tokens do we have / get.

robotdan commented 5 months ago

The root issue appears to be that you are using an SSO token - which is a special kind of refresh token. The correct behavior of the API should be to fail with a validation error indicate the token was not found.

robotdan commented 5 months ago

Internal:

tomzorz commented 5 months ago

Okay so looks like our logic right now takes first token from the "get refresh tokens" API and uses that after we succeed with the "exchange oauth code for access token with pkce" call. I imagine this is where we mess up currently.

Is there a reason btw why these "internal" refresh tokens are included in the "get refresh tokens" call? If I/we can't use these for anything?

mooreds commented 5 months ago

Is there a reason btw why these "internal" refresh tokens are included in the "get refresh tokens" call? If I/we can't use these for anything?

You might want to revoke them or otherwise examine them. See https://github.com/FusionAuth/fusionauth-site/pull/2836#discussion_r1463749496 from @robotdan .

tomzorz commented 5 months ago

Ok so one more question, is there some setting we might not have turned on to get these "proper" refresh tokens? I just did a query in our logs and in the last 90 days we barely have any results where we do actually have one:

image

I just did multiple logins on my own account, always had "keep me logged in" turned on - and yet no applicable refresh tokens in our logs today. (To note, this API is called every time when someone logs in.)

mooreds commented 5 months ago

"keep me logged in" is for SSO refresh tokens (the ones you should avoid)

the other one requires you request the offline_access scope. More here: https://fusionauth.io/docs/lifecycle/authenticate-users/oauth/endpoints#authorization-code-grant-request

tomzorz commented 5 months ago

Ohhhh, okay. TIL :) I think I just made the connection in my head, what the difference is exactly and why it is the way it is. Although I will say, as an end-user I would imagine that checkbox do what I thought we were doing :)

mooreds commented 5 months ago

I hear you! We need more 201 level content. Sorry for the confusion!

tomzorz commented 5 months ago

All good, I have def needed to learn more too. Also side-question: is there a way then for us to add a "keep me logged in" option that does what we thought the current one does? Or do we have to set the scopes early, and even before redirecting to the flow have the option set for the user whether they want it or not?

robotdan commented 5 months ago

Will be fixed in 1.49.0.

robotdan commented 5 months ago

Also side-question: is there a way then for us to add a "keep me logged in" option that does what we thought the current one does? Or do we have to set the scopes early, and even before redirecting to the flow have the option set for the user whether they want it or not?

If I understand the question... you are asking how to maintain state in your app - outside of the FusionAuth SSO?

Checking the box "keep me logged in" will just cause us to create an SSO session. This has a configured TTL.

If you want to use tokens in your own app to maintain user state, and you want to use a refresh token for that purpose - then you'd ask for the offline_access scope when you build the request to FusionAuth. Then you could store that refresh token on your end and refresh it until it expires to maintain the user state.

mooreds commented 5 months ago

@tomzorz and to add what @robotdan said, this seems like it is veering away from "reporting a bug" toward "asking for support" :) .

The best way for you to get support is to either:

More on technical support here: https://fusionauth.io/docs/operate/troubleshooting/technical-support