AzureAD / microsoft-authentication-library-for-java

Microsoft Authentication Library (MSAL) for Java http://aka.ms/aadv2
MIT License

demo_servlet.AuthHelper.processAADCallback Unable to exchange auth code for token: ValidateState() indicates states do not match. #711

Closed kilokili777 closed 9 months ago

kilokili777 commented 11 months ago

We deployed MSALJ 1.13.10 to perform authentication for a tomcat based web servlet application using Azure AD. MSALJ SSO works correctly with all Chrome browsers and "most" Edge browsers, but we see a certain percentage of Edge browsers that fail MSALJ authentication (even though Azure AD reports good authentication).

HOWEVER, if we instruct the user to "sign out" of the non-working SSO Edge account in the browser, --then MSALJ SSO works in that same Edge browser (with no Edge browser user account logged in)! Keep in mind most of the Edge browsers work even with a user logged into the Edge browser account, it is just a small-ish percentage that fail. ALSO, if we use the IE mode in Edge, the SSO will work.

We see this from the MSALJ log when an Edge browser fails SSO. What does it mean?: 14-Sep-2023 18:59:51.536 WARNING [https-jsse-nio-8443-exec-3] demo_servlet.AuthHelper.processAADCallback Unable to exchange auth code for token: ValidateState() indicates states do not match.

kilokili777 commented 11 months ago

NOTE: If we change the Edge profile to a generic one (from the default logged in "work" profile), then SSO will work. But this is not a solution. Why will some default "work" profiles allow MSALJ SSO, but some do not? Ideas?

Avery-Dunn commented 11 months ago

Hello @kilokili777 : When you say you 'deployed MSALJ 1.13.10', does that mean you were using another version of MSAL Java before and started getting this issue after updating? Or is this the first time you've used this package?

Also, I noticed the error message mentions the class 'AuthHelper' and method 'processAADCallback'. Is your app based on one of our servlet samples that has that class/method? If so this might be an issue with how the sample handles sessions rather than any issue in the actual msal4j package.

kilokili777 commented 11 months ago

Thank you. We have always used MSALJ 1.13.10. Yes, we modified one of the MSAL servlet samples for authentication into a Tomcat valve to handle SSO at the container level. I can upload our source code, or where do you recommend we seek help? I agree it appears to be some sort of session/state handling issue affecting select Edge browsers, but it does resolve if the Edge account is logged out of.

kilokili777 commented 11 months ago

Given that logging out of the edge browser account allows the MSAL SSO to complete, what things should we look at? We tried clearing the browser cache. Why would a user logged into the edge browser account sometimes cause an MSAL state mismatch?

kilokili777 commented 11 months ago

AuthHelper.java.txt

kilokili777 commented 11 months ago

I think I have found a clue. Our enterprise rolled out Edge browser "account sync" across devices but then stopped due to license issues. So now, I have a user who has 4 computers, each with an Edge browser, and 2 of the computers/devices show up in his Edge Account -> settings -> Manage Account -> Device list. Those are the Edge clients that DO NOT work with SSO (we get session/state mismatch errors from MSAL). The other Edge browser clients are not in the device list, and DO work with MSAL SSO.

So I guess my question becomes how does MSALJ play with MS Edge browser account syncing with multiple devices?

Avery-Dunn commented 11 months ago

Thanks for all the investigation and info. This definitely doesn't seem like a bug in MSAL itself, rather it's an issue with how our samples use Java to manage browser sessions.

The "ValidateState() indicates states do not match" error message is coming directly from the sample code: that sample shows how MSAL Java can be integrated into a servlet app that maintains browser sessions, but isn't going to be able to cover all scenarios.
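To illustrate the kind of check that produces this message, here is a minimal sketch, assuming the app stores a random `state` value in the session before redirecting to Azure AD and compares it with the `state` returned on the callback. The class and method names are hypothetical and are not copied from the sample, and a plain `Map` stands in for the servlet session so the sketch runs on its own.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class StateCheckSketch {
    // Hypothetical attribute name; the real sample may store state differently.
    static final String STATE_KEY = "state";

    // Before redirecting to the authorize endpoint: remember a random state.
    static String beginLogin(Map<String, String> session) {
        String state = UUID.randomUUID().toString();
        session.put(STATE_KEY, state);
        return state;
    }

    // On the callback: the returned state must equal the one stored in the
    // same session. The stored value is removed so it can be used only once.
    static boolean statesMatch(Map<String, String> session, String returnedState) {
        String expected = session.remove(STATE_KEY);
        return expected != null && expected.equals(returnedState);
    }

    public static void main(String[] args) {
        Map<String, String> session = new HashMap<>();
        String state = beginLogin(session);
        System.out.println(statesMatch(session, state));         // same session
        System.out.println(statesMatch(new HashMap<>(), state)); // fresh session
    }
}
```

In this sketch the check fails whenever the callback arrives in a different session from the one that started the login, even if Azure AD authenticated the user successfully.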

We can look into possible solutions for your scenario, but I don't have an ETA on when we'd be able to do that. And if it's just a case of Edge's SSO/account sync stuff not working well with Java's servlet stuff, then it might be out of scope for what our samples are supposed to cover.

kilokili777 commented 11 months ago

After reviewing the network traffic, it seems that when a device is registered and signed into the Edge browser, it tries to force a "Primary Refresh Token" (PRT) and present it to the MSAL servlet web app. But the MSAL web app is using the Authorization Code Flow, which doesn't support PRT, and thus throws a 401 Unauthorized. QUESTION: does MSAL have any settings that can handle the MS Edge PRT?

bgavrilMS commented 11 months ago

No, MSAL doesn't use PRTs explicitly. Only brokers (WAM, Authenticator) do that. The brokers operate at lower level in the OS stack and they actually intercept traffic to AAD and inject the device ID.

Are you getting a 401 from the resource or from AAD? Does the state error occur when you use Chrome?

kilokili777 commented 11 months ago

We are getting 401 from the resource (Java servlet/valve). AAD is authenticating successfully. We never get the error from Chrome, we never get the error from Edge IF the Edge browser account is not logged in, and we never get the error from Edge IF the user IS logged in but the device list for the Edge account doesn't contain the computer/device of the Edge browser.

We are starting to think this has something to do with MSAL device code flow handling: https://github.com/AzureAD/microsoft-authentication-library-for-java/blob/dev/msal4j-sdk/src/samples/public-client/DeviceCodeFlow.java

bgavrilMS commented 11 months ago

I am not understanding your application architecture. Device Code Flow is a public client flow (desktop apps, command line apps), not a web site specific flow. Device Code Flow does NOT satisfy the device compliance.

If you have a web site, we have 2 samples that can help - Java Spring and Java Servlets. They both use auth_code flow.

If you have a public client application, such as a desktop app or a CLI app, we recommend that you use this: https://learn.microsoft.com/en-us/entra/msal/java/advanced/using-wam-and-the-msal4jbrokers-package

Just for information, here is why device code flow doesn't work with device compliance. Imagine you start to log in from your TV, and it prints "go to aka.ms/devicelogin on your PC and enter code ABCDE". You then log in from your PC and your TV gets an access token!

Your PC may be a managed device, but your TV, who gets the token, isn't. If someone steals your TV, you can't go to your tenant admin to say "hey, this TV got stolen, wipe it remotely".

Let's take a step back. https://learn.microsoft.com/en-us/azure/active-directory/develop/sample-v2-code?tabs=apptype#web-

bgavrilMS commented 11 months ago

Hi @kilokili777 - so I tried the web site Servlet sample and I ran into the same issue but it turns out it's a deployment issue. The algorithm is:

In my case I had configured the redirect URI incorrectly which seems to have wiped the session.state param.
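One way a wrong redirect URI could lose the stored state is if the callback lands on a different host than the page that started the login, so the browser does not send the original session cookie. That mechanism is an assumption and has not been confirmed as the cause here. A minimal sketch of a sanity check for it, with hypothetical class and method names and example URLs:

```java
import java.net.URI;

public class RedirectHostCheck {
    // Cookies are scoped by host, so "localhost" and "127.0.0.1" count as
    // different sites even when they reach the same server.
    static boolean sameCookieHost(String appUrl, String redirectUri) {
        String appHost = URI.create(appUrl).getHost();
        String redirectHost = URI.create(redirectUri).getHost();
        return appHost != null && appHost.equalsIgnoreCase(redirectHost);
    }

    public static void main(String[] args) {
        // Hypothetical URLs for illustration only.
        System.out.println(sameCookieHost(
                "https://localhost:8443/app/",
                "https://localhost:8443/app/secure/aad"));
        System.out.println(sameCookieHost(
                "https://localhost:8443/app/",
                "https://127.0.0.1:8443/app/secure/aad"));
    }
}
```

If the two hosts differ, the callback request would start a new session with no stored state, and a state comparison would fail.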

kilokili777 commented 11 months ago

Unlike other browsers, Microsoft Edge adds the fields X-Ms-Refreshtokencredential and X-Ms-Devicecredential to the HTTP request header when accessing "login.microsoftonline.com/authorize". This causes the authentication to fail (if an Edge browser account is logged in and the device appears in that Edge account profile's device list).

Chrome browsers always work. However, this behavior can be reproduced in Chrome by installing the "Windows Authentication" Extension: https://chrome.google.com/webstore/detail/windows-10-accounts/ppnbnpeolgkicgegkbkbjmhlideopiji?hl=en

Is there anything that can be done within the sample (e.g. AuthHelper.java or parameter set in ConfidentialClientApplication) to either suppress the sending of these two headers, or handle the flow?

bgavrilMS commented 9 months ago

No, these headers are injected by a Windows component (we have similar functionality on Mac and Linux, via Company Portal).

Avery-Dunn commented 9 months ago

Closing due to inactivity. If you're still having issues or have any related questions, feel free to reopen or start a new thread.