Authentication timeout in a flow that uses `max_age` parameter has a 'send another authorisation code' button that doesn't allow login

AzureAD / microsoft-authentication-library-for-python

Microsoft Authentication Library (MSAL) for Python makes it easy to authenticate to Microsoft Entra ID. General docs are available here https://learn.microsoft.com/entra/msal/python/ Stable APIs are documented here https://msal-python.readthedocs.io. Questions can be asked on www.stackoverflow.com with tag "msal" + "python".

https://stackoverflow.com/questions/tagged/azure-ad-msal+python

Authentication timeout in a flow that uses `max_age` parameter has a 'send another authorisation code' button that doesn't allow login #455

Open ineesalmeida opened 2 years ago

ineesalmeida commented 2 years ago

Describe the bug The optional max_age parameter in ConfidentialClientApplication.initiate_auth_code_flow is compared against the time the user entered their password rather than the time the full authentication, including MFA, completed. This leads to odd behaviours on authentication timeout. A user who takes too long to enter the MFA code sees 'this has timed-out, send another authorization code', but entering the MFA code again will not work because the time will be past the max_age, so an error is raised instead.

To Reproduce Steps to reproduce the behavior:

Initiate an auth code flow with the max_age parameter set to 60 seconds (for example)
I enter my password, but don't enter my MFA devices for a couple of minutes
I see that too much time has passed and a button saying 'send another authorisation code'
I press on that 'send another authorization code' button
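
The first step above can be sketched roughly as follows. This is a minimal sketch, not the reporter's actual code: the helper name `start_flow_with_max_age` is hypothetical, and `app` is assumed to be an `msal.ConfidentialClientApplication` (or any object exposing the same `initiate_auth_code_flow` method).

```python
def start_flow_with_max_age(app, scopes, redirect_uri, max_age=60):
    """Start an auth code flow that asks for a recent end-user authentication.

    `app` is assumed to be an msal.ConfidentialClientApplication; this helper
    itself is hypothetical and only forwards the arguments.
    """
    flow = app.initiate_auth_code_flow(
        scopes,
        redirect_uri=redirect_uri,
        max_age=max_age,  # seconds allowed since the last end-user authentication
    )
    # The web app would keep `flow` in the user's session and redirect the
    # browser to flow["auth_uri"]. The error in this report only surfaces
    # later, when the redirect response is passed to
    # app.acquire_token_by_auth_code_flow(flow, auth_response).
    return flow
```

The remaining steps happen in the browser on the Microsoft sign-in page, so they are not shown here.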

Expected behavior I would expect one of two behaviors:

I re-enter my MFA code and I can login
I am asked to re-enter my password and MFA and I can login

What you see instead I re-enter my MFA code and a python error is raised: RuntimeError( RuntimeError: 13. auth_time (1642085298) was requested, by using max_age (60) parameter, and now (1642085502) too much time has elasped since last end-user authentication.
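
To make the numbers in that message concrete, here is a small arithmetic sketch. It is an illustration of the comparison the message describes, not MSAL's actual implementation; the function name and the `skew` parameter are hypothetical.

```python
def seconds_over_limit(auth_time, max_age, now, skew=0):
    """Return how far `now` is past the allowed window (a value <= 0 means OK).

    `skew` is a hypothetical clock-skew allowance for illustration only;
    it is not a documented MSAL setting.
    """
    return now - (auth_time + max_age + skew)


# Values copied from the error message above.
auth_time, max_age, now = 1642085298, 60, 1642085502

elapsed = now - auth_time                                # 204 seconds
overshoot = seconds_over_limit(auth_time, max_age, now)  # 144 seconds
print(elapsed, overshoot)
```

So roughly three and a half minutes passed between the password entry recorded in auth_time and the token being validated, well beyond the 60 seconds requested.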

The MSAL Python version you are using 1.16.0

Additional context If a user takes too long to enter their password everything works fine, since the max_age is evaluated from the time the password was entered. This is only an issue if the user takes too long to enter their MFA code. This flow is also fine if, after I enter my password and take too long to enter my MFA code, I click on 'cancel' and then try again; it will then ask for my password + MFA again.

rayluo commented 2 years ago

Hi Inees, thanks for reporting this case, and thanks for your patience.

While the issue is observable in MSAL Python, the auth_time info is determined by the AAD service. So, MSAL Python has no choice but to validate the auth_time and error out accordingly. We will report this to our service team via our internal channel.

Meanwhile, do you have to use such a short max_age=60 setting? AFAIK, due to some other historical reasons and implementation details (such as time-skew allowance), max_age may not behave very precisely if you use a very small value. Perhaps you can try a setting of 3 to 5 minutes; that would be a good enough workaround.

kevindixon commented 2 years ago

@rayluo I work with @ineesalmeida - the reason we're using such a short max_age is that we're in a highly regulated environment and have a requirement to force re-authentication for certain privileged actions by a user. As you have seen in other posts, we cannot actually always force re-authentication (see other posts WRT setting max_age=0). The best we can do with the Microsoft Identity Platform is set a max_age to some small value to mostly force re-authentication. We can justify this when the times are short (and thus the risk is low).... 3 to 5 minutes is unlikely to satisfy our regulatory powers.

We would love to understand how other organisations in such regulated environments actually implement similar requirements on the Microsoft platform - there appears to be little/nothing on the net about this.

rayluo commented 2 years ago

> @rayluo I work with @ineesalmeida

Oh, right, now I remember we met before. :-)

> we're in a highly regulated environment and have a requirement to force re-authentication for certain privileged actions by a user.

> The best we can do with the Microsoft Identity Platform is set a max_age to some small value to mostly force re-authentication. We can justify this when the times are short (and thus the risk is low).... 3 to 5 minutes is unlikely to satisfy our regulatory powers.

> We would love to understand how other organisations in such regulated environments actually implement similar requirements on the Microsoft platform - there appears to be little/nothing on the net about this.

Would you mind telling us what your regulated environment is? Could it be, for example, the financial or biomedical sector in the UK? Having this kind of extra information may help our planning.

> Additional context ... This flow is also fine if after I enter my password and take too long to enter my MFA code, I click on 'cancel' and then try again, and it will ask for my password + MFA again.

As an immediate workaround, you could indeed catch the RuntimeError, show a message in your UI to your users "please try to complete the entire sign-in within 1 minute", and then restart a new auth code flow. This way, at least the user won't see a scary error, and they should be able to succeed in their second attempt, because their MFA device is likely still in hand and they are warmed up. I know, this is still not a real solution, but hopefully practical enough to unblock you for now.
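
A rough sketch of that workaround might look like the following. The helper name `finish_or_restart` and its return convention are hypothetical, `app` is assumed to be an `msal.ConfidentialClientApplication`, and recognising the error by looking for "auth_time" in the message is an assumption made here because MSAL currently raises a plain RuntimeError for this case.

```python
def finish_or_restart(app, flow, auth_response, scopes, redirect_uri, max_age=60):
    """Try to redeem the auth code; on a stale auth_time, start a fresh flow.

    Returns ("ok", token_result) or ("restart", new_flow).
    """
    try:
        result = app.acquire_token_by_auth_code_flow(flow, auth_response)
    except RuntimeError as exc:
        # Assumption: the stale-authentication case is the RuntimeError whose
        # message mentions auth_time. Any other RuntimeError is re-raised.
        if "auth_time" not in str(exc):
            raise
        new_flow = app.initiate_auth_code_flow(
            scopes, redirect_uri=redirect_uri, max_age=max_age
        )
        return "restart", new_flow
    return "ok", result
```

On "restart", the app would store the new flow in the session, show a message such as "please try to complete the entire sign-in within 1 minute", and redirect the user to `new_flow["auth_uri"]`.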

will-bartlett commented 2 years ago

@kevindixon I work with @rayluo - I work on the web service. I think the short answer is that the public Microsoft Identity Platform does not support any reliable way for relying parties to be assured an authentication prompt occurred.

To meet this requirement for id_tokens, we would want to use a pattern similar to max_age, where the relying party sends a parameter (either prompt=login or max_age=0) indicating reauth is required, then the STS puts a claim in the token to demonstrate that this reauth has occurred, and the relying party checks that claim. For this scenario, I think I would recommend the STS support a new claim (e.g. {"credential_was_entered":1}). Using a boolean claim like this avoids the issues with small time windows. Time windows as small as 60 seconds are likely to cause problems in practice - e.g. some users will have slow internet connections, or their VPN will cut out at just the wrong time - causing spurious failures. So, I would strongly caution against trying to make the feature work in a hacky way by using small time windows.

So, instead of treating this issue like a bug report or question, let's treat it like a new feature request. The scenario (RP-verifiable re-auth requirement) is sensible enough - we've actually encountered and solved this problem internally before. If you look at the OneDrive Vault, this feature uses exactly this pattern: one reauth per "vault unlock" event, even if the user has authed in a different context just a few seconds before.
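
For illustration only, the relying-party side of the proposal above could be sketched like this. The claim `credential_was_entered` is the example name from the comment; it is a proposed claim, not one the service issues today, and the function name is hypothetical.

```python
def reauth_confirmed(id_token_claims, claim_name="credential_was_entered"):
    """Return True if the (hypothetical) re-auth claim is present and set.

    `id_token_claims` is the decoded ID token payload as a dict. The claim
    name is the example from the proposal, not an existing claim.
    """
    # Accepts 1 as in the example; Python also treats True == 1.
    return id_token_claims.get(claim_name) == 1


claims_with_reauth = {"sub": "user-1", "credential_was_entered": 1}
claims_without = {"sub": "user-1"}
print(reauth_confirmed(claims_with_reauth), reauth_confirmed(claims_without))
```

Unlike a max_age comparison, this check involves no clock arithmetic, which is the point of the proposal.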

kevindixon commented 2 years ago

> Would you mind telling us what your regulated environment is? Could it be, for example, the financial or biomedical sector in the UK? Having this kind of extra information may help our planning.

@rayluo the regulation in question is 21 CFR part 11 for the US FDA.

kevindixon commented 2 years ago

@will-bartlett I'm not entirely sure I understand the subtlety of what you are proposing WRT integration with the STS, but what you are proposing sounds fine to me with one exception - why introduce a new claim when this is already all covered by OIDC max_age behaviour? That is, when max_age is used, the issued token must include an auth_time claim. You can then leave it to the client to decide when this is fresh enough for them. [I agree that windows as short as 60 seconds are problematic BTW].

I'm not convinced there isn't a bug here (as well) given that the "last end-user authentication" datetime is actually "last end-user password entry" - i.e. ignores MFA which is actively misleading....

rayluo commented 10 months ago

Since those underlying subtleties might not be addressed anytime soon, perhaps we will have to settle on a workaround on the client side. Re-reading the "expected behavior" from @ineesalmeida's initial message, I think the "I am asked to re-enter my password and MFA and I can login" goal can be achieved by catching that RuntimeError exception in your app, and then presenting helpful guidance to the user such as "please start a new sign-in and try to finish it fast".

MSAL Python may be changed to throw a more specific exception such as ExpiredUserAuthenticationError. Will that help, @kevindixon ?
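
As a sketch of what such an exception could look like (this is not an existing MSAL class; the name comes from the suggestion above, and the constructor and the `check_auth_time` helper are assumptions), subclassing RuntimeError would keep existing `except RuntimeError` handlers working:

```python
class ExpiredUserAuthenticationError(RuntimeError):
    """Hypothetical exception for a stale auth_time; not part of MSAL today."""

    def __init__(self, auth_time, max_age, now):
        super().__init__(
            "auth_time ({}) is older than max_age ({}) allows at now ({})".format(
                auth_time, max_age, now
            )
        )
        self.auth_time = auth_time
        self.max_age = max_age
        self.now = now


def check_auth_time(auth_time, max_age, now):
    """Illustrative check that raises the specific exception when too old."""
    if now > auth_time + max_age:
        raise ExpiredUserAuthenticationError(auth_time, max_age, now)


try:
    check_auth_time(1642085298, 60, 1642085502)
    outcome = "signed in"
except ExpiredUserAuthenticationError:
    outcome = "restart sign-in"
print(outcome)
```

An app could then catch this one case precisely instead of inspecting the message of a generic RuntimeError.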

kevindixon commented 10 months ago

@rayluo a more specific error would certainly help - at least then we could reliably catch this situation and either re-try or present the user with some choices.

(TBH we've long since moved on with other solutions outside of the Microsoft platform for these requirements)

bgavrilMS commented 10 months ago

We don't recommend using max_age to handle this scenario.

There is a conditional access policy that handles this scenario by requiring re-auth all the time - https://learn.microsoft.com/en-us/azure/active-directory/conditional-access/howto-conditional-access-session-lifetime#require-reauthentication-every-time

kevindixon commented 7 months ago

@bgavrilMS Not sure I can see how conditional access can support this scenario, at least looking at the documentation you link? i.e. requiring re-authentication for specific application or API level actions?

bgavrilMS commented 7 months ago

I think this problem can be solved using CA Context. The best sample that showcases this is with MSAL Java, but at a high level the scenario is:

you define a CA rule that requires re-authentication every X hours. I believe 1h is the minimum. You may combine this with requiring MFA or other CA.

instead of restricting the CA rule to some apps or to some users, you use a CA context, e.g. "c1"
programmatically, when the user performs a critical action (e.g. access some critical page), you check that the ID token has the "c1" claim.
if the id token doesn't have the claim, you challenge the user (ask them to reauth + add a claims challenge, which is a json string)

This is the standard pattern for ensuring one API requires MFA and one does not, but I think it should work with "sign-in frequency" as well.
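
The check-and-challenge steps above could be sketched roughly as follows. This is a sketch under assumptions: it assumes the auth context id shows up in an `acrs` claim of the ID token and that the claims challenge takes the shape shown, following Microsoft's published auth context guidance; both function names are hypothetical and the exact claim and challenge format should be verified against the current documentation.

```python
import json


def has_auth_context(id_token_claims, context_id="c1"):
    """Return True if the decoded ID token carries the given CA auth context.

    Assumption: the context ids are listed in an "acrs" claim.
    """
    acrs = id_token_claims.get("acrs", [])
    if isinstance(acrs, str):  # tolerate a single string value
        acrs = [acrs]
    return context_id in acrs


def build_claims_challenge(context_id="c1"):
    """Build the JSON string asking for the auth context (assumed shape)."""
    return json.dumps(
        {"id_token": {"acrs": {"essential": True, "value": context_id}}}
    )


claims = {"sub": "user-1"}  # decoded ID token without the "c1" context
challenge = None
if not has_auth_context(claims, "c1"):
    challenge = build_claims_challenge("c1")
print(challenge)
```

With MSAL Python, the resulting string would be passed as the `claims_challenge` argument when starting a new auth code flow for the critical action.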

Btw @kevindixon - what is the scenario here? I'm curious as to why you want to enforce a strict reauthentication time. Note that client-only solutions are unlikely to work because the browser has SSO cookies which will cause the user to re-auth without typing password etc. Afaik, only the "Reauthentication" CA is reliable.

kevindixon commented 7 months ago

@bgavrilMS this is compliance with regulation (21 CFR 11) that requires explicit re-authentication when sensitive operations (such as altering auditable data records) are carried out. The "explicit" is important for non-repudiation - given it is possible, for instance, for someone at my laptop to carry out an operation and impersonate me whilst the SSO cookie (or re-auth window) is valid (in the absence of other controls such as locking your laptop!).

I can't see the solution you mention fulfilling this requirement, given a 1-hour re-auth window, alas.

We've long since moved on to other solutions BTW, when it became obvious AAD didn't seem to be able to fulfil this requirement.

rayluo commented 7 months ago

I understood that the customer has already moved on to other solutions (sad to see you go, Kevin). But perhaps MSAL (Python at least) can implement a workaround as I mentioned in Sep in the near future, so that we will technically be able to close this issue. Do you agree, @bgavrilMS ?

bgavrilMS commented 7 months ago

@rayluo - will start a conversation with the server side folks. Marking this as external for now. It isn't strictly speaking a client side issue. Let's come back to it after the holidays.