Azure-Samples / ms-identity-python-webapp

A Python web application, secured using the Microsoft identity platform, that calls Microsoft Graph
MIT License

State mismatch in application without database #125

Closed guushoekman closed 5 months ago

guushoekman commented 5 months ago

I'm getting a state mismatch error when attempting to sign in:

2024-01-19T11:59:28.146571672Z Encountered state mismatch: KbRmQjtvXDGUePkO vs pWhAUmQvHzysfGND
2024-01-19T11:59:28.146609776Z Traceback (most recent call last):
2024-01-19T11:59:28.146615977Z   File "/tmp/8dc18e59ae9b15e/antenv/lib/python3.10/site-packages/identity/web.py", line 151, in complete_log_in
2024-01-19T11:59:28.146620977Z     ).acquire_token_by_auth_code_flow(auth_flow, auth_response)
2024-01-19T11:59:28.146625678Z   File "/tmp/8dc18e59ae9b15e/antenv/lib/python3.10/site-packages/msal/application.py", line 949, in acquire_token_by_auth_code_flow
2024-01-19T11:59:28.146630478Z     response = _clean_up(self.client.obtain_token_by_auth_code_flow(
2024-01-19T11:59:28.146635379Z   File "/tmp/8dc18e59ae9b15e/antenv/lib/python3.10/site-packages/msal/application.py", line 153, in obtain_token_by_auth_code_flow
2024-01-19T11:59:28.146640079Z     return super(_ClientWithCcsRoutingInfo, self).obtain_token_by_auth_code_flow(
2024-01-19T11:59:28.146644480Z   File "/tmp/8dc18e59ae9b15e/antenv/lib/python3.10/site-packages/msal/oauth2cli/oidc.py", line 205, in obtain_token_by_auth_code_flow
2024-01-19T11:59:28.146649180Z     result = super(Client, self).obtain_token_by_auth_code_flow(
2024-01-19T11:59:28.146653681Z   File "/tmp/8dc18e59ae9b15e/antenv/lib/python3.10/site-packages/msal/oauth2cli/oauth2.py", line 541, in obtain_token_by_auth_code_flow
2024-01-19T11:59:28.146658281Z     raise ValueError("state mismatch: {} vs {}".format(
2024-01-19T11:59:28.146662682Z ValueError: state mismatch: KbRmQjtvXDGUePkO vs pWhAUmQvHzysfGND
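
For readers following the traceback, here is a minimal sketch of the kind of check that raises this error. It is not the MSAL source: the function names `begin_login` and `complete_login` and the `"auth_flow"` session key are hypothetical, and only the error message format is taken from the traceback above. The idea is that a random state value is remembered in the session when sign-in starts and must match the value echoed back on the redirect.

```python
import secrets


def begin_login(session):
    """Start a sign-in: remember a random state value in the session."""
    state = secrets.token_urlsafe(16)
    session["auth_flow"] = {"state": state}  # hypothetical session key
    return state  # sent to the identity provider, echoed back on redirect


def complete_login(session, auth_response):
    """Finish a sign-in: the echoed state must equal the remembered one."""
    expected = session.get("auth_flow", {}).get("state")
    received = auth_response.get("state")
    if expected != received:
        raise ValueError("state mismatch: {} vs {}".format(expected, received))
    return {"signed_in": True}
```

If the redirect is handled with a session that never saw `begin_login` (or saw a different one), `complete_login` raises the same kind of `ValueError` shown in the log.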

I can get to the sign-in page and go through the sign-in process, but I then get the error and am redirected back to /login. I have the example deployed on Azure exactly as is except for changing app_config.py. In that file I saw:

# Tells the Flask-session extension to store sessions in the filesystem
SESSION_TYPE = "filesystem"
# Using the file system will not work in most production systems,
# it's better to use a database-backed session store instead.

The thing is that I have a very simple application in which I want to let a user sign in and, once they have, run a process for which I don't need a database.

Is there a recommended way to implement this if no database is used?

rayluo commented 5 months ago

I have the example deployed on Azure exactly as is except for changing app_config.py. In that file I saw:

# Tells the Flask-session extension to store sessions in the filesystem
SESSION_TYPE = "filesystem"
# Using the file system will not work in most production systems,
# it's better to use a database-backed session store instead.

It is unclear whether you changed that app_config.py file and, if so, what you changed.

The thing is that I have a very simple application in which I want to let a user sign in and, once they have, run a process for which I don't need a database.

Is there a recommended way to implement this if no database is used?

The mention of a database in app_config.py was a suggestion. If you keep the default value "filesystem", it will still work. We have tested it that way.

I'm getting a state mismatch error when attempting to sign in

Is it possible that you also modified this sample in some other way? Also, what is your configuration (you do not need to share your app's client_secret)? And what is the output of pip list in your environment?

guushoekman commented 5 months ago

Thank you for the quick and helpful reply @rayluo! My app_config.py looks like this:

import os

AUTHORITY = os.getenv("AUTHORITY")
CLIENT_ID = os.getenv("CLIENT_ID")
CLIENT_SECRET = os.getenv("CLIENT_SECRET")
REDIRECT_PATH = "/auth/redirect"
SCOPE = ["User.Read"]
SESSION_TYPE = "filesystem"

My environment variables are these (I changed some values to zeros to not share them publicly):

[
  {
    "name": "AUTHORITY",
    "value": "https://login.microsoftonline.com/00000000-0000-0000-0000-000000000000",
    "slotSetting": false
  },
  {
    "name": "CLIENT_ID",
    "value": "00000000-0000-0000-0000-000000000000",
    "slotSetting": false
  },
  {
    "name": "CLIENT_SECRET",
    "value": "0000000000000000000000000000000000000000",
    "slotSetting": false
  },
  {
    "name": "SCM_DO_BUILD_DURING_DEPLOYMENT",
    "value": "1",
    "slotSetting": false
  }
]

You mentioned pip list, which made me realise that I have been testing out different things so perhaps I've installed some packages which cause conflicts. Here is the output of pip list:

Package              Version
-------------------- ----------
appsvc-code-profiler 1.0.0
blinker              1.7.0
cachelib             0.10.2
certifi              2023.11.17
cffi                 1.16.0
charset-normalizer   3.3.2
click                8.1.7
cryptography         41.0.7
debugpy              1.6.7
distlib              0.3.7
filelock             3.12.2
Flask                3.0.1
Flask-Session        0.5.0
gunicorn             20.1.0
identity             0.3.2
idna                 3.6
itsdangerous         2.1.2
Jinja2               3.1.3
markdown-it-py       3.0.0
MarkupSafe           2.1.3
mdurl                0.1.2
msal                 1.26.0
objprint             0.2.2
orjson               3.8.1
pip                  23.0.1
platformdirs         3.10.0
psutil               5.9.5
pycparser            2.21
Pygments             2.16.1
PyJWT                2.8.0
python-dotenv        0.21.1
requests             2.31.0
rich                 13.5.2
setuptools           65.5.0
subprocess32         3.5.4
urllib3              2.1.0
virtualenv           20.24.2
vizplugins           0.1.3
viztracer            0.15.6
Werkzeug             3.0.1
wheel                0.40.0

What I have noticed is that after I go through the sign-in process I'm redirected to /login. However, after signing in, when I change the URL by removing the /login part (so going to the index page), I do see my username. So it does recognise that I am signed in. Only once though, because when I refresh the page I no longer see my name and instead get redirected to /login.

Something I'm confused about though. You write:

The mention of a database in app_config.py was a suggestion. If you keep the default value "filesystem", it will still work.

But the app_config.py file mentions:

Using the file system will not work in most production systems

Saying it will not work on most production systems reads to me more like a requirement than a suggestion, but just to confirm: using the file system for sessions should work when deployed as an Azure web app?

rayluo commented 5 months ago

Something I'm confused about though. You write:

The mention of a database in app_config.py was a suggestion. If you keep the default value "filesystem", it will still work.

But the app_config.py file mentions:

Using the file system will not work in most production systems

Saying it will not work on most production systems reads to me more like a requirement than a suggestion, but just to confirm: using the file system for sessions should work when deployed as an Azure web app?

A file system does not scale well and a database does, which is why using a database in production is a suggestion. If you happen to build a popular website with a busy workload, the file system may not keep up, and your website may feel unresponsive. At that point, using a database becomes a requirement, one imposed by your workload. Until then, the file system is functionally adequate. We tested this sample app with its default file-system setting, and it should work.

At the very least, the choice of file system versus database should not result in different behavior if you are deploying to only one server.

Which brings me to my next question. How many web servers are you currently using in your setup? I ask this because,

My environment variables are these (I changed some values to zeros to not share them publicly):

[
  {
    "name": "AUTHORITY",
    "value": "https://login.microsoftonline.com/00000000-0000-0000-0000-000000000000",
    "slotSetting": false
  },

the shape of your configuration is different from that of a .env file, so you are probably using some sort of cluster solution that provisions multiple web servers. Since each of those web servers stores sessions on its own file system, they do not share the same session pool. It is therefore likely that your first request successfully signed in on machine A, but then your subsequent request randomly landed on machine B, whose file system does not contain your session from machine A. In such a case, yep, you are required to use a database shared among those machines to serve as unified session storage.
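
The scenario described above can be sketched as a toy simulation. This is an illustration under stated assumptions, not the sample's code: the `Instance` class, the `sign_in` helper, and the `"cookie-123"` session ID are all hypothetical, and a plain dict stands in for each server's session storage (a per-machine file system when the dicts are separate, a shared database when they are the same object).

```python
import secrets


class Instance:
    """One web server; `store` maps session IDs to session data."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, session_id):
        # Remember a random state for this session, as a sign-in would.
        state = secrets.token_urlsafe(16)
        self.store[session_id] = {"state": state}
        return state

    def redirect(self, session_id, returned_state):
        # The redirect only succeeds if this server can see the saved state.
        saved = self.store.get(session_id, {}).get("state")
        return saved == returned_state


def sign_in(first, second, session_id="cookie-123"):
    """Start sign-in on `first`, then handle the redirect on `second`."""
    state = first.login(session_id)
    return second.redirect(session_id, state)


# Separate stores: machine B cannot see the session created on machine A.
a, b = Instance("A", {}), Instance("B", {})
separate_ok = sign_in(a, b)

# One shared store (standing in for a database): either machine can finish.
shared = {}
c, d = Instance("C", shared), Instance("D", shared)
shared_ok = sign_in(c, d)
```

In this model `separate_ok` is false and `shared_ok` is true, which mirrors why a single instance, or a shared session store, makes the problem go away.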

Thank you for the quick and helpful reply @rayluo!

By the way, you think my previous response within 8 hours was quick? How about this new response within 48 minutes, early on a Saturday morning in my timezone? :-) That said, now I need to catch some sleep. Yawn.

guushoekman commented 5 months ago

How many web servers are you currently using in your setup?

It is therefore likely that your first request successfully signed in on machine A, but then your subsequent request randomly landed on machine B, whose file system does not contain your session from machine A. In such a case, yep, you are required to use a database shared among those machines to serve as unified session storage.

This was an excellent question and observation, as this is exactly what was happening. I reduced my service plan to one instance and then had no issues. Unfortunately I cannot do this on production, so I will add a database for unified session storage.

By the way, you think my previous response within 8 hours was quick? How about this new response within 48 minutes, early on a Saturday morning in my timezone?

8 hours is quick, 48 minutes is too quick, especially for a weekend! :sweat_smile: No but seriously, thank you very much for the quick but most of all useful responses! I really appreciated it!

guushoekman commented 5 months ago

Small update: in the end I will keep using the filesystem. This seems to work well in combination with using the affinity cookie to ensure users are sent to the same instance for subsequent requests.

ARR cleverly identifies the user by assigning them a special cookie (known as an affinity cookie), which allows the service to choose the right instance the user was using to serve subsequent requests made by that user. This means, a client establishes a session with an instance and it will keep talking to the same instance until his session has expired.

https://azure.github.io/AppService/2016/05/16/Disable-Session-affinity-cookie-(ARR-cookie)-for-Azure-web-apps.html
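
To make the affinity idea concrete, here is a minimal sketch of cookie-based sticky routing. It is a simplified model and not how ARR is implemented: the `LoadBalancer` class and the `"affinity"` cookie name are hypothetical, and round-robin is just an assumed policy for requests that arrive without the cookie.

```python
import itertools


class LoadBalancer:
    """Toy load balancer that pins a client to one instance via a cookie."""

    def __init__(self, instance_names):
        self._names = list(instance_names)
        self._round_robin = itertools.cycle(self._names)

    def route(self, cookies):
        """Return (instance_name, cookies_to_send_back) for one request."""
        pinned = cookies.get("affinity")  # hypothetical cookie name
        if pinned in self._names:
            # Returning client: keep talking to the same instance.
            return pinned, cookies
        # New client: pick an instance and remember it in the cookie.
        chosen = next(self._round_robin)
        return chosen, {**cookies, "affinity": chosen}


lb = LoadBalancer(["A", "B"])
instance, cookies = lb.route({})         # first request, no cookie yet
same_instance, _ = lb.route(cookies)     # later request carries the cookie
```

Because every later request carries the cookie, it reaches the instance whose file system holds the session, so file-system sessions keep working as long as that instance stays available.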

rayluo commented 5 months ago

Small update: in the end I will keep using the filesystem. This seems to work well in combination with using the affinity cookie to ensure users are sent to the same instance for subsequent requests.

ARR cleverly identifies the user by assigning them a special cookie (known as an affinity cookie), which allows the service to choose the right instance the user was using to serve subsequent requests made by that user. This means, a client establishes a session with an instance and it will keep talking to the same instance until his session has expired.

https://azure.github.io/AppService/2016/05/16/Disable-Session-affinity-cookie-(ARR-cookie)-for-Azure-web-apps.html

Thanks for sharing! Yep, the affinity cookie is another excellent solution. I'll add this learning to this sample's configuration file soon.

P.S.: This response came within 37 minutes. :-)