kurian-dm closed this issue 1 year ago
Hi,
Any updates on this issue?
Thanks
Can you create a runnable example? I'm not familiar with airflow.
Hi,
You will need an AWS account. In it, create a VPC, install an EKS cluster, and install Keycloak, which runs in an ECS container. Airflow has to be installed on Kubernetes using Helm charts.
If this is not possible for you, can we have a screen sharing session? I can also share additional logs. I have enabled additional FAB logging for Airflow. It shows that authorization is happening and that Authlib then detects a state mismatch.
Thanks
Hi, I've seen an issue like this before; it was caused by the session not being set properly. Can you check your session, which is based on a secure cookie? Check whether the server can read the session value and whether the browser holds the session data.
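One way to run that check is to look at the session on the callback request and see whether a state entry is there at all. The helper below is only a sketch: it assumes the key format f'_state_{name}_{state}' quoted later in this thread, works on any dict-like session, and the function names and the client name "keycloak" are made up for illustration.

```python
from typing import Any, List, Mapping

STATE_PREFIX = "_state_"


def pending_state_keys(session: Mapping[str, Any], client_name: str) -> List[str]:
    """Return the state values currently stored in the session for one client."""
    prefix = f"{STATE_PREFIX}{client_name}_"
    return [key[len(prefix):] for key in session.keys() if key.startswith(prefix)]


def diagnose_callback(session: Mapping[str, Any], client_name: str, returned_state: str) -> str:
    """Describe why a state lookup on the callback request would pass or fail."""
    pending = pending_state_keys(session, client_name)
    if not pending:
        # Nothing was stored, or the browser did not send the session cookie back.
        return "no state in session: the session cookie was probably not sent or not persisted"
    if returned_state not in pending:
        return "state in session does not match the state returned by the provider"
    return "state found in session"
```

Logging diagnose_callback(session, "keycloak", request_state) at the top of the callback view should tell you whether the session is empty (a cookie or session storage problem) or holds a different state (for example, a stale login attempt).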
I am reasonably sure this is a bug.
I am using Django as a client against a server that's not Google, Twitter or Facebook, and I'm getting CSRF session mismatch errors when calling authorize_access_token().
In my case this appears to be coming from framework.get_state_data(), as it is looking for a key in the form f'_state_{self.name}_{state}' when request.session doesn't have a key in that form.
My guess is that when using this library with one of a handful of known OAuth providers, the key for the CSRF token in request.session is in the form f'_state_{self.name}_{state}', and so things might be able to work.
But in a Django context, when using Authlib as a client, there is a bug.
It is assumed the token is in request.session and is referenced by a key in the form f'_state_{self.name}_{state}'.
If the CSRF token is in my request.session at all, it's going to be keyed as 'state'.
We can see in the code snippets below:
authorize_access_token sets the key in params to 'state', then calls get_state_data, passing request.session and the value of the session token.
get_state_data tries to recover a session token value using a key in the form f'_state_{self.name}_{state}'. In my case it will always return None.
Then authorize_access_token calls _format_state_params(state_data, params), where state_data is None, and our MismatchingStateError is raised.
Apps.py
class DjangoOAuth2App(DjangoAppMixin, OAuth2Mixin, OpenIDMixin, BaseApp):
    client_cls = OAuth2Session

    def authorize_access_token(self, request, **kwargs):
        """Fetch access token in one step.

        :param request: HTTP request instance from Django view.
        :return: A token dict.
        """
        if request.method == 'GET':
            error = request.GET.get('error')
            if error:
                description = request.GET.get('error_description')
                raise OAuthError(error=error, description=description)
            params = {
                'code': request.GET.get('code'),
                'state': request.GET.get('state'),
            }
        else:
            params = {
                'code': request.POST.get('code'),
                'state': request.POST.get('state'),
            }

        state_data = self.framework.get_state_data(request.session, params.get('state'))
        self.framework.clear_state_data(request.session, params.get('state'))
        params = self._format_state_params(state_data, params)
        token = self.fetch_access_token(**params, **kwargs)

        if 'id_token' in token and 'nonce' in state_data:
            userinfo = self.parse_id_token(token, nonce=state_data['nonce'])
            token['userinfo'] = userinfo
        return token
framework_integration.py
def get_state_data(self, session, state):
    key = f'_state_{self.name}_{state}'
    if self.cache:
        value = self._get_cache_data(key)
    else:
        value = session.get(key)
    if value:
        return value.get('data')
    return None
sync_app.py
@staticmethod
def _format_state_params(state_data, params):
    if state_data is None:
        raise MismatchingStateError()

    code_verifier = state_data.get('code_verifier')
    if code_verifier:
        params['code_verifier'] = code_verifier

    redirect_uri = state_data.get('redirect_uri')
    if redirect_uri:
        params['redirect_uri'] = redirect_uri
    return params
I have not put much time into thinking how to fix this, but the OAuth2 client documentation I've been reading suggests that the CSRF token is called 'state', so I'm not entirely sure why there's a munged key in the mix here at all.
I might be able to supply Django code if you're interested in replicating the error. But this bug has taken up a significant amount of time to characterise and find, so I am now in a time crunch to get things working.
For later reference, I used the instructions here to trigger the above scenario.
@bradbase here is the demo for django: https://github.com/authlib/demo-oauth-client/tree/master/django-google-login
It works well
@lepture Thank you.
Your example looks like it would work very well, but it's optimised for logging in against Google and I need to auth against Xero.
Xero has particular needs for its header and, as mentioned above, calls the state parameter "state". I have not seen a way to configure Authlib finely enough to succeed.
Cheers
@bradbase state is added automatically. It is part of the OAuth 2.0 logic.
@kurian-dm please make sure your session works. Same as https://github.com/lepture/authlib/issues/518
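To illustrate why a working session matters here, below is a simplified model of the round trip, not Authlib's actual code. It assumes the state is generated and written to the session under f'_state_{name}_{state}' when the user is redirected to the provider, and read back with the same key on the callback (the lookup mirrors the get_state_data snippet quoted above). The names save_state, load_state and MismatchingState, and the stored "exp" field, are stand-ins for illustration.

```python
import secrets
import time


class MismatchingState(Exception):
    """Stand-in for Authlib's MismatchingStateError."""


def save_state(session: dict, name: str, data: dict, expires_in: int = 3600) -> str:
    """Model of the redirect step: create a state value and remember it in the session."""
    state = secrets.token_urlsafe(16)
    session[f"_state_{name}_{state}"] = {"data": data, "exp": time.time() + expires_in}
    return state  # this value travels to the provider as the "state" query parameter


def load_state(session: dict, name: str, state: str) -> dict:
    """Model of the callback step: look up the data stored for the returned state."""
    value = session.get(f"_state_{name}_{state}")
    if value is None:
        # Same outcome as state_data being None in _format_state_params.
        raise MismatchingState()
    return value["data"]


def demo() -> None:
    session = {}
    state = save_state(session, "xero", {"redirect_uri": "https://example.com/callback"})
    print(load_state(session, "xero", state))  # same session on both requests: lookup works
    try:
        load_state({}, "xero", state)  # session lost between redirect and callback
    except MismatchingState:
        print("state mismatch: the callback request did not see the original session")
```

In this model the provider only echoes the "state" value back; the munged key lives on the client side. If the callback arrives with a fresh or empty session (cookie not sent, different domain, session backend not persisting), the lookup fails even though the provider returned the right state.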
Did you ever get this resolved?
Describe the bug
It happens when using authlib to configure Keycloak for Airflow. Everything works perfectly up until redirecting back from Keycloak to Airflow.
Error Stacks
To Reproduce
A minimal example to reproduce the behavior: This is my code: import os import json import logging
Expected behavior
Airflow redirects the user to the Keycloak authentication site as expected. After authenticating and being redirected back to Airflow, the error "CSRF Warning! State not equal in request and response" occurs.
Environment:
Airflow runs on a Kubernetes cluster and Keycloak runs in an ECS Fargate container within the same VPC in AWS.
Additional context
Tried on different browsers and in incognito mode, but it still does not work.