dpgaspar / Flask-AppBuilder

Simple and rapid application development framework, built on top of Flask. Includes detailed security, auto CRUD generation for your models, Google charts and much more. Demo (login with guest/welcome) - http://flaskappbuilder.pythonanywhere.com/
BSD 3-Clause "New" or "Revised" License

Can't implement/authenticate Custom Provider for Airflow using FAB #2040

Open h00jraq opened 1 year ago

h00jraq commented 1 year ago

Hello, I'm quite new to FAB and to authentication, so please forgive me if I've made a stupid mistake here.

Environment

Apache Airflow 2.5.0, deployed via Helm chart on an AKS cluster

Flask-AppBuilder version: 4.3.0

I'm trying to authorize using a custom provider, similar to what was done here: https://github.com/dpgaspar/Flask-AppBuilder/issues/608#issuecomment-845264649. I tried to follow that approach, but without success.

I have 3 problems for now:

  1. I can't log in to Airflow (logs are pasted at the bottom). The provider is initialized with the correct provider name, and I can see that a valid request is sent to the identity server, but it then fails with the message "Error authorizing OAuth access token: invalid_client: Client authentication failed". The client id/secret are 100% correct, but maybe they are not taken from the remote app? I could not print them in oauth_user_info; whatever I tried, nothing showed up in the webserver logs.

  2. Unfortunately, it looks like none of the methods (except __init__, if I add one) is called on the class pointed to by FAB_SECURITY_MANAGER_CLASS. I added a few print statements and they simply do not appear in the logs. I also tried get_oauth_user_info(), but the result was the same: no print output, and it ended with the same errors.

  3. I can see that the class is used, because all values from my custom provider are used except the redirect URI, which always points to url/oauth-authorized/provider-name, and I can't override that. I see that there is @expose("/oauth-authorized/<provider>") in the code, but I thought that my redirect_uri would be used.

#!/usr/bin/python
# -*- coding: utf-8 -*-
import os
from airflow.configuration import conf
from airflow.utils.log.logging_mixin import LoggingMixin
from flask_appbuilder.security.manager import AUTH_OAUTH
from airflow.www.security import AirflowSecurityManager

SQLALCHEMY_DATABASE_URI = conf.get('core', 'SQL_ALCHEMY_CONN')
basedir = os.path.abspath(os.path.dirname(__file__))
CSRF_ENABLED = True
AUTH_TYPE = AUTH_OAUTH
AUTH_USER_REGISTRATION_ROLE = 'Public'
AUTH_USER_REGISTRATION = True

class CustomSecurityManager(AirflowSecurityManager):

    def oauth_user_info(self, provider, response=None):
        if provider == 'pwc':
            # The remote app registered for this provider (built from OAUTH_PROVIDERS)
            remote_app = self.appbuilder.sm.oauth_remotes[provider]
            me = remote_app.get('info')
            data = me.json()
            self.log.debug('PWC response received : {0}'.format(provider))

            # Debug output; print must be called as a function on Python 3
            print('something')
            print(remote_app)

            # me = remote_app.get("userinfo_endpoint")
            # id_token = response.get("id_token")

            self.log.error(me)
            return {
                'username': data['username'],
                'email': data.get('email', ''),
                'first_name': data.get('first_name', ''),
                'last_name': data.get('last_name', ''),
            }
        else:
            return {}

    def sync_roles(self):

        # Custom roles are given airflow website permission in AirflowSecurityManager.sync_roles()

        self.add_role('Guest')
        return super().sync_roles()

# a mapping from the values of `userinfo["role_keys"]` to a list of FAB roles

AUTH_ROLES_MAPPING = {'airflow_nonprod_admin': ['Admin'],
                      'airflow_viewer': ['Public']}

AUTH_ROLES_SYNC_AT_LOGIN = True

OAUTH_PROVIDERS = [{
    'name': 'my-provider',
    'token_key': 'access_token',
    'icon': 'fa-globe',
    'remote_app': {
        'client_id': 'urn:insightshub-test.pwcinternal.co.uk',
        'client_secret': 'xxxxxx',
        'scope': 'openid email profile',
        'issuer': 'https://example.com.com:443/openam/oauth2',
        'token_endpoint': 'https://example.com.com/openam/oauth2/access_token',
        'userinfo_endpoint': 'https://example.com/openam/oauth2/userinfo',
        'redirect_uri': 'https://example.com/oauth/callback',
        'access_token_url': 'https://example.com/openam/oauth2/access_token',
        'access_token_params': {'scope': 'openid email profile'},
        'authorize_url': 'https://lexample.com/openam/oauth2/authorize',
        'authorize_params': {'scope': 'openid email profile'},
        'jwks_uri': 'https://example.com/openam/oauth2/connect/jwk_uri',
    },
}]

FAB_SECURITY_MANAGER_CLASS = 'webserver_config.CustomSecurityManager'

I'm redirected to my provider's login page, I log in with SSO and that works, but then I'm taken back to the Airflow login page with the message "The request to sign in was denied". The logs below are taken from the Airflow webserver. The client id/secret are 100% correct, but maybe they are not taken from remote_app but from some other place, and are simply ignored, the same as redirect_uri?

[2023-05-12T21:26:40.079+0000] {views.py:651} DEBUG - Authorized init
[2023-05-12T21:26:40.079+0000] {views.py:651} DEBUG - Authorized init
[2023-05-12T21:26:40.083+0000] {connectionpool.py:1007} DEBUG - Starting new HTTPS connection (1): example.com:443
[2023-05-12T21:26:40.280+0000] {connectionpool.py:465} DEBUG - https://examplecom:443 "POST /openam/oauth2/access_token HTTP/1.1" 401 77
[2023-05-12T21:26:40.284+0000] {views.py:659} ERROR - Error authorizing OAuth access token: invalid_client: Client authentication failed
[2023-05-12T21:26:40.284+0000] {views.py:659} ERROR - Error authorizing OAuth access token: invalid_client: Client authentication failed

Do I need to switch to get_oauth_user_info and use the decorator, as mentioned in the documentation? Can I somehow debug this locally? I've already spent a lot of time trying to get this to work and I feel lost.

expore commented 1 year ago

Hello, I have received your email and will reply as soon as possible after reading it.

dpgaspar commented 1 year ago

Hi,

It will be hard for me to debug it like this. It's also not clear to me what kind of OAuth provider you're using, and "invalid_client: Client authentication failed" can mean a bunch of things.
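
One way to look at this locally, as a rough sketch: RFC 6749 (section 2.3.1) describes two common ways for a client to send its credentials to the token endpoint, HTTP Basic in the Authorization header or form fields in the POST body, and a provider may reject one of them with invalid_client. The helpers below are hypothetical, use only the standard library, and do not call FAB or the identity server; they just show what each style puts on the wire. They also show that a client id containing a colon (such as a urn:... value) is split in the wrong place under Basic auth unless it is form-encoded first. Whether either of these applies to this provider is an assumption to check, not a diagnosis.

```python
import base64
from urllib.parse import quote_plus, urlencode


def basic_auth_header(client_id, client_secret, form_encode=False):
    """Build the Authorization header used by the HTTP Basic style."""
    if form_encode:
        # RFC 6749 section 2.3.1 asks for form-encoding of id and secret first.
        client_id = quote_plus(client_id)
        client_secret = quote_plus(client_secret)
    raw = f"{client_id}:{client_secret}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def split_basic_credentials(header):
    """Decode a Basic header by splitting on the first colon."""
    raw = base64.b64decode(header[len("Basic "):]).decode("utf-8")
    user, _, password = raw.partition(":")
    return user, password


def post_body(client_id, client_secret, code, redirect_uri):
    """Build the form body used when credentials travel in the POST body."""
    return urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,
    })


# Placeholder credentials, invented for illustration only.
plain = split_basic_credentials(basic_auth_header("urn:app", "s3cret"))
encoded = split_basic_credentials(
    basic_auth_header("urn:app", "s3cret", form_encode=True)
)
body = post_body("urn:app", "s3cret", "abc", "https://example.com/cb")
```

Comparing these against what the provider's documentation expects can narrow things down. If the provider turns out to want the credentials in the POST body, Authlib (which FAB uses for OAuth) documents a token_endpoint_auth_method option such as client_secret_post that can go into client_kwargs; treat that as something to verify rather than a confirmed fix.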

fedepad commented 1 year ago

Couple of things/questions to make sure:

To give you an example of what I did: please keep in mind that I use plain F.A.B., so I don't know which method name to override when you inherit from AirflowSecurityManager. My example uses plain F.A.B.

The config that was necessary for me (again, a different provider than yours, so adapt accordingly!) was the following (in JSON; translate to Python config accordingly):

    "OAUTH_PROVIDERS": [

        {
            "name": "provider",
            "icon": "fa-something",
            "token_key": "access_token",
            "remote_app": {
                "client_id": "myclient",
                "client_secret": "myclientsecret",
                "api_base_url": "http://some_url/protocol/openid-connect/",
                "client_kwargs": {
                    "scope": "email profile openid roles"
                },
                "jwks_uri":"http://some_url/protocol/openid-connect/certs",
                "access_token_url": "http://some_url/protocol/openid-connect/token",
                "authorize_url": "http://some_url/protocol/openid-connect/auth",
                "request_token_url": null
            }
        }

    ]

so, maybe yours would be along the following lines?

    "OAUTH_PROVIDERS": [

        {
            "name": "pwc",
            "icon": "fa-globe",
            "token_key": "access_token",
            "remote_app": {
                "client_id": "myclient",
                "client_secret": "myclientsecret",
                "api_base_url": "https://example.com/openam/oauth2/",
                "client_kwargs": {
                    "scope": "email profile openid roles"
                },
                "jwks_uri":"https://example.com/openam/oauth2/jwk_uri",  # double check
                "access_token_url": "https://example.com/openam/oauth2/access_token",
                "authorize_url": "https://example.com/openam/oauth2/authorize",
                "request_token_url": null
            }
        }

    ]
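
Translated to a Python config file, the suggestion above might look like the sketch below. It is not a verified Airflow setup: the environment variable names OAUTH_CLIENT_ID and OAUTH_CLIENT_SECRET and the AIRFLOW_BASE_URL host are placeholders made up for illustration, and the endpoint URLs are copied from the suggestion and still need checking against the provider.

```python
import os

# Placeholder values; real ones would come from the deployment (assumed env var names).
CLIENT_ID = os.environ.get("OAUTH_CLIENT_ID", "myclient")
CLIENT_SECRET = os.environ.get("OAUTH_CLIENT_SECRET", "myclientsecret")
PROVIDER_NAME = "pwc"  # must equal the name checked in the custom security manager

OAUTH_PROVIDERS = [
    {
        "name": PROVIDER_NAME,
        "icon": "fa-globe",
        "token_key": "access_token",
        "remote_app": {
            "client_id": CLIENT_ID,
            "client_secret": CLIENT_SECRET,
            "api_base_url": "https://example.com/openam/oauth2/",
            "client_kwargs": {"scope": "email profile openid roles"},
            "jwks_uri": "https://example.com/openam/oauth2/jwk_uri",  # double check
            "access_token_url": "https://example.com/openam/oauth2/access_token",
            "authorize_url": "https://example.com/openam/oauth2/authorize",
            "request_token_url": None,  # JSON null becomes Python None
        },
    }
]

# The callback reported in this issue has the form <base>/oauth-authorized/<name>,
# so that is the URL to register with the identity provider.
AIRFLOW_BASE_URL = "https://airflow.example.com"  # hypothetical host
EXPECTED_CALLBACK = f"{AIRFLOW_BASE_URL}/oauth-authorized/{PROVIDER_NAME}"
```

Note that the provider name in the config and the name tested inside the security manager have to agree, otherwise the custom branch is never taken.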

An example of my custom security manager (again, a different provider than yours, so extract the info accordingly!), currently running with plain F.A.B.:

from flask_appbuilder.security.sqla.manager import SecurityManager

class CustomRoleBasedSecurityManager(SecurityManager):
    def _get_oauth_user_info(self, provider, response=None):
        if provider in ["custom_provider_1", "custom_provider_2"]:
            me = self.appbuilder.sm.oauth_remotes[provider].get(
                "userinfo"
            )
            me.raise_for_status()
            data = me.json()            

            client_id = ""
            for oauth_provider in self.appbuilder.sm.oauth_providers:
                if provider == oauth_provider["name"]:
                    client_id = oauth_provider["remote_app"]["client_id"]
                    break
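            userinfo_return_dict = {
                "username": data.get("preferred_username", ""),
                "first_name": data.get("given_name", ""),
                "last_name": data.get("family_name", ""),
                "email": data.get("email", ""),
                # Fall back to the client-specific roles only when "roles" is absent,
                # without raising KeyError if "resource_access" is missing as well.
                "role_keys": data.get("roles")
                or data.get("resource_access", {}).get(client_id, {}).get("roles", []),
            }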
            return userinfo_return_dict
        else:
            return {}
    get_oauth_user_info = _get_oauth_user_info

Your case will differ, e.g. in how the roles are obtained and in which keys hold the user info. You also don't need the check against a list of providers, or the for loop that selects the right client for the provider, since you say your provider is just "pwc".
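
For a single provider, the mapping part could be reduced to something like the sketch below. The function name map_userinfo is hypothetical, and the claim names (preferred_username, given_name, family_name, roles) are taken from the example above, so they are only an assumption for a different provider. The sample payload is invented for illustration.

```python
def map_userinfo(data):
    """Map a userinfo JSON payload to the dict shape returned in the example above."""
    return {
        "username": data.get("preferred_username", ""),
        "first_name": data.get("given_name", ""),
        "last_name": data.get("family_name", ""),
        "email": data.get("email", ""),
        # A missing "roles" claim yields an empty list instead of raising KeyError.
        "role_keys": data.get("roles", []),
    }


# Sample payload, invented purely for illustration.
sample = {
    "preferred_username": "jdoe",
    "given_name": "Jane",
    "family_name": "Doe",
    "email": "jdoe@example.com",
    "roles": ["airflow_viewer"],
}
user = map_userinfo(sample)
```

Inside the security manager, the "pwc" branch would then fetch the userinfo response as in the example and return map_userinfo(me.json()). Keeping the mapping in a plain function also makes it easy to test without a running webserver.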