apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Airflow 2.1.0 Oauth for google Too Many Redirects b/c Google User does not have Role #16783

Closed mdcsaenz closed 3 years ago

mdcsaenz commented 3 years ago

This issue is similar to tickets 16587 and 14829; however, I am running a newer Airflow version and newer packages than the ones suggested there, and I still get the same outcome. When using Google auth in Airflow and attempting to sign in, we get ERR_TOO_MANY_REDIRECTS. I know what causes the symptom, but I am hoping to find a way to keep a role in place so the redirects stop.

What happened: When using Google auth in Airflow and attempting to sign in, we get ERR_TOO_MANY_REDIRECTS.

What you expected to happen: I expect to log in as my user and be assigned a default role of Viewer at the very least, or the roles from our mappings in the webserver_config.py file. Instead, the role is blank in the database.

We realized that we get stuck in the loop because the user exists in the users table in Airflow but without a role (it's literally empty). The browser therefore goes from /login to /home to /login to /home over and over again.

How to reproduce it:

I add the Admin role in the database for my user; the page that was stuck in the redirect loop then refreshes and lets me into the Airflow UI. However, when I sign out and sign in again, my user's role is erased and the redirect cycle starts again.

As you can see, the user has no role (this is the state after I attempt to log in):

id | username            | email              | first_name | last_name | roles
===+=====================+====================+============+===========+======
1  | admin               | admin@example.com  | admin      | admin     | Admin
2  | google_############ | msaenz@company.com | Cat        | Says      |

I run the command: airflow users add-role -r Admin -u google_#################

Then the page takes me to the UI and the table now looks like this:

id | username            | email              | first_name | last_name | roles
===+=====================+====================+============+===========+======
1  | admin               | admin@example.com  | admin      | admin     | Admin
2  | google_############ | msaenz@company.com | Cat        | Says      | Admin

How often does this problem occur? Once? Every time, etc.? This occurs every time.

Here is the webserver_config.py

    import os
    from flask_appbuilder.security.manager import AUTH_OAUTH

    AUTH_TYPE = AUTH_OAUTH
    AUTH_ROLE_ADMIN = "Admin"
    AUTH_USER_REGISTRATION = False
    AUTH_USER_REGISTRATION_ROLE = "Admin"
    OIDC_COOKIE_SECURE = False
    CSRF_ENABLED = False
    WTF_CSRF_ENABLED = True
    AUTH_ROLES_MAPPING = {"Engineering": ["Ops"], "Admins": ["Admin"]}
    AUTH_ROLES_SYNC_AT_LOGIN = True
    OAUTH_PROVIDERS = [
        {
            'name': 'google',
            'icon': 'fa-google',
            'token_key': 'access_token',
            'remote_app': {
                'client_id': '#####################.apps.googleusercontent.com',
                'client_secret': '######################',
                'api_base_url': 'https://www.googleapis.com/oauth2/v2/',
                'whitelist': ['@company.com'],  # optional
                'client_kwargs': {
                    'scope': 'email profile'
                },
                'request_token_url': None,
                'access_token_url': 'https://accounts.google.com/o/oauth2/token',
                'authorize_url': 'https://accounts.google.com/o/oauth2/auth',
            },
        }
    ]

Here is the pip freeze:

adal==1.2.7
alembic==1.6.2
amqp==2.6.1
anyio==3.2.1
apache-airflow==2.1.0
apache-airflow-providers-amazon==1.4.0
apache-airflow-providers-celery==1.0.1
apache-airflow-providers-cncf-kubernetes==1.2.0
apache-airflow-providers-docker==1.2.0
apache-airflow-providers-elasticsearch==1.0.4
apache-airflow-providers-ftp==1.1.0
apache-airflow-providers-google==3.0.0
apache-airflow-providers-grpc==1.1.0
apache-airflow-providers-hashicorp==1.0.2
apache-airflow-providers-http==1.1.1
apache-airflow-providers-imap==1.0.1
apache-airflow-providers-microsoft-azure==2.0.0
apache-airflow-providers-mysql==1.1.0
apache-airflow-providers-postgres==1.0.2
apache-airflow-providers-redis==1.0.1
apache-airflow-providers-sendgrid==1.0.2
apache-airflow-providers-sftp==1.2.0
apache-airflow-providers-slack==3.0.0
apache-airflow-providers-sqlite==1.0.2
apache-airflow-providers-ssh==1.3.0
apispec==3.3.2
appdirs==1.4.4
argcomplete==1.12.3
async-generator==1.10
attrs==20.3.0
azure-batch==10.0.0
azure-common==1.1.27
azure-core==1.13.0
azure-cosmos==3.2.0
azure-datalake-store==0.0.52
azure-identity==1.5.0
azure-keyvault==4.1.0
azure-keyvault-certificates==4.2.1
azure-keyvault-keys==4.3.1
azure-keyvault-secrets==4.2.0
azure-kusto-data==0.0.45
azure-mgmt-containerinstance==1.5.0
azure-mgmt-core==1.2.2
azure-mgmt-datafactory==1.1.0
azure-mgmt-datalake-nspkg==3.0.1
azure-mgmt-datalake-store==0.5.0
azure-mgmt-nspkg==3.0.2
azure-mgmt-resource==16.1.0
azure-nspkg==3.0.2
azure-storage-blob==12.8.1
azure-storage-common==2.1.0
azure-storage-file==2.1.0
Babel==2.9.1
bcrypt==3.2.0
billiard==3.6.4.0
blinker==1.4
boto3==1.17.71
botocore==1.20.71
cached-property==1.5.2
cachetools==4.2.2
cattrs==1.0.0
celery==4.4.7
certifi==2020.12.5
cffi==1.14.5
chardet==3.0.4
click==7.1.2
clickclick==20.10.2
cloudpickle==1.4.1
colorama==0.4.4
colorlog==5.0.1
commonmark==0.9.1
contextvars==2.4
croniter==1.0.13
cryptography==3.4.7
dask==2021.3.0
dataclasses==0.7
defusedxml==0.7.1
dill==0.3.1.1
distlib==0.3.1
distributed==2.19.0
dnspython==1.16.0
docker==3.7.3
docker-pycreds==0.4.0
docutils==0.17.1
elasticsearch==7.5.1
elasticsearch-dbapi==0.1.0
elasticsearch-dsl==7.3.0
email-validator==1.1.2
eventlet==0.31.0
filelock==3.0.12
Flask==1.1.2
Flask-AppBuilder==3.3.0
Flask-Babel==1.0.0
Flask-Caching==1.10.1
Flask-JWT-Extended==3.25.1
Flask-Login==0.4.1
Flask-OpenID==1.2.5
Flask-SQLAlchemy==2.5.1
Flask-WTF==0.14.3
flower==0.9.7
gevent==21.1.2
google-ads==4.0.0
google-api-core==1.26.3
google-api-python-client==1.12.8
google-auth==1.30.0
google-auth-httplib2==0.1.0
google-auth-oauthlib==0.4.4
google-cloud-automl==2.3.0
google-cloud-bigquery==2.16.0
google-cloud-bigquery-datatransfer==3.1.1
google-cloud-bigquery-storage==2.4.0
google-cloud-bigtable==1.7.0
google-cloud-container==1.0.1
google-cloud-core==1.6.0
google-cloud-datacatalog==3.1.1
google-cloud-dataproc==2.3.1
google-cloud-dlp==1.0.0
google-cloud-kms==2.2.0
google-cloud-language==1.3.0
google-cloud-logging==2.3.1
google-cloud-memcache==0.3.0
google-cloud-monitoring==2.2.1
google-cloud-os-login==2.1.0
google-cloud-pubsub==2.4.2
google-cloud-redis==2.1.0
google-cloud-secret-manager==1.0.0
google-cloud-spanner==1.19.1
google-cloud-speech==1.3.2
google-cloud-storage==1.38.0
google-cloud-tasks==2.2.0
google-cloud-texttospeech==1.0.1
google-cloud-translate==1.7.0
google-cloud-videointelligence==1.16.1
google-cloud-vision==1.0.0
google-cloud-workflows==0.3.0
google-crc32c==1.1.2
google-resumable-media==1.2.0
googleapis-common-protos==1.53.0
graphviz==0.16
greenlet==1.1.0
grpc-google-iam-v1==0.12.3
grpcio==1.37.1
grpcio-gcp==0.2.2
gunicorn==20.1.0
h11==0.12.0
HeapDict==1.0.1
httpcore==0.13.6
httplib2==0.17.4
httpx==0.18.2
humanize==3.5.0
hvac==0.10.11
idna==2.10
immutables==0.15
importlib-metadata==1.7.0
importlib-resources==1.5.0
inflection==0.5.1
iso8601==0.1.14
isodate==0.6.0
itsdangerous==1.1.0
Jinja2==2.11.3
jmespath==0.10.0
json-merge-patch==0.2
jsonschema==3.2.0
kombu==4.6.11
kubernetes==11.0.0
lazy-object-proxy==1.4.3
ldap3==2.9
libcst==0.3.18
lockfile==0.12.2
Mako==1.1.4
Markdown==3.3.4
MarkupSafe==1.1.1
marshmallow==3.12.1
marshmallow-enum==1.5.1
marshmallow-oneofschema==2.1.0
marshmallow-sqlalchemy==0.23.1
msal==1.11.0
msal-extensions==0.3.0
msgpack==1.0.2
msrest==0.6.21
msrestazure==0.6.4
mypy-extensions==0.4.3
mysql-connector-python==8.0.22
mysqlclient==2.0.3
numpy==1.19.5
oauthlib==2.1.0
openapi-schema-validator==0.1.5
openapi-spec-validator==0.3.0
packaging==20.9
pandas==1.1.5
pandas-gbq==0.14.1
paramiko==2.7.2
pendulum==2.1.2
pep562==1.0
plyvel==1.3.0
portalocker==1.7.1
prison==0.1.3
prometheus-client==0.8.0
proto-plus==1.18.1
protobuf==3.16.0
psutil==5.8.0
psycopg2-binary==2.8.6
pyarrow==3.0.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.20
pydata-google-auth==1.2.0
Pygments==2.9.0
PyJWT==1.7.1
PyNaCl==1.4.0
pyOpenSSL==19.1.0
pyparsing==2.4.7
pyrsistent==0.17.3
pysftp==0.2.9
python-daemon==2.3.0
python-dateutil==2.8.1
python-editor==1.0.4
python-http-client==3.3.2
python-ldap==3.3.1
python-nvd3==0.15.0
python-slugify==4.0.1
python3-openid==3.2.0
pytz==2021.1
pytzdata==2020.1
PyYAML==5.4.1
redis==3.5.3
requests==2.25.1
requests-oauthlib==1.1.0
rfc3986==1.5.0
rich==9.2.0
rsa==4.7.2
s3transfer==0.4.2
sendgrid==6.7.0
setproctitle==1.2.2
six==1.16.0
slack-sdk==3.5.1
sniffio==1.2.0
sortedcontainers==2.3.0
SQLAlchemy==1.3.24
SQLAlchemy-JSONField==1.0.0
SQLAlchemy-Utils==0.37.2
sshtunnel==0.1.5
starkbank-ecdsa==1.1.0
statsd==3.3.0
swagger-ui-bundle==0.0.8
tabulate==0.8.9
tblib==1.7.0
tenacity==6.2.0
termcolor==1.1.0
text-unidecode==1.3
toolz==0.11.1
tornado==6.1
typing==3.7.4.3
typing-extensions==3.7.4.3
typing-inspect==0.6.0
unicodecsv==0.14.1
uritemplate==3.0.1
urllib3==1.25.11
vine==1.3.0
virtualenv==20.4.6
watchtower==0.7.3
websocket-client==0.59.0
Werkzeug==1.0.1
WTForms==2.3.3
zict==2.0.0
zipp==3.4.1
zope.event==4.5.0
zope.interface==5.4.0

Thanks in advance.

jedcunningham commented 3 years ago

AUTH_USER_REGISTRATION_ROLE only applies if AUTH_USER_REGISTRATION is true.
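To make that dependency concrete, here is a minimal sketch of the two settings as they might appear in webserver_config.py. The choice of "Viewer" as the default is an assumption taken from the expectation stated in the report, not a recommendation from the Airflow or FAB docs.

```python
# webserver_config.py fragment (sketch, not a complete config)

# Self-registration must be enabled for the registration role to be used.
AUTH_USER_REGISTRATION = True

# Assumption: "Viewer" as the least-privileged default role for new users.
AUTH_USER_REGISTRATION_ROLE = "Viewer"
```

With AUTH_USER_REGISTRATION left at False, as in the config above, the registration role setting has no effect.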

Looks like the FAB Google OAuth provider doesn't support AUTH_ROLES_MAPPING by default: https://github.com/dpgaspar/Flask-AppBuilder/blob/95947e84e04a999a474dfe8620fb0f36d71f0467/flask_appbuilder/security/manager.py#L585-L590

You'd need a custom security manager to return role_keys, for example like is done in the OP of #14829. (I'm sure FAB would love to have support for this with the google oauth provider, so consider contributing it back if you get it working).
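As a rough illustration of what "return role_keys" could mean, the sketch below builds a FAB-style user info dict from a Google userinfo payload and adds a role_keys entry. It is a standalone function, not the actual security manager override; the EMAIL_TO_ROLE_KEYS lookup and the function name google_user_info are hypothetical, introduced only because Google's basic userinfo response carries no group membership to derive the keys from.

```python
from typing import Dict, List

# Hypothetical lookup from e-mail address to mapping keys. In a real setup
# the keys would have to come from some source of group membership.
EMAIL_TO_ROLE_KEYS: Dict[str, List[str]] = {
    "msaenz@company.com": ["Admins"],
}


def google_user_info(data: dict) -> dict:
    """Build a user info dict from a Google userinfo payload (sketch)."""
    email = data.get("email", "")
    return {
        "username": "google_" + data.get("id", ""),
        "first_name": data.get("given_name", ""),
        "last_name": data.get("family_name", ""),
        "email": email,
        # Keys that AUTH_ROLES_MAPPING would translate into role names.
        # Unknown users get an empty list, i.e. no mapped roles.
        "role_keys": EMAIL_TO_ROLE_KEYS.get(email, []),
    }
```

In a deployment, logic like this would live in the custom security manager's OAuth user-info hook for the google provider; see the OP of #14829 for how that wiring was done there.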

mdcsaenz commented 3 years ago

Ah okay, I will see if that works and close the ticket if it does. But it definitely erased the role even when AUTH_USER_REGISTRATION_ROLE and AUTH_USER_REGISTRATION were both False. I also set AUTH_USER_ROLE to False.

jedcunningham commented 3 years ago

But it definitely erased the Role

Yep, that's the behavior with AUTH_ROLES_SYNC_AT_LOGIN and no matching roles in AUTH_ROLES_MAPPING. You could also disable AUTH_ROLES_SYNC_AT_LOGIN and manage the roles yourself.
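The role erasure can be pictured with a simplified model of the sync step. This is not FAB's code, just a sketch of the idea under the assumption that registration is disabled: the roles after login are whatever the user's role_keys map to, so an empty role_keys list yields an empty role set. The function name roles_after_sync is hypothetical.

```python
from typing import Dict, Iterable, List, Set

# Mapping taken from the webserver_config.py in this report.
AUTH_ROLES_MAPPING: Dict[str, List[str]] = {
    "Engineering": ["Ops"],
    "Admins": ["Admin"],
}


def roles_after_sync(role_keys: Iterable[str], mapping: Dict[str, List[str]]) -> Set[str]:
    """Return the role names a user would hold after a login-time sync."""
    roles: Set[str] = set()
    for key in role_keys:
        # Keys without an entry in the mapping contribute nothing.
        roles.update(mapping.get(key, []))
    return roles
```

Calling roles_after_sync([], AUTH_ROLES_MAPPING) gives an empty set, which matches the blank roles column in the table above: a manually added Admin role is replaced by the (empty) synced result at the next login.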

ashb commented 3 years ago

Is this an Airflow bug, or a FAB bug?

jedcunningham commented 3 years ago

FAB.

ashb commented 3 years ago

This is likely a "won't fix" from Airflow's side then, I'm afraid.

jedcunningham commented 3 years ago

I haven't poked around yet, but we may be able to handle the 'no role' scenario more gracefully on our side. Let me take a stab at that before we close this.

ashb commented 3 years ago

@jedcunningham Cool