apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

cryptography.fernet.InvalidToken when attempting to access connections #16313

Closed lewismc closed 3 years ago

lewismc commented 3 years ago

Apache Airflow version: 2.1.0 (Helm chart) from main branch 5c7d758e24595c485553b0449583ff238114d47d

Kubernetes version (if you are using kubernetes) (use `kubectl version`):

```
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.6+k3s1", GitCommit:"8d0432824a9fd9474b67138b7630c33f285d332f", GitTreeState:"clean", BuildDate:"2021-04-16T19:04:44Z", GoVersion:"go1.15.10", Compiler:"gc", Platform:"linux/amd64"}
```

**Environment**:

- **Cloud provider or hardware configuration**:
- **OS** (e.g. from /etc/os-release): macOS Catalina 10.15.7 
- **Kernel** (e.g. `uname -a`): Darwin MT-207576 19.6.0 Darwin Kernel Version 19.6.0: Mon Apr 12 20:57:45 PDT 2021; root:xnu-6153.141.28.1~1/RELEASE_X86_64 x86_64
- **Install tools**: helm version.BuildInfo{Version:"v3.4.2", GitCommit:"23dd3af5e19a02d4f4baa5b2f242645a1a3af629", GitTreeState:"dirty", GoVersion:"go1.15.5"}
- **Others**:

**What happened**:

After installing Airflow and port-forwarding from kubectl, when I navigate to the **connections** dropdown menu, I get the following trace:

Ooops!

Something bad has happened. Please consider letting us know by creating a bug report using GitHub.

Python version: 3.6.13 Airflow version: 2.1.0 Node: airflow-webserver-67d745bff9-5sm7w

```
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.6/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/airflow/.local/lib/python3.6/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/airflow/.local/lib/python3.6/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/airflow/.local/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/airflow/.local/lib/python3.6/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/airflow/.local/lib/python3.6/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/airflow/.local/lib/python3.6/site-packages/flask_appbuilder/security/decorators.py", line 109, in wraps
    return f(self, *args, **kwargs)
  File "/home/airflow/.local/lib/python3.6/site-packages/flask_appbuilder/views.py", line 551, in list
    widgets = self._list()
  File "/home/airflow/.local/lib/python3.6/site-packages/flask_appbuilder/baseviews.py", line 1134, in _list
    page_size=page_size,
  File "/home/airflow/.local/lib/python3.6/site-packages/flask_appbuilder/baseviews.py", line 1033, in _get_list_widget
    page_size=page_size,
  File "/home/airflow/.local/lib/python3.6/site-packages/flask_appbuilder/models/sqla/interface.py", line 435, in query
    query_results = query.all()
  File "/home/airflow/.local/lib/python3.6/site-packages/sqlalchemy/orm/query.py", line 3373, in all
    return list(self)
  File "/home/airflow/.local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 100, in instances
    cursor.close()
  File "/home/airflow/.local/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    with_traceback=exc_tb,
  File "/home/airflow/.local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/home/airflow/.local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 80, in instances
    rows = [proc(row) for row in fetch]
  File "/home/airflow/.local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 80, in <listcomp>
    rows = [proc(row) for row in fetch]
  File "/home/airflow/.local/lib/python3.6/site-packages/sqlalchemy/orm/loading.py", line 601, in _instance
    state.manager.dispatch.load(state, context)
  File "/home/airflow/.local/lib/python3.6/site-packages/sqlalchemy/event/attr.py", line 322, in __call__
    fn(*args, **kw)
  File "/home/airflow/.local/lib/python3.6/site-packages/sqlalchemy/orm/mapper.py", line 3397, in _event_on_load
    instrumenting_mapper._reconstructor(state.obj())
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/connection.py", line 150, in on_db_load
    if self.password:
  File "/home/airflow/.local/lib/python3.6/site-packages/sqlalchemy/orm/attributes.py", line 365, in __get__
    retval = self.descriptor.__get__(instance, owner)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/connection.py", line 235, in get_password
    return fernet.decrypt(bytes(self._password, 'utf-8')).decode()
  File "/home/airflow/.local/lib/python3.6/site-packages/cryptography/fernet.py", line 194, in decrypt
    raise InvalidToken
cryptography.fernet.InvalidToken
```


**What you expected to happen**:

I expected to see the Connections pane.

**How to reproduce it**:

```
git clone airflow && cd airflow/chart
helm install airflow . -n airflow-test
kubectl port-forward svc/airflow-webserver 8080:8080 --namespace airflow-test
```



Navigate to http://localhost:8080/connection/list/ and the trace appears. 
boring-cyborg[bot] commented 3 years ago

Thanks for opening your first issue here! Be sure to follow the issue template!

jedcunningham commented 3 years ago

I'm not able to reproduce this. Do you get a base64 string when you run the "echo Fernet Key" command `helm install` prints? e.g.:

```
echo Fernet Key: $(kubectl get secret --namespace airflow-test airflow-fernet-key -o jsonpath="{.data.fernet-key}" | base64 --decode)
```

What are you using for K8s? What version of Helm do you have? I assume a new release name doesn't help, e.g. "airflow" -> "test"?

jedcunningham commented 3 years ago

If you exec into the webserver pod and run `echo $AIRFLOW__CORE__FERNET_KEY`, does it match the "echo Fernet Key" command?
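
A sketch of that check, assuming the release and secret names from the commands above (adjust the namespace and workload name to your deployment); the two outputs should be byte-for-byte identical:

```bash
# Key the webserver process actually sees:
kubectl exec -n airflow-test deploy/airflow-webserver -- \
  printenv AIRFLOW__CORE__FERNET_KEY

# Key stored in the chart's secret:
kubectl get secret -n airflow-test airflow-fernet-key \
  -o jsonpath="{.data.fernet-key}" | base64 --decode; echo
```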

Gurulhu commented 3 years ago

@jedcunningham I'm not the OP but I'm encountering the exact same issue. I'm running the official helm chart (http://airflow.apache.org/docs/helm-chart/stable/index.html) on Amazon EKS, and my permissions are kinda iffy, but are mostly working by now. `echo $AIRFLOW__CORE__FERNET_KEY` on the webserver matches the fernet key in secrets, and it's a valid fernet key (to be 100% sure I generated one and hardcoded it in the values, and it still gives the same error).

I'm still debugging it and will update this issue if I find anything relevant.

Gurulhu commented 3 years ago

I think I found what the issue might be: inspecting the code, it fails at password decryption, so the issue wasn't about having no token. My environment wasn't properly cleaned between releases: the Postgres RDS got init'd with a randomly generated fernet key A, and the current release went up with another randomly generated fernet key B, without the key in the RDS ever being rotated, as that's out of the `helm delete` scope. Wiping the RDS before spinning the chart up fixed it. (Test db, no useful data was harmed.)
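
A minimal standalone demonstration of that failure mode, assuming only the `cryptography` package (which ships in the Airflow image); nothing here is Airflow-specific:

```bash
python3 - <<'EOF'
from cryptography.fernet import Fernet, InvalidToken

key_a = Fernet.generate_key()  # key the DB rows were encrypted with
key_b = Fernet.generate_key()  # key the new release is configured with

token = Fernet(key_a).encrypt(b"my-connection-password")
try:
    Fernet(key_b).decrypt(token)
except InvalidToken:
    print("InvalidToken: the same error the webserver raises")
EOF
```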

jedcunningham commented 3 years ago

Cool, glad you found it!

@lewismc, that's actually why I asked about a new release name: with the default values the chart keeps the Postgres PVC around, which can cause this issue. Can you confirm if you have the same underlying issue?
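
For anyone reproducing this under the default values, a sketch of a clean reinstall (test environments only, as this destroys the metadata DB; the PVC name is an assumption, so check with `kubectl get pvc` first):

```bash
helm delete airflow -n airflow-test
kubectl get pvc -n airflow-test                         # find the Postgres volume claim
kubectl delete pvc <postgres-pvc-name> -n airflow-test  # placeholder name
helm install airflow . -n airflow-test                  # fresh DB and fernet key, now a consistent pair
```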

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in the next 7 days if no further activity occurs from the issue author.

github-actions[bot] commented 3 years ago

This issue has been closed because it has not received a response from the issue author.

morningcloud commented 1 year ago

Hi @jedcunningham, I am facing the same issue after unknowingly leaving the default value for `fernetKey` in `values.yaml`, so a random key was generated. We have the Postgres DB in a persistent volume, but I don't want to reset the DB and clear all existing dag run history. Is there a way to fix the mismatch between the newly generated key in the airflow webserver and the existing one in the db?

jedcunningham commented 1 year ago

@morningcloud, the fernet key is used for things like connections and encrypted variables. You could recreate all of those and still keep your history, in theory.
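
A sketch of that approach; the connection id and URI below are placeholders, and note that listing may itself fail with InvalidToken, in which case get the ids from the database as in the comments that follow:

```bash
airflow connections list -o table                     # may itself raise InvalidToken
airflow connections delete my_conn_id                 # repeat for each connection
airflow connections add my_conn_id \
  --conn-uri 'postgresql://user:pass@host:5432/mydb'  # re-add from your own records
```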

danielmorales72 commented 1 year ago

@morningcloud I hit this issue after my Airflow helm update. I resolved it today without losing any history data. The point is that you need to delete all of your existing Airflow connections.

1. You will find them by looking at the PostgreSQL RDS:

```
airflow db shell
SELECT conn_id FROM connection;
```

2. Then iterate over them and delete; I used some bash for that:

```bash
#!/bin/bash

declare -a arr=("airflow_db" "aws_default" "azure_batch_default" "azure_container_instances_default" "azure_cosmos_default" "azure_data_explorer_default" "azure_data_lake_default" "cassandra_default" "databricks_default" "dingding_default" "druid_broker_default" "druid_ingest_default" "elasticsearch_default" "emr_default" "facebook_default" "fs_default" "google_cloud_default" "hive_cli_default" "hiveserver2_default" "http_default" "kubernetes_default" "kylin_default" "livy_default" "local_mysql" "metastore_default" "mongo_default" "mssql_default" "mysql_default" "opsgenie_default" "pig_cli_default" "pinot_admin_default" "pinot_broker_default" "postgres_default" "presto_default" "qubole_default" "redis_default" "segment_default" "sftp_default" "spark_default" "sqlite_default" "sqoop_default" "ssh_default" "tableau_default" "vertica_default" "wasb_default" "webhdfs_default" "yandexcloud_default" "azure_default" "drill_default")

for i in "${arr[@]}"
do
    airflow connections delete "${i}"
done
```

3. In my case 4 connections were not deleted with bash and I had to connect to RDS directly and delete them from the table:

```
airflow db shell
SELECT * FROM connection;
DELETE FROM connection WHERE id = <id from above>;
```

After that my connections list was empty and I could visit the GUI or check in the CLI. I just added the databricks connection back (`airflow connections add ...`, sketched below) as it was the only connection we were using.
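
For instance, a hypothetical re-add of that Databricks connection; the workspace host and token are placeholders, not values from this thread:

```bash
airflow connections add databricks_default \
  --conn-type databricks \
  --conn-host 'https://<workspace>.cloud.databricks.com' \
  --conn-password '<personal-access-token>'
```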

morningcloud commented 1 year ago

Thanks @jedcunningham and @danielmorales72!

I resolved this with a long and probably unnecessary workaround before seeing those messages! I'll write it here in case someone for some reason has no other choice but to do this. I managed to get the old fernet key from a snapshot backup, then did the following (consolidated in the sketch after the steps):

1. Exec into one of the running airflow webserver containers.
2. Run `export AIRFLOW__CORE__FERNET_KEY=current_fernet_key,old_fernet_key`. Note these are the decoded keys, not the base64-encoded secret values.
3. Run `airflow rotate-fernet-key` to re-encrypt existing credentials with the new fernet key.
4. Set the fernet key back to the current one by running `export AIRFLOW__CORE__FERNET_KEY=current_fernet_key` in the same container.
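
A compact sketch of that sequence, run inside the webserver container; the two key values are placeholders for your actual current and old keys:

```bash
# Current key first, old key second: with a comma-separated list Airflow tries
# each key when decrypting, and rotate-fernet-key re-encrypts with the first.
export AIRFLOW__CORE__FERNET_KEY="<current_fernet_key>,<old_fernet_key>"
airflow rotate-fernet-key
export AIRFLOW__CORE__FERNET_KEY="<current_fernet_key>"
```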
paguasmar commented 1 year ago

For me, what worked was the following (sketched below):

1. Terminate your Airflow and Postgres Docker containers.
2. Delete the `db-data` folder.
3. Restart your Airflow containers.
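
A sketch of those steps, assuming a docker-compose setup where the Postgres data is bind-mounted at `./db-data` (the folder named above):

```bash
docker compose down      # stop Airflow and Postgres
rm -rf ./db-data         # wipe the metadata DB, including the encrypted rows
docker compose up -d     # Postgres re-initializes with the current fernet key
```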
laraib-sidd commented 8 months ago

Thanks @danielmorales72

But I think this would be a simpler approach:

```
airflow db shell
TRUNCATE TABLE connection;
```

After this I could visit the connections page from the GUI.

luis-fnogueira commented 6 months ago

> @morningcloud I hit this issue after my Airflow helm update. I resolved it today without losing any history data. The point is that you need to delete all of your existing Airflow connections. [full steps quoted from @danielmorales72's comment above]

This has really helped me. Thanks a lot!

muriloeduardo199 commented 3 weeks ago

I had this problem. In my case I use Docker, and I had to delete the connections inside the Postgres database (sketched below). I solved it as follows:

1. Enter the Postgres database container in Docker.
2. Access the database.
3. Use SQL to delete the connections: `DELETE FROM connection;`

Doing so deletes the rows encrypted with the lost key, and the Docker connections work again.
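
A one-liner version of those steps for a Docker deployment; the container, user, and database names are assumptions, so adjust them to your setup:

```bash
# Deletes every row in the connection table; connections can then be
# re-created via the UI or `airflow connections add`.
docker exec -it <postgres-container> \
  psql -U airflow -d airflow -c 'DELETE FROM connection;'
```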