puckel / docker-airflow

Docker Apache Airflow
Apache License 2.0
3.78k stars 543 forks

Incorrect padding errors from Fernet encryption #290

Open gagejustins opened 5 years ago

gagejustins commented 5 years ago

No matter what password I use or where (what OS) I run the container, adding an Airflow connection through the CLI returns this error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 171, in get_fernet
    _fernet = Fernet(fernet_key.encode('utf-8'))
  File "/usr/local/lib/python3.6/site-packages/cryptography/fernet.py", line 34, in __init__
    key = base64.urlsafe_b64decode(key)
  File "/usr/local/lib/python3.6/base64.py", line 133, in urlsafe_b64decode
    return b64decode(s)
  File "/usr/local/lib/python3.6/base64.py", line 87, in b64decode
    return binascii.a2b_base64(s)
binascii.Error: Incorrect padding

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 32, in <module>
    args.func(args)
  File "/usr/local/lib/python3.6/site-packages/airflow/utils/cli.py", line 74, in wrapper
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/airflow/bin/cli.py", line 1151, in connections
    new_conn = Connection(conn_id=args.conn_id, uri=args.conn_uri)
  File "<string>", line 4, in __init__
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/state.py", line 414, in _initialize_instance
    manager.dispatch.init_failure(self, args, kwargs)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 187, in reraise
    raise value
  File "/usr/local/lib/python3.6/site-packages/sqlalchemy/orm/state.py", line 411, in _initialize_instance
    return manager.original_init(*mixed[1:], **kwargs)
  File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 695, in __init__
    self.parse_from_uri(uri)
  File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 717, in parse_from_uri
    self.password = temp_uri.password
  File "<string>", line 1, in __set__
  File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 735, in set_password
    fernet = get_fernet()
  File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 174, in get_fernet
    raise AirflowException("Could not create Fernet object: {}".format(ve))
airflow.exceptions.AirflowException: Could not create Fernet object: Incorrect padding
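The root cause is visible in the first frames of the traceback: Airflow passes whatever string it finds for fernet_key straight to base64.urlsafe_b64decode, so an empty or unexpanded value (for example the literal string $FERNET_KEY) blows up with "Incorrect padding". A stdlib-only sketch of the check (the helper name check_fernet_key is mine, not Airflow's):

```python
import base64
import binascii

def check_fernet_key(key: str) -> bool:
    """Return True only if `key` decodes to the 32 bytes Fernet expects."""
    try:
        raw = base64.urlsafe_b64decode(key.encode("utf-8"))
    except binascii.Error:        # e.g. "Incorrect padding"
        return False
    return len(raw) == 32

good = base64.urlsafe_b64encode(b"0" * 32).decode()
print(check_fernet_key(good))           # True
print(check_fernet_key("$FERNET_KEY"))  # False: unexpanded variable name
print(check_fernet_key(""))             # False: env var was empty
```

Any key that fails this check will raise the same AirflowException as above the moment something needs to be encrypted.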

BUT: adding the login info through the UI Connections tab works totally fine. I've tried changing passwords but that doesn't help. The command I'm using:

airflow connections -a --conn_id first_conn --conn_uri postgresql://jgage:password@domain:port/schema

Any ideas?

gagejustins commented 5 years ago

The exports in entrypoint.sh weren't working - not sure why

SDubrulle commented 5 years ago

Any idea why this problem was occurring? I'm experiencing the exact same problem and have no clue why the exports aren't working.

gagejustins commented 5 years ago

I don’t know either. I have it so that my Makefile exports the key directly in the shell every time I build a container, but then I can’t view the connections in the UI. Super weird

darshanmehta10 commented 5 years ago

@gagejustins what did you do to fix it? I am running into exactly the same problem.

gagejustins commented 5 years ago

1. Look at script/entrypoint.sh - you'll see that a Python command is being run and its output exported as an environment variable to create the Fernet key.
2. Copy the Python code and package it into an export statement with the env variable name FERNET_KEY.
3. In config/airflow.cfg, the fernet key is defined as $FERNET_KEY, so it's meant to pull from whatever you set the env variable to.
4. Whenever you run a container, run that export statement - I do it through docker exec in my Makefile.

Note: this is definitely not the optimal way to do it, but I haven’t been able to get it to work at all in any other way. I tried putting it at the end of the entrypoint script, I tried running it as a command in the Dockerfile, but to no avail
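For reference, the export described above can also be produced with the standard library alone; this is a stdlib-only equivalent of the entrypoint's snippet (which uses cryptography's Fernet.generate_key()), since both yield 32 random bytes, url-safe base64-encoded:

```shell
# Generate a valid 44-character Fernet key without the cryptography package
export FERNET_KEY=$(python3 -c "import base64, os; print(base64.urlsafe_b64encode(os.urandom(32)).decode())")
echo "${#FERNET_KEY}"
```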

gagejustins commented 5 years ago

I’m reopening the issue if both of y’all are having the same problem. My hacky solution wouldn’t work if you’re using multiple containers, and doesn’t let you use the connections tab in the UI, so we need to find a way to fix this

iter-io commented 5 years ago

@gagejustins - I'm working on this same issue right now as well. I will let you know what I come up with.

gagejustins commented 5 years ago

Please do, so I don't need to be embarrassed when I show my personal Airflow setup to my coworkers 🥇

iter-io commented 5 years ago

Here are my logs related to this issue:

/entrypoint.sh: line 5: REDIS_HOST:=redis: command not found
/entrypoint.sh: line 6: REDIS_PORT:=6379: command not found
/entrypoint.sh: line 7: REDIS_PASSWORD:=: command not found
/entrypoint.sh: line 9: POSTGRES_HOST:=postgres: command not found
/entrypoint.sh: line 10: POSTGRES_PORT:=5432: command not found
/entrypoint.sh: line 11: POSTGRES_USER:=airflow: command not found
/entrypoint.sh: line 12: POSTGRES_PASSWORD:=airflow: command not found
/entrypoint.sh: line 13: POSTGRES_DB:=airflow: command not found 

The variable assignments are being executed as commands.
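Those log lines suggest the leading ':' (the shell's no-op builtin) was lost from the entrypoint's default-assignment idiom, perhaps through line endings or a copy error; without it, the expanded string itself is executed as a command. A small sketch of the difference (the variable name is just an example):

```shell
# Start from a clean slate so the default assignment is observable
unset REDIS_HOST

# With ':' the parameter expansion is an argument to a no-op, so the
# default value is assigned and nothing gets executed:
: "${REDIS_HOST:=redis}"
echo "$REDIS_HOST"

# Without the leading ':', the shell would try to run the expansion's
# result as a command, yielding "REDIS_HOST:=redis: command not found"
```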

gagejustins commented 5 years ago

Would triple quotes solve the problem?

iter-io commented 5 years ago

@gagejustins - I am no longer able to reproduce this error. Not sure exactly what was causing it before.

gagejustins commented 5 years ago

Jealous! Also, where are your logs stored? I can't for the life of me find them the way this image is set up.

PedramNavid commented 5 years ago

I'm getting it too and can't quite figure it out.

gagejustins commented 5 years ago

@puckel any idea what's up here? Happy to fix if you have any insights

rsivapr commented 5 years ago

What is your value for FERNET_KEY?

PedramNavid commented 5 years ago

Ah, I think there's the problem... I guess we need to set a key? The readme says "By default docker-airflow generates the fernet_key at startup", but I don't see it being done.

airflow@c522d5b593e5:~$ grep -i fernet airflow.cfg
fernet_key = $FERNET_KEY
airflow@c522d5b593e5:~$ echo $FERNET_KEY

airflow@c522d5b593e5:~$

gagejustins commented 5 years ago

@PedramNavid it's done in the script/entrypoint.sh file

eduard-sukharev commented 5 years ago

Same issue here: echo $FERNET_KEY inside the container gives nothing, which results in the Incorrect padding exception. I cannot add a connection either with environment variables passed in from docker-compose.yml (they're actually correctly set inside the running airflow container), or by running airflow connections -a --conn_id postgres_staging --conn_uri="postgresql://airflow_user:airflow_password@postgres_stage.airflow.local:5432/airflow_user". Does anyone have a solution yet?

eduard-sukharev commented 5 years ago

Thanks to @PedramNavid, setting the FERNET_KEY env variable works, but it's a workaround, IMO. Keep in mind that a Fernet key must be 32 url-safe base64-encoded bytes, so running openssl rand -base64 32 should generate a valid Fernet key.
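A side note on openssl rand -base64 32: it emits the *standard* base64 alphabet ('+', '/'), while Fernet decodes with base64.urlsafe_b64decode. The combination still works, because urlsafe_b64decode only remaps '-' and '_' and lets '+' and '/' pass through, as this stdlib check shows:

```python
import base64

# 32 bytes of 0xff encode to standard base64 full of '/' characters
std_key = base64.b64encode(b"\xff" * 32).decode()
print(std_key)

# Fernet-style decoding still recovers the original 32 bytes
raw = base64.urlsafe_b64decode(std_key)
print(len(raw))   # 32
```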

happyshows commented 5 years ago

Same error message

FranciscoCanas commented 5 years ago

Also experiencing this issue 👍

venuktan commented 5 years ago

same problem here as well

oskar-j commented 5 years ago

~I had the same error on a fresh airflow install from the pip~

Update: I just had an old config file without the fernet key. Now it works fine.

Also, I followed those steps: https://airflow.apache.org/howto/secure-connections.html?highlight=fernet

stefpe commented 5 years ago

Running openssl rand -base64 32 helped me when updating from 1.9 to 1.10.1.

cepefernando commented 5 years ago

I had the same issue, it was due to the fact that my Fernet keys were not the same on all the Airflow containers (webserver, scheduler and workers), which are being passed into Airflow via the Docker Env FERNET_KEY. Once I confirmed that all the containers had the same Fernet key this problem was solved.

rupert160 commented 5 years ago

I have v1.10.1 and airflow initdb failed for PostgreSQL 11 too. I needed:

In Python:

    >>> from cryptography.fernet import Fernet
    >>> fernet_key = Fernet.generate_key()
    >>> print(fernet_key.decode())
    somelongkeyval

then in bash:

    export FERNET_KEY='somelongkeyval'; airflow initdb;

socar-thomas commented 5 years ago

In my case, the fernet_key was loaded as the literal '{FERNET_KEY}' when I ran the 'airflow test ...' command (I added some logging to the 'airflow/models/__init__.py' file). It means the OS environment variables are not resolved when 'airflow.cfg' is loaded. So I wrote my fernet_key into 'airflow.cfg' directly:

    [core]
    ...
    fernet_key = f0e......

And it worked!

HenrryVargas commented 5 years ago

Hi,

In my case, this worked for me in bash:

    FERNET_KEY=$(python -c "from cryptography.fernet import Fernet; FERNET_KEY = Fernet.generate_key().decode(); print(FERNET_KEY)")
    export FERNET_KEY=$FERNET_KEY

jaul commented 5 years ago

If you are looking for a consistent fernet key across executions to run on your host for development, you can use

: "${AIRFLOW__CORE__FERNET_KEY:=${FERNET_KEY:=$(python -c "import platform, hashlib, base64;print(base64.urlsafe_b64encode(hashlib.scrypt(platform.node().encode('utf8'),salt=platform.os.getenv('USER').encode('utf8'), n=2, r=8, p=1, dklen=32)))")}}"

on script/entrypoint.sh and it will create a consistent key.
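The one-liner above derives the key deterministically from the machine's hostname (scrypt, salted with $USER) instead of generating it randomly, so every run on the same host agrees on the same key. As written, though, it prints the Python bytes repr (b'...'); a variant that emits a plain string, assuming hashlib.scrypt is available (Python built against OpenSSL 1.1+), with a fallback salt in case $USER is unset:

```python
import base64
import hashlib
import os
import platform

# Deterministic key: same hostname + same $USER -> same Fernet key
raw = hashlib.scrypt(
    platform.node().encode("utf8"),
    salt=(os.getenv("USER") or "airflow").encode("utf8"),
    n=2, r=8, p=1, dklen=32,
)
key = base64.urlsafe_b64encode(raw).decode()
print(key)   # 44-character url-safe base64 string
```

Note this trades security for convenience (the key is guessable from the hostname), so it is only suitable for local development, as the comment says.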

jonathanlxy commented 5 years ago

I was running into the same issue on Mac and found out it was my misunderstanding of how shell works:

The way I was interacting with the container was docker exec -it {my-container-name} bash, which opens another process beside the original process that runs entrypoint.sh.

Since it's a separate process, it doesn't have access to the environment variables exported by entrypoint.sh. Therefore if I do echo $AIRFLOW__CORE__FERNET_KEY it returns an empty value, which is the reason for the Incorrect padding errors.

Now if I do source /entrypoint.sh before running airflow connections, the connection will be added successfully, but that means the fernet_key used will be different from the one in airflow.cfg. Therefore I guess the best way is to manually generate a fernet_key and pass it as an environment variable when you do the docker run, e.g. --env FERNET_KEY={my_key}

Hope this helps
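The process-isolation point above can be demonstrated without Docker at all: an export made inside one shell process is visible only to that process's children, never to a sibling shell started later, which is exactly the relationship between entrypoint.sh and a docker exec shell (the variable name here is just an example):

```shell
# A child shell exports a variable; it is visible inside that child...
sh -c 'export DEMO_FERNET_KEY=abc; echo "inside child: $DEMO_FERNET_KEY"'

# ...but the parent never sees it, just like a `docker exec` shell
# never sees exports made by entrypoint.sh in a different process
echo "in parent: '$DEMO_FERNET_KEY'"
```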

anshajgoel commented 5 years ago

Had the same issue, followed these steps to generate the Fernet key and replaced it in the airflow.cfg file. Worked out well!

apurvis commented 5 years ago

I was having no problems with this until I ran airflow resetdb... now it seems I cannot connect to the postgres instance any more and I get this error.

javidy commented 5 years ago

I'm using a mounted volume in the Postgres container to save the connections I create on the first run. That way I think I'll be able to keep connections across container runs (I assume my understanding of how Airflow stores connections is correct). I spin up my containers (webserver and postgres) with the local executor, create a connection, bring the containers down, and bring them up again. After that, I'm not able to edit the newly created connection: it gives me an invalid token error. Does anyone have a clue?

I've tried to set FERNET_KEY in the docker-compose file as suggested in the readme, but didn't help.

javidy commented 5 years ago

I found a workaround to my problem. The problem was that if I created a connection and restarted the webserver container, the connection would be lost. I had to create my connection over again each time I restarted the containers.

My goal was to create airflow connections once only using UI and make them persist across container restarts. Therefore:

  1. I created a custom volume on Postgres in docker-compose-LocalExecutor.yml to persist metadata
  2. Spun up my containers for the first time: docker-compose -f docker-compose-LocalExecutor.yml up -d
  3. Created my connection from UI
  4. Restarted containers

After the restart, I could not use the new connection because the password could not be decrypted; there was an "invalid token" error. This happens because each time you run the containers, entrypoint.sh generates a new fernet_key, and that key doesn't match the one Airflow used to encrypt my password. To solve the issue, I had to comment out this line in entrypoint.sh and hardcode the fernet_key generated on the first run (Step 2):

    #: "${AIRFLOW__CORE__FERNET_KEY:=${FERNET_KEY:=$(python -c "from cryptography.fernet import Fernet; FERNET_KEY = Fernet.generate_key().decode(); print(FERNET_KEY)")}}"
    : "${AIRFLOW__CORE__FERNET_KEY:=${FERNET_KEY:="myfernetkey="}}"

I guess this could be achieved by hardcoding the fernet_key in the airflow.cfg file as well, but remember that Airflow will use the environment variable over the value in airflow.cfg. So make sure AIRFLOW__CORE__FERNET_KEY is not set if you want to achieve the same using airflow.cfg.
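The precedence mentioned in the last paragraph (environment variable beats airflow.cfg) follows Airflow's AIRFLOW__{SECTION}__{KEY} naming convention; a simplified sketch of that lookup order (the helper function and the cfg dict are illustrative, not Airflow's actual code):

```python
import os

def get_option(section, key, cfg):
    """Env var AIRFLOW__SECTION__KEY wins over the airflow.cfg value."""
    env_name = "AIRFLOW__{}__{}".format(section.upper(), key.upper())
    if env_name in os.environ:
        return os.environ[env_name]
    return cfg.get(section, {}).get(key)

cfg = {"core": {"fernet_key": "key-from-cfg"}}
print(get_option("core", "fernet_key", cfg))   # key-from-cfg

os.environ["AIRFLOW__CORE__FERNET_KEY"] = "key-from-env"
print(get_option("core", "fernet_key", cfg))   # key-from-env
```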

KarthikRajashekaran commented 5 years ago

Hi @JavidY Could you please share steps for the below step

I created a custom volume on Postgres in docker-compose-LocalExecutor.yml to persist metadata

javidy commented 5 years ago

Hi @KarthikRajashekaran ,

For creating the volume I've modified docker-compose-LocalExecutor.yml: I added a new volumes section under the postgres container and declared the new volume at the very bottom of the .yml file. The name (source) of the new volume is "airflow_metadata"; you can pick whatever name you want. The destination of the volume is "/var/lib/postgresql/data", which is the default location for PostgreSQL database files. That is where all the metadata is written.

For postgres container in compose file now whole section looks like below.

postgres:
    image: postgres:9.6
    environment:
        - POSTGRES_USER=airflow
        - POSTGRES_PASSWORD=airflow
        - POSTGRES_DB=airflow
    volumes:
        - airflow_metadata:/var/lib/postgresql/data

And the declaration of the new volume at the end of the file looks like this:

volumes:
    airflow_metadata:

KarthikRajashekaran commented 5 years ago

Got an error ERROR: yaml.scanner.ScannerError: mapping values are not allowed here in "./docker-compose-LocalExecutor.yml", line 37, column 34

KarthikRajashekaran commented 5 years ago

@JavidY Please check below:

[screenshot]

smdelacruz commented 5 years ago

I am having this issue on current Airflow 1.10.6. Will appreciate any hints.

javidy commented 5 years ago

@KarthikRajashekaran this is how my .yml file looks: [screenshot]

KarthikRajashekaran commented 5 years ago

@JavidY Thanks, I was able to spin it up. I followed the steps 1-4 you mentioned.

I added connections and a variable through the UI, stopped the container, and re-ran docker-compose... I lost those connections and variables again.

javidy commented 5 years ago

@KarthikRajashekaran did you rebuild the image after hardcoding fernet key into local entrypoint.sh file?

I can see that you're not using your local image but puckel/docker-airflow:latest. In that case, your webserver container is still using entrypoint.sh from the puckel/docker-airflow:latest image, so your changes to entrypoint.sh are not taken into account.

KarthikRajashekaran commented 5 years ago

I have restarted using docker-compose . Then manually added the connections to UI ..

docker-compose -f docker-compose-LocalExecutor.yml up -d

javidy commented 5 years ago

That is ok. But before adding connections you need to change the entrypoint.sh file as I mentioned in my second comment in this thread: generate a fernet key and hardcode it inside the shell file.

You need to carry out all the steps from my comment to achieve the desired result.

krajashekaranvonage commented 5 years ago

@JavidY It worked well .thanks a lot 👍

javidy commented 5 years ago

you're welcome @KarthikRajashekaran :)

krajashekaranvonage commented 5 years ago

@JavidY I am trying to write to a file as below, but it failed.

The error:

 No such file or directory: '/usr/local/airflow/tmp/snowflake_roles.csv'

Part of the code is below:

TMP_DIRECTORY = os.path.join(os.path.abspath("."), "tmp")
  with open(ROLES_PATH, "w") as roles_file, open(
        ROLE_GRANTS_PATH, "w"
    )
zakkg3 commented 4 years ago

Be sure to copy/paste all the output from python -c "from cryptography.fernet import Fernet; FERNET_KEY = Fernet.generate_key().decode(); print(FERNET_KEY)", including the trailing "=".

Example:

python -c "from cryptography.fernet import Fernet; FERNET_KEY = Fernet.generate_key().decode(); print(FERNET_KEY)"
6cLsuD9kKqr70xN5PKlFgJuGahER3DKmWtyseR8dZIA=

On a Mac, if you double-click on the key, it will not select the trailing '=' 🗡
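Losing that trailing '=' really does reproduce the exact error from this thread, since the key's length is no longer a multiple of four base64 characters (stdlib-only demonstration):

```python
import base64
import binascii

key = base64.urlsafe_b64encode(b"x" * 32).decode()   # 44 chars, ends with '='
assert base64.urlsafe_b64decode(key) == b"x" * 32    # intact key decodes fine

try:
    base64.urlsafe_b64decode(key.rstrip("="))        # '=' lost in copy/paste
except binascii.Error as err:
    print(err)   # Incorrect padding
```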

SolbiatiAlessandro commented 4 years ago

For me, setting this in my docker-compose.yaml worked:

  webserver:
      build: .
      restart: always
      depends_on:
          - postgres
      environment:
          - LOAD_EX=y
          - EXECUTOR=Local
        - AIRFLOW__CORE__FERNET_KEY="<generated-key>"

zachliu commented 4 years ago

I have the same issue while trying to add a new variable using the UI Admin tab -> Variables. The UI showed me the error message Could not create Fernet object: Incorrect padding.

In my case, as @eduard-sukharev pointed out, a Fernet key must be 32 url-safe base64-encoded bytes, but apparently I created one with illegal characters :joy:

The solution was simple: I created a new one using openssl rand -base64 32, put it in our secret storage (we use Vault), and ran the following:

airflow@cdcf48897012:~$ /entrypoint.sh airflow resetdb

The entrypoint.sh contains the routine that reads the secrets from Vault, so it has to run first.