apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

EmailOperator / email on failure fails with smtp in airflow 2.0.1 #15133

Closed seybi87 closed 2 years ago

seybi87 commented 3 years ago

Apache Airflow version: 2.0.1

Kubernetes version (if you are using kubernetes) (use kubectl version): n/a

Environment:

What happened:

I enabled email notifications via SMTP by adding the following configuration to airflow.cfg:

[email]

email_backend = airflow.utils.email.send_email_smtp

email_conn_id = smtp_default

default_email_on_retry = True

default_email_on_failure = True

[smtp]

smtp_host = smtp.strato.de
smtp_starttls = False
smtp_ssl = True
smtp_user = airflow_notification@my_email_domain.com
smtp_password = topsecret
smtp_port = 465
smtp_mail_from = airflow_notification@my_email_domain.com
smtp_timeout = 30
smtp_retry_limit = 5

This email account is only permitted to send emails, not to receive them. I tested the account manually with Thunderbird.

In addition I have the following email testing DAG:

from builtins import range
from datetime import timedelta

from airflow.models import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.email_operator import EmailOperator
from airflow.utils.dates import days_ago

args = {
    'owner': 'test',
    'start_date': days_ago(2),
    'email': ['my.name@my-email.de'],
    'email_on_failure': True
}

dag = DAG(
    dag_id='email',
    default_args=args,
    schedule_interval=None,
    tags=['testing']
)

send_mail = EmailOperator(
    task_id='sendmail',
    to='my.name@my-email.de',
    subject='TEST Mail from Airflow',
    html_content='Mail Contents',
    dag=dag,
)

failed_bash = BashOperator(
    task_id='run_bash',
    bash_command='exit 1',
    dag=dag,
)

send_mail >> failed_bash

What you expected to happen:

When triggering the DAG, I expect to receive a regular email from send_mail and an error report email from failed_bash. However, the send_mail task already fails with the following log output:

*** Reading local file: /root/airflow/logs/email/sendmail/2021-04-01T11:38:23.507448+00:00/1.log
[2021-04-01 11:38:24,852] {taskinstance.py:851} INFO - Dependencies all met for <TaskInstance: email.sendmail 2021-04-01T11:38:23.507448+00:00 [queued]>
[2021-04-01 11:38:24,884] {taskinstance.py:851} INFO - Dependencies all met for <TaskInstance: email.sendmail 2021-04-01T11:38:23.507448+00:00 [queued]>
[2021-04-01 11:38:24,884] {taskinstance.py:1042} INFO - 
--------------------------------------------------------------------------------
[2021-04-01 11:38:24,884] {taskinstance.py:1043} INFO - Starting attempt 1 of 1
[2021-04-01 11:38:24,884] {taskinstance.py:1044} INFO - 
--------------------------------------------------------------------------------
[2021-04-01 11:38:24,896] {taskinstance.py:1063} INFO - Executing <Task(EmailOperator): sendmail> on 2021-04-01T11:38:23.507448+00:00
[2021-04-01 11:38:24,900] {standard_task_runner.py:52} INFO - Started process 6267 to run task
[2021-04-01 11:38:24,913] {standard_task_runner.py:76} INFO - Running: ['airflow', 'tasks', 'run', 'email', 'sendmail', '2021-04-01T11:38:23.507448+00:00', '--job-id', '2', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/email.py', '--cfg-path', '/tmp/tmpefd6zy10', '--error-file', '/tmp/tmpjso8eczu']
[2021-04-01 11:38:24,915] {standard_task_runner.py:77} INFO - Job 2: Subtask sendmail
[2021-04-01 11:38:24,958] {local_task_job.py:146} INFO - Task exited with return code 1

I did not spot any additional logs in the scheduler that point to the cause of the failure.

How to reproduce it:

Use the above DAG with a valid smtp account

Anything else we need to know:

I also tried SendGrid (with a different email account), and it worked without a problem. For SendGrid I applied the following configuration:

[email]

email_backend = airflow.providers.sendgrid.utils.emailer.send_email

email_conn_id = smtp_default

default_email_on_retry = True

default_email_on_failure = True

[smtp]
smtp_host = smtp.sendgrid.net
smtp_starttls = True
smtp_ssl = False
smtp_user = apikey
smtp_password = topsecret
smtp_port = 587
smtp_mail_from = my-second-email@myemail.com
smtp_timeout = 30
smtp_retry_limit = 5

seybi87 commented 3 years ago

I figured out that special characters in the password string will cause this issue.

My password contained the following special characters:

smtp_password = ja)5%Fmkrwj@LkE@

After changing the password to a string that contains only letters and numbers the issue is resolved.

Feel free to close this issue in case these special characters are not allowed in the airflow.cfg at all.
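A possible explanation, which this thread does not confirm: airflow.cfg is an INI-style file, and Python's standard configparser treats `%` as the start of an interpolation reference by default. The sketch below uses only the standard library (not Airflow's own config parser, whose behavior is assumed here to be similar) to show that the unescaped password fails to read, while doubling the percent sign (`%%`) returns the original string.

```python
import configparser

# The password from this thread, written unescaped, as it appeared in airflow.cfg.
RAW_CFG = "[smtp]\nsmtp_password = ja)5%Fmkrwj@LkE@\n"
# The same password with the percent sign doubled, which is how configparser
# expects a literal '%' to be written when interpolation is enabled.
ESCAPED_CFG = "[smtp]\nsmtp_password = ja)5%%Fmkrwj@LkE@\n"


def read_password(cfg_text):
    """Parse cfg_text and return [smtp] smtp_password.

    Uses the stdlib ConfigParser with its default BasicInterpolation.
    Assumption: Airflow's config parser behaves similarly for '%'.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return parser.get("smtp", "smtp_password")


try:
    read_password(RAW_CFG)
    raw_error = None
except configparser.InterpolationSyntaxError as exc:
    # A lone '%' followed by anything other than '%' or '(' is rejected.
    raw_error = exc

escaped_password = read_password(ESCAPED_CFG)

print(type(raw_error).__name__)
print(escaped_password)
```

If this is indeed the cause, only the `%` would need escaping. The `)` and `@` characters are read back unchanged in this sketch.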

uranusjr commented 3 years ago

I wonder if this is somehow similar to #12775. The characters you use aren’t really that special; maybe Airflow can handle them better.

henryzhangsta commented 3 years ago

FWIW I believe I also ran into this issue with the postgresql password. I had special characters (%, &, +) in it and airflow was not able to use the connection string. Changing to an alphanumeric password fixed the problem.

potiuk commented 3 years ago

> FWIW I believe I also ran into this issue with the postgresql password. I had special characters (%, &, +) in it and airflow was not able to use the connection string. Changing to an alphanumeric password fixed the problem.

@henryzhangsta This is a different issue, related to the Postgres connection URL. When you embed your password in a URL, you MUST percent-encode restricted characters (see https://datatracker.ietf.org/doc/html/rfc3986). This is part of the specification and has nothing to do with Airflow.

See: https://airflow.apache.org/docs/apache-airflow/stable/howto/connection.html#handling-of-special-characters-in-connection-params

We even have a tool in Airflow that generates a connection URI with proper, URI-standard-compliant encoding to make this easier:

https://airflow.apache.org/docs/apache-airflow/stable/howto/connection.html#generating-a-connection-uri

So you can still use even the "restricted" characters as long as you properly encode them.
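As a rough illustration of the percent-encoding described above, here is a minimal sketch using only Python's `urllib.parse`. It does not use the Airflow tool linked above. The helper name `build_conn_uri`, the host `db.example.com`, and the sample password are hypothetical.

```python
from urllib.parse import quote, unquote, urlsplit


def build_conn_uri(scheme, user, password, host, port, schema):
    """Build a connection URI with user, password and schema percent-encoded.

    Hypothetical helper for illustration only. safe='' makes quote() encode
    every reserved character, including '/', '&', '+', '@' and '%'.
    """
    return (
        f"{scheme}://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(schema, safe='')}"
    )


# Sample password containing the characters mentioned above: %, & and +.
password = "p%ss&w+rd"
uri = build_conn_uri("postgresql", "airflow", password, "db.example.com", 5432, "airflow")

# Splitting the URI returns the still-encoded password; unquote() restores it.
decoded_password = unquote(urlsplit(uri).password)

print(uri)
print(decoded_password)
```

The encoded URI is safe to parse because the `%`, `&` and `+` in the password can no longer be mistaken for URI syntax, and decoding the parsed password gives back the original string.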

eladkal commented 2 years ago

Does the issue still happen in newer Airflow versions?

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.

github-actions[bot] commented 2 years ago

This issue has been closed because it has not received response from the issue author.