apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Airflow webserver should handle signals immediately when starting #8687

Closed BasPH closed 2 years ago

BasPH commented 4 years ago

Apache Airflow version:

1.10.10

Kubernetes version (if you are using kubernetes) (use kubectl version):

Environment:

What happened: I started the Airflow webserver on an occupied port (8080), and it kept retrying the connection. I tried to quit with Ctrl+C, but the signal was not handled. Only kill -9 could shut down the process:

$ airflow webserver
  ____________       _____________
 ____    |__( )_________  __/__  /________      __
____  /| |_  /__  ___/_  /_ __  /_  __ \_ | /| / /
___  ___ |  / _  /   _  __/ _  / / /_/ /_ |/ |/ /
 _/_/  |_/_/  /_/    /_/    /_/  \____/____/|__/

.....

[2020-05-03 10:43:09 +0200] [7403] [INFO] Starting gunicorn 19.9.0
[2020-05-03 10:43:09 +0200] [7403] [ERROR] Connection in use: ('0.0.0.0', 8080)
[2020-05-03 10:43:09 +0200] [7403] [ERROR] Retrying in 1 second.
[2020-05-03 10:43:10 +0200] [7403] [ERROR] Connection in use: ('0.0.0.0', 8080)
[2020-05-03 10:43:10 +0200] [7403] [ERROR] Retrying in 1 second.
[2020-05-03 10:43:11 +0200] [7403] [ERROR] Connection in use: ('0.0.0.0', 8080)
[2020-05-03 10:43:11 +0200] [7403] [ERROR] Retrying in 1 second.
^C[2020-05-03 10:43:12 +0200] [7403] [ERROR] Connection in use: ('0.0.0.0', 8080)
[2020-05-03 10:43:12 +0200] [7403] [ERROR] Retrying in 1 second.
^C[2020-05-03 10:43:13 +0200] [7403] [ERROR] Connection in use: ('0.0.0.0', 8080)
[2020-05-03 10:43:13 +0200] [7403] [ERROR] Retrying in 1 second.
^C^C^C^C^C^C^C^C^C^C^C^C^C^C[2020-05-03 10:43:14 +0200] [7403] [ERROR] Can't connect to ('0.0.0.0', 8080)
^C^C^C^C^C^C^C^CKilled: 9

As the log shows, pressing ^C repeatedly does not shut down the process.

What you expected to happen: I expect the webserver to handle signals, even when the startup process hasn't completed yet.
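To illustrate the expected behaviour, here is a minimal sketch of a startup retry loop that reacts to a signal right away. It is not the Gunicorn or Airflow implementation; the names `shutdown_requested`, `request_shutdown` and `retry_until_ready` are hypothetical. The idea is to install the handlers before the first attempt and to wait on a `threading.Event` between retries instead of sleeping.

```python
import signal
import threading

# Set by the signal handler; checked by the retry loop.
shutdown_requested = threading.Event()


def request_shutdown(signum, frame):
    """Signal handler: record that a shutdown was requested."""
    shutdown_requested.set()


def retry_until_ready(attempt, retries=5, delay=1.0):
    """Call attempt() until it returns True, retries run out, or a signal arrives.

    Returns "ready", "interrupted", or "gave up".
    """
    for _ in range(retries):
        if shutdown_requested.is_set():
            return "interrupted"
        if attempt():
            return "ready"
        # Event.wait returns True as soon as the handler sets the event,
        # so the loop does not have to sit out the full retry delay.
        if shutdown_requested.wait(timeout=delay):
            return "interrupted"
    return "gave up"


if __name__ == "__main__":
    # Handlers are installed before the first attempt, so Ctrl-C
    # is honoured while startup is still in progress.
    signal.signal(signal.SIGINT, request_shutdown)
    signal.signal(signal.SIGTERM, request_shutdown)
    print(retry_until_ready(lambda: False, retries=3, delay=0.1))
```

With this shape, a Ctrl-C during the "Retrying in 1 second" phase would end the loop at once rather than after the remaining retries.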

How to reproduce it:
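One way to set up the occupied-port condition is sketched below. These are not necessarily the reporter's exact steps, and the helpers `occupy_port` and `port_in_use` are hypothetical names. The script holds a listening socket on a port, so a second bind to the same address fails with `EADDRINUSE`, which is the condition behind the "Connection in use" log lines.

```python
import errno
import socket


def occupy_port(host="127.0.0.1", port=0):
    """Bind and listen on (host, port); port=0 lets the OS pick a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, port))
    sock.listen(1)
    return sock


def port_in_use(host, port):
    """Return True if binding to (host, port) fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise
    return False


if __name__ == "__main__":
    # For the scenario in this report, use host="0.0.0.0" and port=8080 and
    # keep the holder open while `airflow webserver` starts in another terminal.
    holder = occupy_port()
    host, port = holder.getsockname()
    print(port_in_use(host, port))
    holder.close()
```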

Anything else we need to know:

NikolasOliveira commented 4 years ago

Hey @potiuk & @turbaszek, thanks for the Airflow Contributor session this morning :)

Do you mind if I take this as my starter task?

Cheers, Niko

turbaszek commented 4 years ago

@NikolasOliveira feel free to take this one!

uranusjr commented 3 years ago

I can’t seem to replicate this. The Gunicorn subprocess can’t be killed immediately (this seems like a Gunicorn issue rather than an Airflow one), but after the Gunicorn subprocess emits the "Can't connect to" log, Ctrl-C correctly makes it exit (and thus ends the parent Airflow process).

$ airflow webserver
...
=================================================================
[2021-04-07 17:03:48 +0000] [2031] [INFO] Starting gunicorn 19.9.0
[2021-04-07 17:03:48 +0000] [2031] [ERROR] Connection in use: ('0.0.0.0', 8080)
[2021-04-07 17:03:48 +0000] [2031] [ERROR] Retrying in 1 second.
[2021-04-07 17:03:49 +0000] [2031] [ERROR] Connection in use: ('0.0.0.0', 8080)
[2021-04-07 17:03:49 +0000] [2031] [ERROR] Retrying in 1 second.
[2021-04-07 17:03:50 +0000] [2031] [ERROR] Connection in use: ('0.0.0.0', 8080)
[2021-04-07 17:03:50 +0000] [2031] [ERROR] Retrying in 1 second.
[2021-04-07 17:03:51 +0000] [2031] [ERROR] Connection in use: ('0.0.0.0', 8080)
[2021-04-07 17:03:51 +0000] [2031] [ERROR] Retrying in 1 second.
[2021-04-07 17:03:52 +0000] [2031] [ERROR] Connection in use: ('0.0.0.0', 8080)
[2021-04-07 17:03:52 +0000] [2031] [ERROR] Retrying in 1 second.
[2021-04-07 17:03:53 +0000] [2031] [ERROR] Can't connect to ('0.0.0.0', 8080)
^C[2021-04-07 17:03:57,021] {webserver_command.py:431} INFO - Received signal: 2. Closing gunicorn.
$ echo $?
0

Pressing Ctrl-C while Gunicorn is still connecting also correctly sends a SIGINT (2) to Gunicorn, and makes it exit after the connection failures:

$ airflow webserver
...
=================================================================
[2021-04-07 17:01:53 +0000] [1923] [INFO] Starting gunicorn 19.9.0
[2021-04-07 17:01:53 +0000] [1923] [ERROR] Connection in use: ('0.0.0.0', 8080)
[2021-04-07 17:01:53 +0000] [1923] [ERROR] Retrying in 1 second.
[2021-04-07 17:01:54 +0000] [1923] [ERROR] Connection in use: ('0.0.0.0', 8080)
[2021-04-07 17:01:54 +0000] [1923] [ERROR] Retrying in 1 second.
^C[2021-04-07 17:01:55,025] {webserver_command.py:431} INFO - Received signal: 2. Closing gunicorn.
[2021-04-07 17:01:55 +0000] [1923] [ERROR] Connection in use: ('0.0.0.0', 8080)
[2021-04-07 17:01:55 +0000] [1923] [ERROR] Retrying in 1 second.
[2021-04-07 17:01:56 +0000] [1923] [ERROR] Connection in use: ('0.0.0.0', 8080)
[2021-04-07 17:01:56 +0000] [1923] [ERROR] Retrying in 1 second.
[2021-04-07 17:01:57 +0000] [1923] [ERROR] Connection in use: ('0.0.0.0', 8080)
[2021-04-07 17:01:57 +0000] [1923] [ERROR] Retrying in 1 second.
[2021-04-07 17:01:58 +0000] [1923] [ERROR] Can't connect to ('0.0.0.0', 8080)
$ echo $?
0
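The parent/child pattern visible in these logs (the parent logs the signal, then closes the Gunicorn subprocess) can be sketched roughly as follows. This is an illustration under assumptions, not the code in webserver_command.py: `start_child`, `stop_child` and `install_handlers` are hypothetical helpers, and a sleeping Python process stands in for the Gunicorn master.

```python
import signal
import subprocess
import sys


def start_child():
    """Start a long-running child process (a stand-in for the Gunicorn master)."""
    return subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])


def stop_child(child, timeout=10.0):
    """Ask the child to exit with SIGTERM; fall back to SIGKILL after timeout."""
    child.terminate()
    try:
        child.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        child.kill()
        child.wait()
    return child.returncode


def install_handlers(child):
    """On SIGINT/SIGTERM in the parent, stop the child, then exit the parent."""

    def handle(signum, frame):
        print(f"Received signal: {signum}. Closing child.")
        stop_child(child)
        sys.exit(0)

    signal.signal(signal.SIGINT, handle)
    signal.signal(signal.SIGTERM, handle)
```

A parent would call `start_child()`, then `install_handlers(child)` straight away, and only then block on `child.wait()`. Because the handler falls back to SIGKILL after the timeout, a child that ignores the first signal while it is still retrying cannot keep the parent alive indefinitely.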

The behaviour is the same with Gunicorn 19.9.0 (which the OP uses) and 19.10.0 (the version suggested by the official constraints file). I tested against Airflow master, but could not find any relevant differences in 1.10.10.

I wonder if the Python version is relevant here?

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in the next 7 days if no further activity occurs from the issue author.

github-actions[bot] commented 2 years ago

This issue has been closed because it has not received a response from the issue author.