aelzeiny / airflow-aws-executors

Airflow Executor for both AWS ECS & AWS Fargate
MIT License

Add support for adopting orphaned task instances #15

Open mattellis opened 3 years ago

mattellis commented 3 years ago

Addresses this issue: https://github.com/aelzeiny/airflow-aws-executors/issues/14

Adds optional support for "adopting orphaned task instances" to the Batch and Fargate executors. With adoption enabled, the executor no longer terminates its Batch jobs / Fargate tasks when the scheduler / executor shuts down; it leaves them running instead. When a new scheduler / executor boots up, it tries to "adopt" the orphaned task instances by using their external_executor_id to resume synchronising the corresponding task statuses from Batch / Fargate.

Airflow documentation on this behaviour is limited, though there is some basic context in the scheduler "tunables" doc: https://airflow.apache.org/docs/apache-airflow/stable/scheduler.html#scheduler-tuneables

This feature is disabled by default, but can be enabled by setting the following conf option in either executor:

[batch | ecs_fargate]
...
adopt_task_instances = True

or by env var:

AIRFLOW__BATCH__ADOPT_TASK_INSTANCES=True
AIRFLOW__ECS_FARGATE__ADOPT_TASK_INSTANCES=True
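
As a rough illustration of how the env var names above map to the conf option, here is a minimal sketch. It only mimics Airflow's `AIRFLOW__{SECTION}__{KEY}` naming convention; `env_var_name` and `adoption_enabled` are hypothetical helpers, not part of this PR, and a real executor would presumably read the option through Airflow's own configuration object instead.

```python
import os
from typing import Mapping

# Values treated as "true" in this sketch (an assumption, compared case-insensitively).
TRUE_VALUES = {"t", "true", "1"}


def env_var_name(section: str, key: str) -> str:
    """Build the env var name for a conf option: AIRFLOW__{SECTION}__{KEY}."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


def adoption_enabled(section: str, environ: Mapping[str, str] = os.environ) -> bool:
    """Hypothetical helper: is adopt_task_instances switched on for this section?

    The feature is disabled by default, so a missing variable means False.
    """
    raw = environ.get(env_var_name(section, "adopt_task_instances"), "False")
    return raw.strip().lower() in TRUE_VALUES
```

For example, `adoption_enabled("batch", {"AIRFLOW__BATCH__ADOPT_TASK_INSTANCES": "True"})` evaluates to `True`, while an empty environment leaves the feature off.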

To support adoption of orphaned tasks, the BatchExecutor only needs to store the AWS Batch job_id in the TaskInstance.external_executor_id field when it submits a job, and to implement the BaseExecutor.try_adopt_task_instances method. That method simply passes each orphaned task instance's key and external_executor_id to the active_workers.add_job method of the newly booted executor.
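
The Batch adoption flow can be sketched roughly as follows. This is not the code from this PR: `OrphanTaskInstance`, `JobCollection` and `SketchBatchExecutor` are hypothetical stand-ins for Airflow's TaskInstance, the executor's active_workers collection and the BatchExecutor, and the `add_job(job_id, key)` argument order is an assumption. The only Airflow behaviour relied on is that try_adopt_task_instances returns the task instances it could not adopt.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple

# Stand-in for Airflow's task instance key (dag_id, task_id, run_id, try_number).
TaskKey = Tuple[str, str, str, int]


@dataclass
class OrphanTaskInstance:
    """Hypothetical stand-in for the TaskInstance fields used during adoption."""
    key: TaskKey
    external_executor_id: Optional[str] = None  # AWS Batch job_id stored at submit time


class JobCollection:
    """Hypothetical stand-in for the executor's active_workers collection."""

    def __init__(self) -> None:
        self.job_id_to_key: Dict[str, TaskKey] = {}

    def add_job(self, job_id: str, key: TaskKey) -> None:
        self.job_id_to_key[job_id] = key


class SketchBatchExecutor:
    def __init__(self, adopt_task_instances: bool = False) -> None:
        self.adopt_task_instances = adopt_task_instances
        self.active_workers = JobCollection()

    def try_adopt_task_instances(
        self, tis: Sequence[OrphanTaskInstance]
    ) -> List[OrphanTaskInstance]:
        """Re-register orphaned jobs; return the task instances left unadopted."""
        if not self.adopt_task_instances:
            return list(tis)  # feature disabled: adopt nothing
        not_adopted = []
        for ti in tis:
            if ti.external_executor_id:
                # Resume tracking the still-running Batch job under its old key.
                self.active_workers.add_job(ti.external_executor_id, ti.key)
            else:
                not_adopted.append(ti)  # no job_id recorded, cannot adopt
        return not_adopted
```

Task instances returned from the method are left for the scheduler to handle, which is why an orphan without a recorded job_id is handed back rather than silently dropped.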

The Fargate executor can support task adoption with the same flow, storing the Fargate task_arn in external_executor_id. The one difference is that its try_adopt_task_instances method must call describe_tasks() with the orphaned task ARNs to fetch the full Fargate task attributes that its active_workers collection requires.
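
A minimal sketch of the extra describe_tasks() lookup, under stated assumptions: `describe_orphans` is a hypothetical helper rather than the PR's implementation, and `FakeEcsClient` is a stub that imitates only the `tasks` / `failures` shape of the ECS describe_tasks response so the example runs without AWS access. Requests are chunked because describe_tasks accepts at most 100 tasks per call.

```python
from typing import Any, Dict, List, Tuple

# Stand-in for Airflow's task instance key (dag_id, task_id, run_id, try_number).
TaskKey = Tuple[str, str, str, int]

MAX_TASKS_PER_CALL = 100  # describe_tasks limit on task ARNs per request


def describe_orphans(
    ecs_client: Any, cluster: str, orphans: Dict[str, TaskKey]
) -> Tuple[Dict[str, Tuple[TaskKey, dict]], List[str]]:
    """Look up orphaned task ARNs; return (adopted, missing_arns).

    `orphans` maps each task ARN (the stored external_executor_id) to its key.
    `adopted` maps ARN -> (task key, full task description from ECS).
    """
    arns = list(orphans)
    adopted: Dict[str, Tuple[TaskKey, dict]] = {}
    for start in range(0, len(arns), MAX_TASKS_PER_CALL):
        chunk = arns[start:start + MAX_TASKS_PER_CALL]
        response = ecs_client.describe_tasks(cluster=cluster, tasks=chunk)
        for task in response.get("tasks", []):
            arn = task["taskArn"]
            if arn in orphans:
                adopted[arn] = (orphans[arn], task)
    # Anything ECS did not describe cannot be adopted.
    missing = [arn for arn in arns if arn not in adopted]
    return adopted, missing


class FakeEcsClient:
    """Stub client for illustration only; a real one would come from boto3."""

    def __init__(self, known: Dict[str, str]) -> None:
        self.known = known  # task ARN -> lastStatus

    def describe_tasks(self, cluster: str, tasks: List[str]) -> dict:
        found = [{"taskArn": a, "lastStatus": self.known[a]} for a in tasks if a in self.known]
        failures = [{"arn": a, "reason": "MISSING"} for a in tasks if a not in self.known]
        return {"tasks": found, "failures": failures}
```

In an executor, the task instances behind the `missing` ARNs would be the ones returned from try_adopt_task_instances as not adopted, while each entry in `adopted` carries the attributes needed to repopulate active_workers.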