aelzeiny / airflow-aws-executors

Airflow Executor for both AWS ECS & AWS Fargate
MIT License

Scheduler crashes at EXACTLY 99 tasks. #11

Closed aelzeiny closed 3 years ago

aelzeiny commented 3 years ago

What happens if there are EXACTLY 99 task ids? Well, the first boto3 call will handle tasks 0-99. The second call will handle 99-99. But 99-99 is actually just an empty list, so boto3 raises an error & the scheduler crashes... Gross
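To make the failure mode concrete, here is a minimal sketch of a chunking loop that ends with an empty batch. It is a reconstruction for illustration only: the executor's actual code is not quoted in this thread, `buggy_batches` is a hypothetical name, and the batch size of 99 is an assumption taken from the `% 99` observation further down.

```python
MAX_BATCH_SIZE = 99  # assumed from this thread; not confirmed against the executor source


def buggy_batches(job_ids, size=MAX_BATCH_SIZE):
    """Hypothetical chunking loop that always runs one iteration too many
    when len(job_ids) is an exact multiple of size."""
    batches = []
    # len // size + 1 iterations: correct only when there is a remainder
    for i in range(len(job_ids) // size + 1):
        batches.append(job_ids[i * size:(i + 1) * size])
    return batches


batches = buggy_batches([f"job-{n}" for n in range(99)])
print([len(b) for b in batches])  # [99, 0]
```

The trailing empty list is what would be passed to the boto3 call, which rejects it.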

leonsmith commented 3 years ago

Just hit this while attempting to deploy it into a working environment last night :)


```
2021-03-17 00:26:08,951 ERROR| Exception when executing SchedulerJob._run_scheduler_loop
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1280, in _execute
    self._run_scheduler_loop()
  File "/usr/local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1384, in _run_scheduler_loop
    self.executor.heartbeat()
  File "/usr/local/lib/python3.8/site-packages/airflow/executors/base_executor.py", line 162, in heartbeat
    self.sync()
  File "/usr/local/lib/python3.8/site-packages/airflow_aws_executors/batch_executor.py", line 91, in sync
    describe_job_response = self._describe_tasks(all_job_ids)
  File "/usr/local/lib/python3.8/site-packages/airflow_aws_executors/batch_executor.py", line 107, in _describe_tasks
    boto_describe_tasks = self.batch.describe_jobs(jobs=batched_job_ids)
  File "/usr/local/lib/python3.8/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.8/site-packages/botocore/client.py", line 676, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.ClientException: An error occurred (ClientException) when calling the DescribeJobs operation: Error executing request, Exception : At least one JobId has to be passed, RequestId: c5cdc164-de08-4d11-b1ff-5758a511e6c8
```

leonsmith commented 3 years ago

It crashes on anything where number_of_task_ids % 99 == 0
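One way to avoid the empty batch for every multiple of 99 is to step through the list by start offset instead of computing an iteration count. This is only a sketch under assumptions, not the fix that was actually merged: `chunked` is a hypothetical helper and the default size of 99 is taken from the comment above.

```python
def chunked(job_ids, size=99):
    """Yield consecutive slices of at most `size` items.

    range(0, len, size) produces no start offset at or past the end of the
    list, so an empty slice is never yielded (an empty input yields nothing).
    """
    for start in range(0, len(job_ids), size):
        yield job_ids[start:start + size]


# Print the batch sizes for a few task counts, including multiples of 99
for count in (0, 98, 99, 100, 198):
    sizes = [len(batch) for batch in chunked(list(range(count)))]
    print(count, sizes)
```

With this shape, a caller would issue one boto3 request per yielded batch and none at all when there are no job ids.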

aelzeiny commented 3 years ago

The fix for the exactly-99-tasks case will be merged.