aelzeiny closed 3 years ago
Just hit this last night while attempting to deploy it into a working environment :)
```
2021-03-17 00:26:08,951 ERROR| Exception when executing SchedulerJob._run_scheduler_loop
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1280, in _execute
    self._run_scheduler_loop()
  File "/usr/local/lib/python3.8/site-packages/airflow/jobs/scheduler_job.py", line 1384, in _run_scheduler_loop
    self.executor.heartbeat()
  File "/usr/local/lib/python3.8/site-packages/airflow/executors/base_executor.py", line 162, in heartbeat
    self.sync()
  File "/usr/local/lib/python3.8/site-packages/airflow_aws_executors/batch_executor.py", line 91, in sync
    describe_job_response = self._describe_tasks(all_job_ids)
  File "/usr/local/lib/python3.8/site-packages/airflow_aws_executors/batch_executor.py", line 107, in _describe_tasks
    boto_describe_tasks = self.batch.describe_jobs(jobs=batched_job_ids)
  File "/usr/local/lib/python3.8/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.8/site-packages/botocore/client.py", line 676, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.ClientException: An error occurred (ClientException) when calling the DescribeJobs operation: Error executing request, Exception : At least one JobId has to be passed, RequestId: c5cdc164-de08-4d11-b1ff-5758a511e6c8
```
It crashes on anything where number_of_task_ids % 99 == 0
Exactly 99 tasks will be merged
What happens if there are EXACTLY 99 task ids? Well, the first boto3 call will describe tasks 0-99, which is all of them. The second call will describe whatever comes from 99 onward. But that is actually just an empty list, so boto3 freaks out with "At least one JobId has to be passed" & the scheduler crashes... Gross
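
To make the off-by-one concrete, here is a minimal sketch. It assumes the executor splits job ids into batches of 99 using a `len // batch_size + 1` loop; that shape is inferred from the traceback and the `% 99` observation above, not copied from `batch_executor.py`. The names `buggy_batches` and `safe_batches` are hypothetical.

```python
from typing import List

BATCH_SIZE = 99  # assumed batch size, taken from the "% 99 == 0" observation


def buggy_batches(job_ids: List[str], batch_size: int = BATCH_SIZE) -> List[List[str]]:
    """Hypothetical reconstruction of the failing pattern: one loop pass too many."""
    count = len(job_ids) // batch_size + 1
    return [job_ids[i * batch_size:(i + 1) * batch_size] for i in range(count)]


def safe_batches(job_ids: List[str], batch_size: int = BATCH_SIZE) -> List[List[str]]:
    """Step through the list by batch_size so an empty batch is never produced."""
    return [job_ids[i:i + batch_size] for i in range(0, len(job_ids), batch_size)]


ids = [f"job-{n}" for n in range(99)]
print([len(b) for b in buggy_batches(ids)])  # [99, 0] -> the empty batch is what the API rejects
print([len(b) for b in safe_batches(ids)])   # [99]
```

Stepping `range` by the batch size sidesteps the problem for every multiple of 99, since a start index is only generated when at least one id remains. Filtering out empty batches before calling `describe_jobs` would be another way to get the same effect.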