Open mik-laj opened 4 years ago
@auvipy Can you look at it? You're a Celery expert. I think Celery doesn't support it yet, but I might be wrong.
Celery's Redis support actually needs more care :) Given my current time and other priorities in Celery, I haven't contributed much to the Redis part. I'm more focused on AMQP 1.0 and Kafka support and an asyncio-based worker...
This could be a good improvement.
Description
Hello, I recently worked on CeleryExecutor. I managed to optimize status retrieval by using bulk operations: instead of fetching the status of each task with a separate query, a single query is sent for all tasks. In many cases this made the process more than 100 times faster. https://github.com/apache/airflow/pull/7542

However, we still send tasks to the queue one request at a time, spread across many processes. This is very inefficient because each request pays the network latency of a full round trip. https://github.com/apache/airflow/blob/f1dc2e0b0e358582c1df0cc07a5cc95fa721dc44/airflow/executors/celery_executor.py#L196-L206

It would be nice if all tasks could be sent in bulk, in a single request. For Redis, this means using a Pipeline. https://github.com/andymccurdy/redis-py#pipelines
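To make the idea concrete, here is a minimal sketch of what pipelining looks like at the redis-py level. It is not how Celery or Airflow publishes tasks today: the function name `enqueue_pipelined`, the queue name, and the plain-string payloads are all hypothetical, and a real Celery message has its own serialized format. The only redis-py calls assumed are `pipeline(transaction=False)`, `lpush`, and `execute`.

```python
from typing import Iterable, List


def enqueue_pipelined(client, queue: str, payloads: Iterable[str]) -> List[int]:
    """Push every payload onto `queue` using one pipelined request.

    `client` is assumed to be a redis-py style client (e.g. redis.Redis()).
    The payloads are hypothetical opaque strings, not real Celery messages.
    """
    # transaction=False asks for plain pipelining without MULTI/EXEC:
    # commands are buffered client-side and sent together on execute().
    with client.pipeline(transaction=False) as pipe:
        for payload in payloads:
            pipe.lpush(queue, payload)  # buffered, nothing is sent yet
        # One network round trip for the whole batch; returns one result
        # per buffered command (for LPUSH, the list length after the push).
        return pipe.execute()
```

With a running Redis server this could be called as `enqueue_pipelined(redis.Redis(), "demo-queue", ["a", "b", "c"])`, which would cost one round trip instead of three. Whether Celery can expose something equivalent through its publishing API is exactly the open question here.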
Can it be done easily in Celery?
Best regards, Kamil