Open ermalguni opened 5 years ago
Howdy @ermalguni, I see your bug report is for celery 4.3.0. Can you try the latest release candidate for celery (pip install celery==4.4.0rc3) and pin your kombu version to kombu==4.6.3?
Use 4.6.3 for now because 4.6.4 introduced a redis bug that I am trying to sort out over here: https://github.com/celery/kombu/pull/1089
Could you also check whether you have picked an appropriate value for prefetch-multiplier? The current docs say this: If you have many tasks with a long duration you want the multiplier value to be one: meaning it’ll only reserve one task per worker process at a time.
However – If you have many short-running tasks, and throughput/round trip latency is important to you, this number should be large. The worker is able to process more tasks per second if the messages have already been prefetched, and is available in memory. You may have to experiment to find the best value that works for you. Values like 50 or 150 might make sense in these circumstances. Say 64, or 128. https://docs.celeryproject.org/en/latest/userguide/optimizing.html#prefetch-limits
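To make the trade-off in that docs excerpt concrete, here is a minimal sketch in the style of a celeryconfig.py module. `worker_prefetch_multiplier` and `worker_concurrency` are Celery's setting names; the concurrency value of 4 and the helper `max_reserved_messages` are assumptions made up for illustration and are not part of Celery. It relies on the documented behaviour that a worker reserves up to multiplier times concurrency messages at once.

```python
# celeryconfig.py-style sketch. The values are illustrative assumptions,
# not recommendations for this particular deployment.

# Celery setting: how many messages each worker process may reserve.
# 1 suits long-running tasks; larger values favour many short tasks.
worker_prefetch_multiplier = 1

# Celery setting: number of worker processes. 4 is a made-up example value.
worker_concurrency = 4


def max_reserved_messages(multiplier: int, concurrency: int) -> int:
    """Hypothetical helper: upper bound on the messages one worker node
    holds reserved (prefetched) at a time."""
    return multiplier * concurrency


if __name__ == "__main__":
    # Compare a conservative setting with the larger values the docs mention.
    for multiplier in (1, 64, 128):
        reserved = max_reserved_messages(multiplier, worker_concurrency)
        print(f"multiplier={multiplier}: up to {reserved} reserved messages")
```

With a large multiplier, messages already reserved by running workers stay with those workers until they are processed, so it may be worth checking whether that interacts with instances being added or removed under load.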
If you can run some more tests on those package versions, I would like to hear whether this is still an issue.
Hi,
We have celery workers running on EC2 instances. Those instances scale up/down based on CPU load. Under normal load the max latency is <= 100ms. When we start adding more load and scaling kicks in, the max latency goes up to >= 1000ms. Our current guess is that when more celery workers are added or removed, kombu somehow has a hard time coordinating between them and delivering tasks.
We are running our celery workers in supervisor: