Multiple celery fork pool workers don't work

celery / kombu

Messaging library for Python.

http://kombu.readthedocs.org/

BSD 3-Clause "New" or "Revised" License

2.83k stars 922 forks source link

Multiple celery fork pool workers don't work #1785

Open lzgabel opened 10 months ago

lzgabel commented 10 months ago

ENV: Broker: Redis Celery: 5.2.7 Python: 3.9.0

👋 Hi All. FYI, since 5.3.2 was released, we encountered multiple celery fork pool workers don't work, so we rolled back to 5.3.1 and everything returned to normal.

Version: 5.3.2

Version: 5.3.1

🤔 We compared the kombu version changes, and when we reverted this PR: #1733 in version 5.3.2, all workers worked normally.

cc @auvipy @Nusnus @mfaw @mbierma.

mbierma commented 10 months ago

@lzgabel do you have an example to repo the issue?

lzgabel commented 10 months ago

Hi @mbierma. After our investigation, due to the following code existing in the system:

@celeryd_init.connect
def recover_job(sender=None, conf=None, **kwargs) -> None:
    i = app.control.inspect()
    running_list = []
    for worker_list in i.active().values():
        for task_item in worker_list:
            name = task_item.get('name')
            # do something
            args = task_item.get('args')
            job_id = args[1]
            if job_id:
                running_list.append(job_id)
        logger.info(f'lost job : {running_list}')

It will be blocked here _brpop_read() because of the content fixed by this PR #1733.

mfaw commented 10 months ago

hi @lzgabel , I think this is different than my issue. My issue is not fork pool workers but celery workers themselves. I can have 2 celery workers when I'm using rabbitmq as a broker but, when I use kafka as a broker, only one celery worker is working.

ojnas commented 8 months ago

We have exactly the same problem. Downgrading kombu to 5.3.1 solves it.

thuibr commented 3 months ago

hi @lzgabel , I think this is different than my issue. My issue is not fork pool workers but celery workers themselves. I can have 2 celery workers when I'm using rabbitmq as a broker but, when I use kafka as a broker, only one celery worker is working.

@mfaw is this still the case? If so then maybe I'll put it as a limitation in my celery documentation ticket here https://github.com/celery/celery/pull/8935

thuibr commented 3 months ago

It seems that if you specify another queue, like add instead of the default celery queue, then the task is routed to a new topic called add. If you do that then you can also create a second worker listening on the add queue. Then you can have two working. It is unfortunate that you can't have two workers listening to the same queue though. That definitely seems like a regression. I wonder if we can construct a test case for this.

thuibr commented 3 months ago

I wonder if we can use partitions to allow for multiple workers.

ojnas commented 2 months ago

Is there any plan to fix the issue reported by @lzgabel or is there any known workaround? It's still preventing us from upgrading kombu and celery beyond 5.3.1.

thuibr commented 2 months ago

Is there any plan to fix the issue reported by @lzgabel or is there any known workaround? It's still preventing us from upgrading kombu and celery beyond 5.3.1.

I don't think that there is a maintainer for Kafka at the moment.

ojnas commented 2 months ago

Is there any plan to fix the issue reported by @lzgabel or is there any known workaround? It's still preventing us from upgrading kombu and celery beyond 5.3.1.

I don't think that there is a maintainer for Kafka at the moment.

The original issue reported by @lzgabel is not for Kafka but using Redis as broker, which we are also using.

thuibr commented 1 day ago

@lzgabel @ojnas I am unable to reproduce the issue using Celery 5.2.7 and Kombu 5.3.2, Redis-Py 5.0.7, and Python 3.9.0. What version of Redis are you using? I am using

$ redis-server --version
Redis server v=6.0.16 sha=00000000:0 malloc=jemalloc-5.2.1 bits=64 build=a3fdef44459b3ad6

[2024-07-28 10:38:49,622: INFO/ForkPoolWorker-1] Task tasks.add[8adf78ef-53ff-4845-80d3-2582089060fb] succeeded in 10.009252441999706s: 8
[2024-07-28 10:38:59,371: INFO/ForkPoolWorker-2] Task tasks.add[b59745df-bef0-48be-9667-691dc1893aa9] succeeded in 10.008264554000107s: 8

Additionally, have you tried the latest versions of Kombu and Celery?