
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 6 #7324

Open liangpinglk opened 2 years ago

liangpinglk commented 2 years ago

Celery info:

 -------------- celery@d7a11b1e418d v5.1.0 (sun-harmonics)
--- ***** ----- 
-- ******* ---- Linux-5.10.76-linuxkit-x86_64-with-Ubuntu-18.04-bionic 2022-02-26 00:54:25
- *** --- * --- 
- ** ---------- [config]
- ** ---------- .> app:         NLP_Platform:0x7fd915f02750
- ** ---------- .> transport:   redis://172.17.0.4:6379/0
- ** ---------- .> results:     redis://172.17.0.4:6379/0
- *** --- * --- .> concurrency: 8 (prefork)
-- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
--- ***** ----- 
 -------------- [queues]
                .> celery           exchange=celery(direct) key=celery

Error:

[2022-02-26 00:54:42,330: ERROR/MainProcess] Process 'ForkPoolWorker-4' pid:88688 exited with 'signal 6 (SIGABRT)'
[2022-02-26 00:54:42,353: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 6 (SIGABRT) Job: 0.')
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/billiard/pool.py", line 1267, in mark_as_worker_lost
    human_status(exitcode), job._job),
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 6 (SIGABRT) Job: 0.

liangpinglk commented 2 years ago

I found that this problem has been discussed for a long time, but it still exists. How can I fix it? (I need multiprocessing.)
https://github.com/celery/celery/issues/2958
https://groups.google.com/g/celery-users/c/E1kYCQySzuE?pli=1
https://stackoverflow.com/questions/56767461/celery-workerlosterror-worker-exited-prematurely-signal-6-sigabrt

nadzhou commented 1 year ago

The solution I found was to handle the exception internally and move on: wrap the call in a try/except and ignore the WorkerLostError.
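
In code, that suggestion looks roughly like the sketch below; the task and its argument are made-up placeholders, and this only hides the failure on the caller's side, since the pool child still dies and the task's work is lost.

from billiard.exceptions import WorkerLostError

from myapp.tasks import process_document  # hypothetical task module

result = process_document.apply_async(args=("doc-42",))
try:
    payload = result.get(timeout=60)
except WorkerLostError:
    # The pool child was killed (SIGABRT/SIGTERM); treat the task as failed and move on.
    payload = None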

JulienPalard commented 1 year ago

Got it today as: Worker exited prematurely: signal 15 (SIGTERM)

with:

software -> celery:5.2.7 (dawn-chorus) kombu:5.2.4 py:3.9.2
            billiard:3.6.4.0 redis:4.5.4
platform -> system:Linux arch:64bit, ELF
            kernel version:5.10.0-9-amd64 imp:CPython

Stack trace:

WorkerLostError: Worker exited prematurely: signal 15 (SIGTERM) Job: 11.
  File "hkis/consumers.py", line 166, in answer
    is_valid, message = await check_answer(
  File "hkis/tasks.py", line 174, in check_answer
    return await asyncio.get_running_loop().run_in_executor(
  File "concurrent/futures/thread.py", line 52, in run
    result = self.fn(*self.args, **self.kwargs)
  File "hkis/tasks.py", line 172, in sync_celery_check_answer
    return check_answer_task.apply_async((answer,), expires=60).get()
  File "celery/result.py", line 224, in get
    return self.backend.wait_for_pending(
  File "celery/backends/asynchronous.py", line 223, in wait_for_pending
    return result.maybe_throw(callback=callback, propagate=propagate)
  File "celery/result.py", line 336, in maybe_throw
    self.throw(value, self._to_remote_traceback(tb))
  File "celery/result.py", line 329, in throw
    self.on_ready.throw(*args, **kwargs)
  File "vine/promises.py", line 234, in throw
    reraise(type(exc), exc, tb)
  File "vine/utils.py", line 30, in reraise
    raise value

It turned out that a process had been killed by the OOM killer.

Here's what I'm seeing from a systemd point of view after reproducing the issue:

hkis-celery.service: A process of this unit has been killed by the OOM killer.
worker: Warm shutdown (MainProcess)
[2023-04-20 09:09:17,070: ERROR/MainProcess] Process 'ForkPoolWorker-1' pid:1057626 exited with 'signal 15 (SIGTERM)'
[2023-04-20 09:09:17,288: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 15 (SIGTERM) Job: 23.')
Traceback (most recent call last):
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/celery/worker/worker.py", line 203, in start
    self.blueprint.start(self)
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/celery/bootsteps.py", line 116, in start
    step.start(parent)
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/celery/bootsteps.py", line 365, in start
    return self.obj.start()
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/celery/worker/consumer/consumer.py", line 332, in start
    blueprint.start(self)
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/celery/bootsteps.py", line 116, in start
    step.start(parent)
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/celery/worker/consumer/consumer.py", line 628, in start
    c.loop(*c.loop_args())
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/celery/worker/loops.py", line 97, in asynloop
    next(loop)
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/kombu/asynchronous/hub.py", line 295, in create_loop
    tick_callback()
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/kombu/transport/redis.py", line 1311, in on_poll_start
    cycle_poll_start()
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/kombu/transport/redis.py", line 532, in on_poll_start
    self._register_BRPOP(channel)
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/kombu/transport/redis.py", line 518, in _register_BRPOP
    channel._brpop_start()
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/kombu/transport/redis.py", line 950, in _brpop_start
    self.client.connection.send_command(*command_args)
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/redis/connection.py", line 841, in send_command
    self._command_packer.pack(*args),
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/redis/connection.py", line 554, in pack
    buff = SYM_EMPTY.join((SYM_STAR, str(len(args)).encode(), SYM_CRLF))
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/celery/apps/worker.py", line 299, in _handle_request
    raise exc(exitcode)
celery.exceptions.WorkerShutdown: 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/opt/hkis-celery/venv/lib/python3.9/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
    raise WorkerLostError(
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 15 (SIGTERM) Job: 23.
hkis-celery.service: Failed with result 'oom-kill'.
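
One mitigation that is not from this thread but may be worth noting for the OOM case: Celery can recycle pool children itself before they grow large enough for the kernel to kill them. A rough celeryconfig sketch, with an illustrative threshold that would need tuning (it only helps when memory accumulates across tasks, not when a single task outgrows the host):

# Replace a pool child once its resident memory passes ~300 MiB (the setting
# is expressed in kilobytes); billiard restarts it between tasks instead of
# the OOM killer SIGTERMing it mid-task.
worker_max_memory_per_child = 300_000  # illustrative value
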
richardbrockie commented 1 year ago

Hi,

I've run into this error when trying to roll forward from python 3.10.9 (which works fine). I get it with both 3.10.10 and 3.10.11:

celery: 5.2.7
billiard: 3.6.4.0
Django: 3.2.16 (LTS)
macOS: 12.6.3
python: 3.10.9 works; 3.10.10 & 3.10.11 have the problem

Some celery tasks complete successfully (e.g. sending email). However, it appears to be the first interaction with a Django model that triggers the problem. Here's the line in my code that generates the exception; task_model can be one of two different models that track the processing task, task_id is valid, and the .get() should return a single instance:

begin_processing_task = task_model.objects.get(id=task_id)

here's the exception:

objc[70333]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[70333]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
[2023-05-08 20:00:04,742: ERROR/MainProcess] Process 'ForkPoolWorker-1' pid:70333 exited with 'signal 6 (SIGABRT)'
[2023-05-08 20:00:04,755: ERROR/MainProcess] Task handler raised error: WorkerLostError('Worker exited prematurely: signal 6 (SIGABRT) Job: 0.')
Traceback (most recent call last):
  File "/Users/richard/VirtualEnvs/ontheday_heroku_3.10.10/lib/python3.10/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
    raise WorkerLostError(
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 6 (SIGABRT) Job: 0.
[2023-05-08 20:43:00,037:

richardbrockie commented 1 year ago

Some more detail on this problem.

I'm running a project on Heroku which forces the use of psycopg2-binary instead of psycopg2. There was a recent update to v2.9.6 that was confusing me when I initially ran into the problem. Recent debugging has revealed the following:

Works: MacOS on Apple Silicon

  • Celery 5.2.7 with python 3.10.9 with psycopg2-binary 2.9.5

Does not work: MacOS on Apple Silicon

  • Celery 5.2.7 with python 3.10.9 with psycopg2-binary 2.9.6

Works: MacOS on Apple Silicon

  • Celery 5.2.7 with python 3.10.9 with psycopg2 2.9.6 (no -binary)

Also works: MacOS on Intel silicon

  • Celery 5.2.7 with python 3.10.9 with psycopg2-binary 2.9.6 (with -binary)

So, for some reason, celery with psycopg2-binary 2.9.6 on Apple silicon is causing problems.

Any ideas??

candleindark commented 1 year ago

@richardbrockie I am encountering the same problem on MacOS on Apple Silicon, with Celery 5.2.7, python 3.9.16, and psycopg2 2.9.6 and 2.9.5 (which I tried after reading this post). I didn't try the binary version of psycopg2, since it is not advised for development.

This problem only occurred after I updated SQLAlchemy (from 1.4 to 2.0) and other related libraries.

The problem doesn't occur if I run the Celery worker inside a Docker container on the same Mac.

richardbrockie commented 1 year ago

@candleindark I also reported the problem in the psycopg2 repo: https://github.com/psycopg/psycopg2/issues/1593, where they pointed out that this is likely to be due to how macOS forks processes. What OS are you running in your docker container? I'm pretty sure it won't be macOS?

After playing around on Saturday, I now have a satisfactory work-around. I've verified that the -binary of v2.9.6 works fine in my production Heroku deployment, so am now specifying different requirements based on the OS. I have updated to python 3.10.12 as part of my fiddling:

# different versions of psycopg2 for different platforms...
psycopg2==2.9.6; sys_platform == "darwin"
psycopg2-binary==2.9.6; sys_platform == "linux"

During this past weekend I did manage to have both psycopg2 and psycopg2-binary v2.9.6 installed side by side in a venv, which at times had me thinking the non-binary package had the same problem.

Later this year I'll be rolling forward from Django 3.2.x LTS to 4.2 and be able to move to psycopg3 which I expect to be better tested in all the possible development environments.

Heroku's example app (which looks up to date) seems to support my suspicion; their requirements.txt has this: https://github.com/heroku/python-getting-started/blob/main/requirements.txt

# Uncomment these lines to use a Postgres database. Both are needed, since in production
# (which uses Linux) we want to install from source, so that security updates from the
# underlying Heroku stack image are picked up automatically, thanks to dynamic linking.
# On other platforms/in development, the precompiled binary package is used instead, to
# speed up installation and avoid errors from missing libraries/headers.
#psycopg; sys_platform == "linux"
#psycopg[binary]; sys_platform != "linux"

candleindark commented 1 year ago

@richardbrockie My docker container is running Debian 11.

This problem only affects me in development, when I don't run the Celery worker in a container on my Mac. I guess I will just start running the Celery worker in a container even in development from now on.

I think I will consider switching to Psycopg 3 as well. I only use Psycopg2 through SQLAlchemy, not directly. Do you know if there is anything I need to pay attention to when making the switch from Psycopg2 to Psycopg3?

Thanks

frisia-mtz commented 1 year ago

Encountering the same issue.

Platform: macOS Ventura 13.4.1 (Intel)
Python 3.9.17 and 3.10.12
celery[redis]==5.2.7 and celery[redis]==5.3.1

objc[7037]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called.
objc[7037]: +[__NSCFConstantString initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
2023-07-04 16:11:12,075 | ERROR    | MainProcess        | Process 'ForkPoolWorker-2' pid:7037 exited with 'signal 6 (SIGABRT)'
2023-07-04 16:11:12,088 | ERROR    | MainProcess        | Task handler raised error: WorkerLostError('Worker exited prematurely: signal 6 (SIGABRT) Job: 0.')
Traceback (most recent call last):
  File "/Users/user/project/.venv/lib/python3.9/site-packages/billiard/pool.py", line 1265, in mark_as_worker_lost
    raise WorkerLostError(
billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 6 (SIGABRT) Job: 0.

I found a workaround on Stack Overflow: setting OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES before executing celery solves the issue with multiprocessing on macOS.
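
For what it's worth, here is a hedged Python-side variant of that workaround; the module and app names are placeholders. The variable has to be in the process environment before anything loads an Objective-C framework, so setting it at the very top of the module that defines the Celery app may work, but exporting it in the shell that launches the worker is the more reliable route, and it should be treated as a macOS development convenience only.

# celery_app.py (hypothetical module name)
import os
import sys

if sys.platform == "darwin":
    # Development-only workaround for the objc "+initialize ... fork()" abort;
    # must run before psycopg2/NumPy/etc. pull in macOS frameworks.
    os.environ.setdefault("OBJC_DISABLE_INITIALIZE_FORK_SAFETY", "YES")

from celery import Celery

app = Celery("proj", broker="redis://localhost:6379/0")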

rahilbhansali commented 1 year ago

I've been facing the same issue. I've upgraded to python 3.10 and celery 2.3.3 now. My Python kept crashing and I thought it was a Celery issue.

But @richardbrockie's observations helped. I had been checking and downgrading all my packages except psycopg2-binary, where I had 2.9.7 installed. After reverting to 2.9.5, the crash has stopped on my M1 Max (Apple Silicon Mac running the macOS Sonoma public beta).

So this works for me: python 3.10, celery 2.3.3 and psycopg2-binary 2.9.5.

auvipy commented 1 year ago

Thanks all for the feedback and cross-checking.

dan-blanchard commented 1 year ago

I don't know that this issue should be closed. The psycopg2 folks seem to think this is an issue around how celery handles forking on macOS (as you can see in https://github.com/psycopg/psycopg2/issues/1593#issuecomment-1604096215), and the fact that downgrading psycopg2 fixes it doesn't necessarily mean they're wrong.

tholu commented 1 year ago

@auvipy Can you reopen the issue? I can reproduce it with python 3.8.18, celery 5.3.4 and psycopg2-binary 2.9.8.

rafalpietrzakio commented 1 year ago

It happens to me on Ubuntu 22.04 with psycopg2-binary and the newest stable Django plus the newest stable Celery. Trying to find the reason: it seems to occur while handling a massive number of tasks (I've reproduced it by generating thumbnails for a few thousand images); it hasn't happened to me yet with any 'single' task. Maybe that's a clue? I can't find any other reason so far.
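
A hedged aside, not something confirmed in this thread: if the crashes only show up after thousands of tasks, recycling each pool child after a fixed number of tasks can at least bound whatever is accumulating per process. The celeryconfig value below is illustrative only.

# Recycle each prefork child after 200 tasks so per-process build-up from
# large batches (e.g. thumbnail jobs) stays bounded. Untested suggestion.
worker_max_tasks_per_child = 200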

rafalpietrzakio commented 1 year ago

It seems that changing the systemd service type to simple instead of forking, and disabling celery multi, solves it (source: https://sam.hooke.me/note/2023/01/celery-and-systemd/)

auvipy commented 1 year ago

It seems that changing the systemd service type to simple instead of forking, and disabling celery multi, solves it (source: https://sam.hooke.me/note/2023/01/celery-and-systemd/)

That is a great one!

laurentiupiciu commented 11 months ago

Thanks a lot @richardbrockie

anuchitorigin commented 11 months ago

Hi there. I am very new to Python. I am using FastAPI with a Celery worker. Everything seems fine until I try to use PyGAD, which needs NumPy. The problem is that whenever I import NumPy in any Python file, this error occurs: "billiard.exceptions.WorkerLostError: Worker exited prematurely: signal 6". If anyone could help, it would be great. Thanks in advance.

heyman commented 11 months ago

Later this year I'll be rolling forward from Django 3.2.x LTS to 4.2 and be able to move to psycopg3 which I expect to be better tested in all the possible development environments.

I'm seeing the same error with Django 4.2 and the latest version of psycopg3 (psycopg[binary] version 3.1.13). Downgrading to psycopg2-binary 2.9.5 fixes the problem.

SabinoGs commented 10 months ago

I'm having the same issue with celery + ultralytics.

But I found a workaround to keep developing: running celery with --concurrency 1 and --pool solo. Here is the config I used in my VS Code launch.json:

"args": [
                "-A", 
                "celery_task_app.worker", 
                "worker",
                "-c", 
                "1",
                "--pool",
                "solo",
                "--loglevel=info"
            ],

Platform: Apple Silicon
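
The same workaround can also be pinned in configuration rather than on the command line; a minimal development-only sketch (where these settings live is an assumption about your project layout):

# celeryconfig.py -- mirrors "-c 1 --pool solo" for local development
worker_pool = "solo"       # run tasks in the worker's main process, no fork()
worker_concurrency = 1     # the solo pool executes one task at a time anyway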

richardbrockie commented 10 months ago

Thanks for the pointer to the --pool flag: this reminded me that I still have eventlet in my requirements list from when I was developing on Windows. I've confirmed that both --pool solo and --pool eventlet avoid the problem on Apple Silicon.

richardbrockie commented 10 months ago

Later this year I'll be rolling forward from Django 3.2.x LTS to 4.2 and be able to move to psycopg3 which I expect to be better tested in all the possible development environments.

I'm seeing the same error with Django 4.2 and the latest version of psycopg3 (psycopg[binary] version 3.1.13). Downgrading to psycopg2-binary 2.9.5 fixes the problem.

That's disappointing! :(

Can you comment whether setting the --pool solo option when running celery solves the problem?

richardbrockie commented 10 months ago

@heyman As expected, I also see the problem with psycopg[binary]. Setting the --pool flag resolves the problem as it does for later versions of psycopg2.

heyman commented 10 months ago

Setting the --pool flag resolves the problem as it does for later versions of psycopg2.

I guess that works if you're fine with running only a single worker process. For many projects I'm not, though.

danielmcquillen commented 8 months ago

Also experiencing this issue during Django/Celery development (macOS 13.6.4, python 3.11.6, psycopg[binary]==3.1.18, django==4.2.11, redis==5.0.3, celery==5.3.6).

I can confirm that adding --concurrency 1 --pool solo to my celery start script when developing locally works.

nirmagor commented 6 months ago

I have macOS Sonoma 14.4.1, Intel silicon, 8 cores.

python 3.10.11, celery ^5.4.0 / 5.2.7, psycopg[binary] 2.9.5 / 2.9.6

I ran with -c 4 and tried all the combinations, and I still have the same issue. Although some jobs do not raise this error, most of them do.

alexanderbaumann-toast commented 2 months ago

Thank you! -c 1 and --pool solo resolved the error.

bengabp commented 2 months ago

Hi guys, it's been 2 years; did anyone find a workaround without setting the pool size to 1?

JulianPinzaru commented 4 weeks ago

Ran into this issue recently as well. My temporary fix was to use --pool=threads: celery -A proj worker --pool=threads

However, it would be preferable to have similar multiprocessing in a local dev environment...
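
For reference, the equivalent as settings, with the usual caveat that threads only give real parallelism when tasks spend their time in I/O or in C extensions that release the GIL; the concurrency value below is illustrative:

# celeryconfig.py -- sketch of the --pool=threads workaround as settings
worker_pool = "threads"    # thread-based execution pool, no forked children
worker_concurrency = 8     # number of threads; tune to the workload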

JulianPinzaru commented 4 weeks ago

Has anyone found a better way to address it?