spulec opened this issue 12 years ago
We are observing similar problems on:
software -> celery:3.0.10 (Chiastic Slide) kombu:2.4.7 py:2.7.3
billiard:2.7.3.15 pymongo:2.3
platform -> system:Linux arch:64bit, ELF imp:CPython
loader -> djcelery.loaders.DjangoLoader
settings -> transport:mongodb results:mongodb
When running:
python manage.py celery worker --loglevel=info --concurrency=4 --maxtasksperchild=10
But it does not happen when run as:
python manage.py celery worker --loglevel=info --concurrency=1 --maxtasksperchild=10
or:
python manage.py celery worker --loglevel=info --concurrency=4 --maxtasksperchild=1
OK, it seems that with --concurrency=4 --maxtasksperchild=10 the errors are more common, but they also happen with --concurrency=4 --maxtasksperchild=1.
I am also seeing this error with the latest stable release: celery crashes after a few hours of long-running tasks. In my case it happens after an hour-long job has timed out.
[2012-10-22 03:22:13,673: WARNING/PoolWorker-1] Process PoolWorker-1:
[2012-10-22 03:22:13,674: WARNING/PoolWorker-1] Traceback (most recent call last):
[2012-10-22 03:22:13,675: WARNING/PoolWorker-1]   File "/home/bhurley/.virtualenvs/celery/lib/python2.6/site-packages/billiard-2.7.3.17-py2.6-linux-x86_64.egg/billiard/process.py", line 248, in _bootstrap
[2012-10-22 03:22:13,676: WARNING/PoolWorker-1]     self.run()
[2012-10-22 03:22:13,677: WARNING/PoolWorker-1]   File "/home/bhurley/.virtualenvs/celery/lib/python2.6/site-packages/billiard-2.7.3.17-py2.6-linux-x86_64.egg/billiard/process.py", line 97, in run
[2012-10-22 03:22:13,678: WARNING/PoolWorker-1]     self._target(*self._args, **self._kwargs)
[2012-10-22 03:22:13,678: WARNING/PoolWorker-1]   File "/home/bhurley/.virtualenvs/celery/lib/python2.6/site-packages/billiard-2.7.3.17-py2.6-linux-x86_64.egg/billiard/pool.py", line 308, in worker
[2012-10-22 03:22:13,679: WARNING/PoolWorker-1]     put((READY, (job, i, (False, einfo))))
[2012-10-22 03:22:13,679: WARNING/PoolWorker-1]   File "/home/bhurley/.virtualenvs/celery/lib/python2.6/site-packages/billiard-2.7.3.17-py2.6-linux-x86_64.egg/billiard/queues.py", line 352, in put
[2012-10-22 03:22:13,680: WARNING/PoolWorker-1]     return send(obj)
[2012-10-22 03:22:13,680: WARNING/PoolWorker-1] IOError: [Errno 32] Broken pipe
software -> celery:3.0.11 (Chiastic Slide) kombu:2.4.7 py:2.6.6
billiard:2.7.3.17 amqplib:N/A
platform -> system:Linux arch:64bit, ELF imp:CPython
loader -> celery.loaders.default.Loader
settings -> transport:amqp results:disabled
BROKER_URL: 'amqp://something/myvhost'
CELERYD_CONCURRENCY: 6
CELERY_IMPORTS: ('celery_tasks',)
CELERYD_PREFETCH_MULTIPLIER: 1
@9thbit How did it time out? And do you think it's relevant?
@ask I was simply monitoring the tasks using top, and based on how long the job had been running, it seemed to crash after one of the tasks exceeded time_limit.
I have added CELERYD_FORCE_EXECV = True to my config and will report back on whether that resolves the issue. Also, if it is related to using Python 2.6 -- which is what the cluster I am using runs -- I can try 2.7.
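For reference, this is roughly what that looks like: a minimal Django settings sketch in which only CELERYD_FORCE_EXECV is the setting under test; the other values are illustrative and should match your own broker and worker config.

# settings.py (sketch; adjust broker URL and worker settings to your setup)
BROKER_URL = 'amqp://guest@localhost//'  # placeholder, not a real broker
CELERYD_FORCE_EXECV = True               # the setting being tested here
CELERYD_CONCURRENCY = 4
CELERYD_MAX_TASKS_PER_CHILD = 10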
I am using 2.7 and I am getting this.
Python 2.7 here with CELERYD_FORCE_EXECV = True.
Seeing this too with --concurrency=4 and kombu. Ugh.
Same issue for us. It's not stable at all in production!
Same issue here.
Could you please include the version you are using?
Also what broker are you using? (kombu is not a transport, it's the messaging framework we use)
The latest celery version disables force_execv by default; that could be the culprit.
If you're using RabbitMQ/redis, could you please try running with CELERY_DISABLE_RATE_LIMITS=True?
I am also seeing this crash. I tried the CELERY_DISABLE_RATE_LIMITS=True suggestion, but that didn't fix the error.
Unrecoverable error: IOError(32, 'Broken pipe')
Stacktrace (most recent call last):
File "celery/worker/__init__.py", line 351, in start
component.start()
File "celery/worker/consumer.py", line 393, in start
self.consume_messages()
File "celery/worker/consumer.py", line 483, in consume_messages
handlermap[fileno](fileno, event)
File "billiard/pool.py", line 1039, in maintain_pool
self._maintain_pool()
File "billiard/pool.py", line 1034, in _maintain_pool
self._repopulate_pool(self._join_exited_workers())
File "billiard/pool.py", line 1020, in _repopulate_pool
self._create_worker_process(self._avail_index())
File "billiard/pool.py", line 904, in _create_worker_process
w.start()
File "billiard/process.py", line 120, in start
self._popen = Popen(self)
File "billiard/forking.py", line 192, in __init__
code = process_obj._bootstrap()
File "billiard/process.py", line 276, in _bootstrap
sys.stdout.flush()
Here's our sys.argv:
['manage.py', 'celeryd', '--loglevel=INFO', '-c', '4', '--queues=celery', '--maxtasksperchild=100']
@jdp what celery/billiard version is this? Please include the output of celery report (but make sure to remove sensitive information).
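For the django-celery setups posted above, that should be something along the lines of:

python manage.py celery report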
If this happens while flushing stdout, then it may be worth a try to simply ignore this error. It does not have a stdout at this point anyway.
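As a sketch only (not the actual billiard patch), ignoring it would look something like this around the failing flush in _bootstrap:

import errno
import sys

try:
    sys.stdout.flush()
except IOError as exc:
    # The child's stdout pipe can already be closed during shutdown;
    # swallow EPIPE instead of letting the worker process crash.
    if exc.errno != errno.EPIPE:
        raise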
Closing this issue now; please open a new issue if this happens again. Chances are it's not the same issue as the previous one.
I met the same problem.
Traceback (most recent call last):
  File "/home/work/software/python_package/lib/python2.7/site-packages/billiard/pool.py", line 292, in __call__
    sys.exit(self.workloop(pid=pid))
  File "/home/work/software/python_package/lib/python2.7/site-packages/billiard/pool.py", line 374, in workloop
    put((READY, (job, i, (False, einfo), inqW_fd)))
  File "/home/work/software/python_package/lib/python2.7/site-packages/billiard/queues.py", line 366, in put
    self.send_payload(ForkingPickler.dumps(obj))
  File "/home/work/software/python_package/lib/python2.7/site-packages/billiard/queues.py", line 403, in send_payload
    self._writer.send_bytes(value)
  File "/home/work/software/python_package/lib/python2.7/site-packages/billiard/connection.py", line 227, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/work/software/python_package/lib/python2.7/site-packages/billiard/connection.py", line 453, in _send_bytes
    self._send(header + buf)
  File "/home/work/software/python_package/lib/python2.7/site-packages/billiard/connection.py", line 406, in _send
    n = write(self._handle, buf)
OSError: [Errno 32] Broken pipe
I got this. It confused me for a week.
which version are you using?
This makes our workers crash/hang every hour or so.
celery==3.0.7 django-celery==3.0.4 kombu==2.4.3 billiard==2.7.3.12
We're using the SQS backend.