celery / billiard

Multiprocessing Pool Extensions

Crashes with broken pipe #16

Open spulec opened 12 years ago

spulec commented 12 years ago

This makes our workers crash/hang every hour or so.

Traceback (most recent call last):
  File "../lib/python2.7/site-packages/billiard/process.py", line 273, in _bootstrap
    self.run()
  File "../lib/python2.7/site-packages/billiard/process.py", line 122, in run
    self._target(*self._args, **self._kwargs)
  File "../lib/python2.7/site-packages/billiard/pool.py", line 302, in worker
    put((ACK, (job, i, time.time(), pid)))
  File "../lib/python2.7/site-packages/billiard/queues.py", line 377, in put
    return send(obj)
IOError: [Errno 32] Broken pipe

celery==3.0.7 django-celery==3.0.4 kombu==2.4.3 billiard==2.7.3.12

We're using the SQS backend.
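
For context, Errno 32 (EPIPE) is what the operating system reports when a process writes to a pipe whose read end has already been closed. The following is a minimal sketch of that condition on a POSIX system. It is independent of billiard and is not a reproduction of this specific crash; the function name is made up for illustration.

```python
import errno
import os


def write_to_closed_pipe():
    """Write to a pipe after closing its read end; return the errno raised."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # nobody can read from the pipe any more
    try:
        os.write(write_fd, b"x")
    except OSError as exc:
        # Python ignores SIGPIPE by default, so the failed write
        # surfaces as an exception instead of killing the process.
        return exc.errno
    finally:
        os.close(write_fd)
    return None


if __name__ == "__main__":
    print(write_to_closed_pipe() == errno.EPIPE)
```

In the traceback above, the failing write is the pool worker sending its ACK back to the parent, which suggests the receiving end of that pipe was already gone.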

mitar commented 12 years ago

We are observing similar problems on:

software -> celery:3.0.10 (Chiastic Slide) kombu:2.4.7 py:2.7.3
            billiard:2.7.3.15 pymongo:2.3
platform -> system:Linux arch:64bit, ELF imp:CPython
loader   -> djcelery.loaders.DjangoLoader
settings -> transport:mongodb results:mongodb

When running:

python manage.py celery worker --loglevel=info --concurrency=4 --maxtasksperchild=10

But it does not happen when run as:

python manage.py celery worker --loglevel=info --concurrency=1 --maxtasksperchild=10

or:

python manage.py celery worker --loglevel=info --concurrency=4 --maxtasksperchild=1

mitar commented 12 years ago

OK, it seems that with --concurrency=4 --maxtasksperchild=10 errors are more common, but they also happen with --concurrency=4 --maxtasksperchild=1.

9thbit commented 12 years ago

I am also seeing this error with the latest stable release: celery crashes after a few hours of long-running tasks. It seems to happen after an hour-long job has timed out.

[2012-10-22 03:22:13,673: WARNING/PoolWorker-1] Process PoolWorker-1:
[2012-10-22 03:22:13,674: WARNING/PoolWorker-1] Traceback (most recent call last):
[2012-10-22 03:22:13,675: WARNING/PoolWorker-1] File "/home/bhurley/.virtualenvs/celery/lib/python2.6/site-packages/billiard-2.7.3.17-py2.6-linux-x86_64.egg/billiard/process.py", line 248, in _bootstrap
[2012-10-22 03:22:13,676: WARNING/PoolWorker-1] self.run()
[2012-10-22 03:22:13,677: WARNING/PoolWorker-1] File "/home/bhurley/.virtualenvs/celery/lib/python2.6/site-packages/billiard-2.7.3.17-py2.6-linux-x86_64.egg/billiard/process.py", line 97, in run
[2012-10-22 03:22:13,678: WARNING/PoolWorker-1] self._target(*self._args, **self._kwargs)
[2012-10-22 03:22:13,678: WARNING/PoolWorker-1] File "/home/bhurley/.virtualenvs/celery/lib/python2.6/site-packages/billiard-2.7.3.17-py2.6-linux-x86_64.egg/billiard/pool.py", line 308, in worker
[2012-10-22 03:22:13,679: WARNING/PoolWorker-1] put((READY, (job, i, (False, einfo))))
[2012-10-22 03:22:13,679: WARNING/PoolWorker-1] File "/home/bhurley/.virtualenvs/celery/lib/python2.6/site-packages/billiard-2.7.3.17-py2.6-linux-x86_64.egg/billiard/queues.py", line 352, in put
[2012-10-22 03:22:13,680: WARNING/PoolWorker-1] return send(obj)
[2012-10-22 03:22:13,680: WARNING/PoolWorker-1] IOError: [Errno 32] Broken pipe

software -> celery:3.0.11 (Chiastic Slide) kombu:2.4.7 py:2.6.6
            billiard:2.7.3.17 amqplib:N/A
platform -> system:Linux arch:64bit, ELF imp:CPython
loader   -> celery.loaders.default.Loader
settings -> transport:amqp results:disabled

BROKER_URL: 'amqp://something/myvhost'
CELERYD_CONCURRENCY: 6
CELERY_IMPORTS: ('celery_tasks',)
CELERYD_PREFETCH_MULTIPLIER: 1

ask commented 12 years ago

@9thbit How did it time out? And do you think it's relevant?

9thbit commented 12 years ago

@ask I was simply monitoring the tasks using top and just based on how long the job was running, it seemed to crash after one of the tasks exceeded time_limit.

I have added CELERYD_FORCE_EXECV = True to my config and I'll report back on whether that resolves the issue. Also, if it is related to using Python 2.6 (which is what the cluster I am using has), I can try 2.7.

mitar commented 12 years ago

I am using 2.7 and I am getting this.

spulec commented 12 years ago

Python 2.7 here with CELERYD_FORCE_EXECV = True.

davepeck commented 11 years ago

Seeing this too with --concurrency=4 and kombu. Ugh.

noirbizarre commented 11 years ago

Same issue for us. It's not stable at all in production!

sylvinus commented 11 years ago

Same issue here.

ask commented 11 years ago

Could you please include the version you are using?

Also what broker are you using? (kombu is not a transport, it's the messaging framework we use)

The latest celery version disabled force_execv by default, it could be a culprit.

ask commented 11 years ago

If you're using RabbitMQ/redis, could you please try running with CELERY_DISABLE_RATE_LIMITS=True ?
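
The two settings suggested so far in this thread would sit in the Django settings module or a celeryconfig.py roughly as sketched here. Both names are taken from the comments above; the file location is an assumption, and whether either setting avoids the crash is unconfirmed.

```python
# Hypothetical excerpt from settings.py / celeryconfig.py (Celery 3.0-era names).

# Suggested above for RabbitMQ/redis users: turn off rate limit handling.
CELERY_DISABLE_RATE_LIMITS = True

# Tried by two reporters above: re-exec pool child processes after fork.
CELERYD_FORCE_EXECV = True
```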

jdp commented 11 years ago

I'm also seeing this crash. I tried the CELERY_DISABLE_RATE_LIMITS=True suggestion, but that didn't fix the error.

Unrecoverable error: IOError(32, 'Broken pipe')

Stacktrace (most recent call last):

  File "celery/worker/__init__.py", line 351, in start
    component.start()
  File "celery/worker/consumer.py", line 393, in start
    self.consume_messages()
  File "celery/worker/consumer.py", line 483, in consume_messages
    handlermap[fileno](fileno, event)
  File "billiard/pool.py", line 1039, in maintain_pool
    self._maintain_pool()
  File "billiard/pool.py", line 1034, in _maintain_pool
    self._repopulate_pool(self._join_exited_workers())
  File "billiard/pool.py", line 1020, in _repopulate_pool
    self._create_worker_process(self._avail_index())
  File "billiard/pool.py", line 904, in _create_worker_process
    w.start()
  File "billiard/process.py", line 120, in start
    self._popen = Popen(self)
  File "billiard/forking.py", line 192, in __init__
    code = process_obj._bootstrap()
  File "billiard/process.py", line 276, in _bootstrap
    sys.stdout.flush()

Here's our sys.argv:

['manage.py', 'celeryd', '--loglevel=INFO', '-c', '4', '--queues=celery', '--maxtasksperchild=100']

ask commented 11 years ago

@jdp What celery/billiard version is this? Please include the output of celery report (but make sure to remove sensitive information).

ask commented 11 years ago

If this happens while flushing stdout, then it may be worth trying to simply ignore this error. The process does not have a stdout at this point anyway.
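
A minimal sketch of what "ignore this error" could look like, assuming the flush is wrapped in a small helper. The helper name is hypothetical and this is not billiard's actual code; it only swallows EPIPE and re-raises everything else.

```python
import errno


def flush_ignoring_broken_pipe(stream):
    """Flush `stream`; return False instead of raising if the pipe is broken."""
    try:
        stream.flush()
    except (IOError, OSError) as exc:
        # Only a broken pipe is ignored; any other I/O error still propagates.
        if exc.errno != errno.EPIPE:
            raise
        return False
    return True
```

A caller in the bootstrap path would then use flush_ignoring_broken_pipe(sys.stdout) in place of a bare sys.stdout.flush().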

ask commented 11 years ago

Closing this issue now; please open a new issue if this happens again. Chances are it's not the same issue as the previous one.

mengmengpengpeng commented 3 years ago

I met the same problem.

Traceback (most recent call last):
  File "/home/work/software/python_package/lib/python2.7/site-packages/billiard/pool.py", line 292, in __call__
    sys.exit(self.workloop(pid=pid))
  File "/home/work/software/python_package/lib/python2.7/site-packages/billiard/pool.py", line 374, in workloop
    put((READY, (job, i, (False, einfo), inqW_fd)))
  File "/home/work/software/python_package/lib/python2.7/site-packages/billiard/queues.py", line 366, in put
    self.send_payload(ForkingPickler.dumps(obj))
  File "/home/work/software/python_package/lib/python2.7/site-packages/billiard/queues.py", line 403, in send_payload
    self._writer.send_bytes(value)
  File "/home/work/software/python_package/lib/python2.7/site-packages/billiard/connection.py", line 227, in send_bytes
    self._send_bytes(m[offset:offset + size])
  File "/home/work/software/python_package/lib/python2.7/site-packages/billiard/connection.py", line 453, in _send_bytes
    self._send(header + buf)
  File "/home/work/software/python_package/lib/python2.7/site-packages/billiard/connection.py", line 406, in _send
    n = write(self._handle, buf)
OSError: [Errno 32] Broken pipe

I got this. It has confused me for a week.

auvipy commented 3 years ago

Which version are you using?