project-koku / masu

This is a READ ONLY repo. See https://github.com/project-koku/koku for the current masu implementation.
GNU Affero General Public License v3.0

Celery connection reset by peer error #160

Closed dccurtis closed 5 years ago

dccurtis commented 6 years ago

While processing CUR data, I have occasionally seen the following traceback:

[2018-08-23 18:54:59,883: WARNING/ForkPoolWorker-4] [2018-08-23 18:54:59,812] INFO in report_processor: Saving report rows 1300000 to 1400000
[2018-08-23 18:54:59,812: INFO/ForkPoolWorker-4] Saving report rows 1300000 to 1400000
[2018-08-23 18:55:26,786: WARNING/ForkPoolWorker-5] [2018-08-23 18:55:26,774] INFO in report_processor: Saving report rows 1400000 to 1500000
[2018-08-23 18:55:26,774: INFO/ForkPoolWorker-5] Saving report rows 1400000 to 1500000
[2018-08-23 18:59:41,277: ERROR/MainProcess] Control command error: ConnectionResetError(104, 'Connection reset by peer')
Traceback (most recent call last):
  File "/opt/app-root/lib/python3.6/site-packages/celery/worker/pidbox.py", line 46, in on_message
    self.node.handle_message(body, message)
  File "/opt/app-root/lib/python3.6/site-packages/kombu/pidbox.py", line 129, in handle_message
    return self.dispatch(**body)
  File "/opt/app-root/lib/python3.6/site-packages/kombu/pidbox.py", line 112, in dispatch
    ticket=ticket)
  File "/opt/app-root/lib/python3.6/site-packages/kombu/pidbox.py", line 135, in reply
    serializer=self.mailbox.serializer)
  File "/opt/app-root/lib/python3.6/site-packages/kombu/pidbox.py", line 265, in _publish_reply
    **opts
  File "/opt/app-root/lib/python3.6/site-packages/kombu/messaging.py", line 181, in publish
    exchange_name, declare,
  File "/opt/app-root/lib/python3.6/site-packages/kombu/messaging.py", line 194, in _publish
    [maybe_declare(entity) for entity in declare]
  File "/opt/app-root/lib/python3.6/site-packages/kombu/messaging.py", line 194, in <listcomp>
    [maybe_declare(entity) for entity in declare]
  File "/opt/app-root/lib/python3.6/site-packages/kombu/messaging.py", line 102, in maybe_declare
    return maybe_declare(entity, self.channel, retry, **retry_policy)
  File "/opt/app-root/lib/python3.6/site-packages/kombu/common.py", line 129, in maybe_declare
    return _maybe_declare(entity, declared, ident, channel, orig)
  File "/opt/app-root/lib/python3.6/site-packages/kombu/common.py", line 135, in _maybe_declare
    entity.declare(channel=channel)
  File "/opt/app-root/lib/python3.6/site-packages/kombu/entity.py", line 185, in declare
    nowait=nowait, passive=passive,
  File "/opt/app-root/lib/python3.6/site-packages/amqp/channel.py", line 614, in exchange_declare
    wait=None if nowait else spec.Exchange.DeclareOk,
  File "/opt/app-root/lib/python3.6/site-packages/amqp/abstract_channel.py", line 50, in send_method
    conn.frame_writer(1, self.channel_id, sig, args, content)
  File "/opt/app-root/lib/python3.6/site-packages/amqp/method_framing.py", line 166, in write_frame
    write(view[:offset])
  File "/opt/app-root/lib/python3.6/site-packages/amqp/transport.py", line 275, in write
    self._write(s)
ConnectionResetError: [Errno 104] Connection reset by peer
[2018-08-23 18:59:51,923: WARNING/ForkPoolWorker-3] [2018-08-23 18:59:51,806] INFO in report_processor: Saving report rows 1500000 to 1600000
[2018-08-23 18:59:51,806: INFO/ForkPoolWorker-3] Saving report rows 1500000 to 1600000
[2018-08-23 19:00:29,611: WARNING/ForkPoolWorker-2] [2018-08-23 19:00:29,360] INFO in report_processor: Saving report rows 900000 to 1000000
[2018-08-23 19:00:29,360: INFO/ForkPoolWorker-2] Saving report rows 900000 to 1000000
[2018-08-23 19:00:30,320: WARNING/ForkPoolWorker-4] [2018-08-23 19:00:30,268] INFO in report_processor: Saving report rows 1400000 to 1500000
[2018-08-23 19:00:30,268: INFO/ForkPoolWorker-4] Saving report rows 1400000 to 1500000
[2018-08-23 19:00:38,385: WARNING/ForkPoolWorker-5] [2018-08-23 19:00:38,280] INFO in report_processor: Saving report rows 1500000 to 1600000
[2018-08-23 19:00:38,280: INFO/ForkPoolWorker-5] Saving report rows 1500000 to 1600000

This could be related to https://github.com/celery/celery/issues/4226. A possible workaround is to set broker_pool_limit = None.
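For reference, a minimal sketch of how that workaround could be applied, assuming a standalone Celery app; the app name and broker URL below are placeholders, not masu's actual configuration:

from celery import Celery

# Placeholder app name and broker URL; substitute the real masu settings.
app = Celery('masu_example', broker='amqp://guest@localhost//')

# Setting broker_pool_limit to None disables the broker connection pool,
# so a fresh connection is opened and closed for each use instead of
# reusing pooled connections the broker may have already reset. This is
# the workaround discussed in celery/celery#4226.
app.conf.broker_pool_limit = None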

chambridge commented 6 years ago

Possible duplicate of https://github.com/project-koku/masu/issues/155?

Or maybe just related.

thedrow commented 6 years ago

FYI, we found a solution, but we need a PR. See https://github.com/celery/celery/issues/4226#issuecomment-423121781

dccurtis commented 5 years ago

This issue has not been seen recently. Closing.