Edge case found in the pinger - if the setup_calls takes ~> 10 sec, the self._connection.ioloop.start() will take >10 sec, which will crash the service with a ProbableAccessDeniedError.
Setup_calls should be fast, but if something isn't working properly 10 sec timeouts will trigger this. In my case I was pinging an unresponsive rga_interface, which waited for the timeout, and triggered this mess. Working solution is to reduce the ping_timeout, but surely there's a smarter way to handle this?
NOTE: line numbers are shifted because I added debug statements, lines 434 and 435 here correspond to 430 and 431 in current master.
2017-06-01T13:31:36[CRITICAL] dragonfly.implementations.pinger(56) -> The following services are not responding:
rga_interface
2017-06-01T13:31:36[DEBUG ] dripline.core.scheduler(76) -> scheduled sequence complete
2017-06-01T13:31:36[INFO ] dripline.core.scheduler(86) -> schedule loop started
2017-06-01T13:31:36[INFO ] dripline.core.service(419) -> startup calls complete
-----------------------------
2017-06-01T13:31:36[CRITICAL] dripline.core.service(434) -> Service <pinger_service> crashing with error message:
2017-06-01T13:31:36[ERROR ] dripline.core.service(435) -> Traceback (most recent call last):
File "/home/project8/Repos/dripline-python/dripline/core/service.py", line 431, in run
self._connection.ioloop.start()
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/adapters/select_connection.py", line 138, in start
self.poller.start()
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/adapters/select_connection.py", line 433, in start
self.poll()
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/adapters/select_connection.py", line 494, in poll
self._handler(fileno, event, write_only=write_only)
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 322, in _handle_events
self._handle_read()
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 351, in _handle_read
self._on_data_available(data)
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/connection.py", line 1285, in _on_data_available
self._process_frame(frame_value)
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/connection.py", line 1352, in _process_frame
if self._process_callbacks(frame_value):
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/connection.py", line 1322, in _process_callbacks
frame_value) # Args
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/callback.py", line 61, in wrapper
return function(*tuple(args), **kwargs)
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/callback.py", line 92, in wrapper
return function(*args, **kwargs)
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/callback.py", line 232, in process
callback(*args, **keywords)
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/connection.py", line 1267, in _on_connection_tune
self._send_connection_tune_ok()
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/connection.py", line 1473, in _send_connection_tune_ok
self.params.heartbeat))
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/connection.py", line 1503, in _send_method
self._send_frame(frame.Method(channel_number, method_frame))
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/connection.py", line 1490, in _send_frame
self._flush_outbound()
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/adapters/select_connection.py", line 77, in _flush_outbound
self.ioloop.poller.poll(write_only=True)
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/adapters/select_connection.py", line 494, in poll
self._handler(fileno, event, write_only=write_only)
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 318, in _handle_events
self._handle_write()
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 365, in _handle_write
return self._handle_error(error)
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 302, in _handle_error
self._handle_disconnect()
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 248, in _handle_disconnect
self._adapter_disconnect()
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 137, in _adapter_disconnect
self._check_state_on_disconnect()
File "/home/project8/python_environments/develop_sandbox/local/lib/python2.7/site-packages/pika/adapters/base_connection.py", line 158, in _check_state_on_disconnect
raise exceptions.ProbableAccessDeniedError
ProbableAccessDeniedError
2017-06-01T13:31:36[DEBUG ] dripline.core.service(400) -> loop ended
@cclaessens has had a similar issue with the roach. I'll need to think about this more and input is welcome, but I think we need to be able to support setup calls which are not fast.
Edge case found in the pinger - if the setup_calls takes ~> 10 sec, the self._connection.ioloop.start() will take >10 sec, which will crash the service with a ProbableAccessDeniedError.
Setup_calls should be fast, but if something isn't working properly 10 sec timeouts will trigger this. In my case I was pinging an unresponsive rga_interface, which waited for the timeout, and triggered this mess. Working solution is to reduce the ping_timeout, but surely there's a smarter way to handle this?
NOTE: line numbers are shifted because I added debug statements, lines 434 and 435 here correspond to 430 and 431 in current master.