sangoma / switchy

async FreeSWITCH cluster control
https://switchy.readthedocs.io/en/latest/
Mozilla Public License 2.0

Originator app is not tolerant of transient connection failures #3

Closed goodboy closed 9 years ago

goodboy commented 9 years ago

When load testing with a cluster, if a single slave drops its ESL connection (e.g. mod_event_socket is reloaded), the Originator burst loop exits with the following:

Mar 10 15:15:36 [ERROR] switchy.Originator@['vm-host.qa.sangoma.local', 'sip-cannon.qa.sangoma.local'] call_gen.py:375 : exiting burst loop due to exception:
Traceback (most recent call last):
  File "/home/tyler/repos/switchy/switchy/apps/call_gen.py", line 366, in _serve_forever
    self.sched.run()
  File "/usr/lib/python2.7/sched.py", line 117, in run
    action(*argument)
  File "/home/tyler/repos/switchy/switchy/apps/call_gen.py", line 326, in _burst
    uuid_func=self.uuid_gen
  File "/home/tyler/repos/switchy/switchy/observe.py", line 1201, in originate
    **bgapi_kwargs)
  File "/home/tyler/repos/switchy/switchy/observe.py", line 1156, in bgapi
    raise ConnectionError("local connection down!?")
ConnectionError: local connection down!?
Mar 10 15:15:36 [INFO] switchy.Originator@['vm-host.qa.sangoma.local', 'sip-cannon.qa.sangoma.local'] call_gen.py:376 : stopping burst loop...
Mar 10 15:15:36 [INFO] switchy.Originator@['vm-host.qa.sangoma.local', 'sip-cannon.qa.sangoma.local'] call_gen.py:345 : Waiting for start command...
Mar 10 15:15:36 [WARNING] switchy observe.py:50 : Event 'SERVER_DISCONNECTED' has no timestamp!?
Mar 10 15:15:36 [INFO] switchy.EventListener@vm-host.qa.sangoma.local observe.py:316 : Disconnected listener '8b44c92a-c761-11e4-8079-74d02bc595d7' from 'vm-host.qa.sangoma.local'
Mar 10 15:15:36 [WARNING] switchy.EventListener@vm-host.qa.sangoma.local observe.py:668 : handling DISCONNECT from server 'vm-host.qa.sangoma.local'
Mar 10 15:15:36 [INFO] switchy.EventListener@vm-host.qa.sangoma.local observe.py:348 : Connected listener '8b44c92a-c761-11e4-8079-74d02bc595d7' to 'vm-host.qa.sangoma.local'

The connections are actually recovered thanks to the "SERVER_DISCONNECTED" handler in switchy.EventListener. However, the Originator burst loop still terminates and never updates its state: it stays in "ORIGINATING" even though the burst loop has fully stopped.
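To make the desired behaviour concrete, here is a minimal sketch of a burst loop that survives a dropped connection and always leaves the "ORIGINATING" state when it exits. It is not switchy's code: `BurstLoop`, `LinkDown`, `max_failures` and the "STOPPED" state name are all hypothetical stand-ins, and a real fix would probably also wait for the listener to reconnect before retrying.

```python
import logging

log = logging.getLogger("burst_sketch")


class LinkDown(Exception):
    """Hypothetical stand-in for the ConnectionError seen in the traceback."""


class BurstLoop(object):
    """Toy burst loop (an assumption for illustration, not switchy's Originator)."""

    def __init__(self, originate, max_failures=3):
        self.originate = originate        # callable that launches one call
        self.max_failures = max_failures  # consecutive failures tolerated
        self.state = "INITIAL"

    def run(self, bursts):
        """Attempt `bursts` originations; return how many succeeded."""
        self.state = "ORIGINATING"
        done = 0
        failures = 0
        try:
            for _ in range(bursts):
                try:
                    self.originate()
                except LinkDown as err:
                    # Transient failure: log it and keep going unless the
                    # link has failed too many times in a row.
                    failures += 1
                    log.warning("originate failed (%d in a row): %s",
                                failures, err)
                    if failures >= self.max_failures:
                        log.error("giving up after %d failures", failures)
                        break
                else:
                    failures = 0
                    done += 1
        finally:
            # Runs on every exit path, so the state can never be left
            # stuck at "ORIGINATING" once the loop has stopped.
            self.state = "STOPPED"
        return done
```

With an `originate` callable that fails once and then recovers, `run()` keeps bursting; if it fails on every attempt, the loop gives up after `max_failures` tries. In both cases the state ends as "STOPPED".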

So there are two issues to resolve: