When load testing with a cluster, if a single slave drops its ESL connection (e.g. because mod_event_socket is reloaded), the originator burst loop exits with the following:
Mar 10 15:15:36 [ERROR] switchy.Originator@['vm-host.qa.sangoma.local', 'sip-cannon.qa.sangoma.local'] call_gen.py:375 : exiting burst loop due to exception:
Traceback (most recent call last):
  File "/home/tyler/repos/switchy/switchy/apps/call_gen.py", line 366, in _serve_forever
    self.sched.run()
  File "/usr/lib/python2.7/sched.py", line 117, in run
    action(*argument)
  File "/home/tyler/repos/switchy/switchy/apps/call_gen.py", line 326, in _burst
    uuid_func=self.uuid_gen
  File "/home/tyler/repos/switchy/switchy/observe.py", line 1201, in originate
    **bgapi_kwargs)
  File "/home/tyler/repos/switchy/switchy/observe.py", line 1156, in bgapi
    raise ConnectionError("local connection down!?")
ConnectionError: local connection down!?
Mar 10 15:15:36 [INFO] switchy.Originator@['vm-host.qa.sangoma.local', 'sip-cannon.qa.sangoma.local'] call_gen.py:376 : stopping burst loop...
Mar 10 15:15:36 [INFO] switchy.Originator@['vm-host.qa.sangoma.local', 'sip-cannon.qa.sangoma.local'] call_gen.py:345 : Waiting for start command...
Mar 10 15:15:36 [WARNING] switchy observe.py:50 : Event 'SERVER_DISCONNECTED' has no timestamp!?
Mar 10 15:15:36 [INFO] switchy.EventListener@vm-host.qa.sangoma.local observe.py:316 : Disconnected listener '8b44c92a-c761-11e4-8079-74d02bc595d7' from 'vm-host.qa.sangoma.local'
Mar 10 15:15:36 [WARNING] switchy.EventListener@vm-host.qa.sangoma.local observe.py:668 : handling DISCONNECT from server 'vm-host.qa.sangoma.local'
Mar 10 15:15:36 [INFO] switchy.EventListener@vm-host.qa.sangoma.local observe.py:348 : Connected listener '8b44c92a-c761-11e4-8079-74d02bc595d7' to 'vm-host.qa.sangoma.local'
The connections are in fact recovered thanks to the "SERVER_DISCONNECTED" handler in switchy.EventListener. However, the Originator burst loop terminates without changing state: it remains in the "ORIGINATING" state even though the burst loop has fully stopped.
So there are two issues to resolve:
1. When an exception occurs, the app should immediately enter the "STOPPED" state.
2. A mechanism should be implemented to handle reasonably transient connection failures, such that originator state is not clobbered, as far as the listener's state-tracking capabilities allow.
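As a rough illustration of both points, here is a minimal, self-contained sketch. It is not switchy's actual code: `BurstLoop`, `max_retries`, `retry_delay` and the injected `originate` callable are hypothetical names, and Python 3's builtin `ConnectionError` stands in for the exception shown in the traceback. A `finally` clause guarantees the state leaves "ORIGINATING" however the loop exits, and a bounded retry absorbs short connection drops.

```python
import time


class BurstLoop:
    """Hypothetical stand-in for the originator's burst loop (not switchy's API)."""

    def __init__(self, originate, max_retries=3, retry_delay=0.5, sleep=time.sleep):
        self.originate = originate      # callable that may raise ConnectionError
        self.max_retries = max_retries  # retries allowed per burst
        self.retry_delay = retry_delay  # seconds to wait between retries
        self.sleep = sleep              # injectable so tests need not wait
        self.state = "INITIAL"

    def _burst(self):
        """Run one burst, retrying while the connection is down."""
        for attempt in range(self.max_retries + 1):
            try:
                return self.originate()
            except ConnectionError:
                if attempt == self.max_retries:
                    raise  # no longer transient; let the loop stop
                self.sleep(self.retry_delay)

    def serve(self, bursts):
        """Run `bursts` bursts and always leave a truthful state behind."""
        self.state = "ORIGINATING"
        try:
            for _ in range(bursts):
                self._burst()
        finally:
            # Reached on normal completion and on any exception alike.
            self.state = "STOPPED"
```

The retry budget is deliberately small and fixed here. A real fix would more likely wait on the listener's reconnect handling than sleep blindly, but the state handling in `serve` would look much the same.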