Open Nowaker opened 10 years ago
Works for me. I left ejabberd down for several hours, and when it started back up the agent got registered.
Can you provide some logs? Also, run the agent in debug mode (runarchipel -n) with xmpppy debug enabled, and try to reproduce the issue.
For the record,
By default, the TNArchipelEntity tries to reconnect every 5s, with no maximum number of attempts. The only way to exit the loop is when the user account has been removed from the server.
The only way to exit the loop is when the user account has been removed from the server.
So that's what actually happened.
Then started it with a wrong config.
The wrong config pointed to a different Mnesia storage location, so there were no users. After I restarted ejabberd with a valid config, no Archipel agent tried to connect again (but it should have).
In fact, ArchipelEntity describes both the hypervisor and the VM. When you remove a VM, its user account is unregistered from the server and the loop then stops (as does the thread).
But this shouldn't happen to the main XMPP connection, should it? I mean the hypervisor's connection.
I don't know, it could be a safety measure. Anyway, I need logs to see exactly what's happening (I don't have time to reproduce the scenario right now).
These thousands of lines of logs won't tell you more than what you already said.
The only way to exit the loop is when the user account has been removed from the server.
Tell me if you really want these logs.
Just to confirm the point, here is what we have in the code:
    def loop(self):
        """
        This is the main loop of the client.
        """
        while not self.loop_status == ARCHIPEL_XMPP_LOOP_OFF:
            try:
                if self.loop_status == ARCHIPEL_XMPP_LOOP_REMOVE_USER:
                    self.process_inband_unregistration()
                    return
                if self.loop_status == ARCHIPEL_XMPP_LOOP_ON:
                    if self.xmppclient.isConnected():
                        if hasattr(self, "on_xmpp_loop_tick"):
                            self.on_xmpp_loop_tick()
                        self.xmppclient.Process(3)
                elif self.loop_status == ARCHIPEL_XMPP_LOOP_RESTART:
                    if self.xmppclient.isConnected():
                        self.xmppclient.disconnect()
                    time.sleep(1.0)
                    self.connect()
            except Exception as ex:
                if str(ex).upper().find('USER REMOVED') > -1:
                    self.log.info("LOOP EXCEPTION: Account has been removed from server.")
                    self.loop_status = ARCHIPEL_XMPP_LOOP_OFF
                else:
                    if str(ex).upper().find('SYSTEM-SHUTDOWN') > -1:
                        self.log.warning("LOOP EXCEPTION: The XMPP server has been shut down. Waiting 5 second for reconnection")
                    else:
                        self.log.error("LOOP EXCEPTION : Disconnected from server. Trying to reconnect in 5 seconds.")
                        t, v, tr = sys.exc_info()
                        self.log.error("TRACEBACK: %s" % "\n".join(traceback.format_exception(t, v, tr)))
                    self.loop_status = ARCHIPEL_XMPP_LOOP_RESTART
                    time.sleep(5.0)
        if self.xmppclient.isConnected():
            self.xmppclient.disconnect()
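To make the failure mode discussed in this thread easier to see, here is a rough, self-contained simulation of the loop's state handling. It is only a sketch: `FakeServer`, `run_loop`, and the string constants are hypothetical stand-ins, not Archipel or xmpppy APIs, and it assumes that connecting to a server whose user database is empty surfaces as an exception containing "User removed" (which is what the thread suggests happened, but has not been confirmed from logs).

```python
# Hypothetical, simplified model of the reconnect loop quoted above.
# None of these names come from Archipel or xmpppy.
LOOP_OFF, LOOP_ON, LOOP_RESTART = "off", "on", "restart"


class FakeServer:
    """Stand-in server whose state per connection attempt is scripted:
    'down', 'no_users' (e.g. wrong config, empty user DB), or 'ok'."""

    def __init__(self, states):
        self.states = list(states)

    def next_state(self):
        # Consume scripted states; stay in the last one once exhausted.
        return self.states.pop(0) if len(self.states) > 1 else self.states[0]


def run_loop(server, max_ticks=10):
    """Run at most max_ticks iterations; return (events, final_status)."""
    status = LOOP_RESTART
    events = []
    for _ in range(max_ticks):
        if status == LOOP_OFF:
            break
        try:
            state = server.next_state()
            if state == "down":
                raise ConnectionError("system-shutdown")
            if state == "no_users":
                # Assumption: a missing account looks like a removed one.
                raise RuntimeError("User removed")
            events.append("connected")
            status = LOOP_ON
        except Exception as ex:
            if "USER REMOVED" in str(ex).upper():
                events.append("account removed, loop stopped")
                status = LOOP_OFF  # terminal: nothing retries after this
            else:
                events.append("retry")
                status = LOOP_RESTART  # the real loop sleeps 5 s here
    return events, status
```

With a script of `["down", "down", "ok"]` the model keeps retrying and eventually connects, whereas `["down", "no_users", "ok"]` ends in `LOOP_OFF` before the server ever becomes healthy again, which would match the hypervisors staying offline until the agents were restarted.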
I shut down ejabberd for a few minutes, then started it with a wrong config, then started it again with a good config. After I logged in via the Client, both of my hypervisors were offline. They didn't even attempt to reconnect: nothing appeared in the log except for the vmcasts feed and some stats refreshes. I had to restart both Archipel agents to get them back.