xmppjs / hubot-xmpp

XMPP adapter for Hubot
180 stars 103 forks source link

Hubot does not reconnect on disconnect #58

Closed bnied closed 10 years ago

bnied commented 10 years ago

Hubot does not reconnect when it receives an offline event from the server:

[Tue Feb 18 2014 14:45:50 GMT-0500 (EST)] DEBUG Loading adapter xmpp [Tue Feb 18 2014 14:45:50 GMT-0500 (EST)] INFO { username: 'hubot@not.a.real.site', password: '****', host: 'conference.not.a.real.site', port: '5222', rooms: [ { jid: 'test@conference.not.a.real.site', password: false } ], keepaliveInterval: 30000, legacySSL: undefined, preferredSaslMechanism: undefined, disallowTLS: undefined } [Tue Feb 18 2014 14:45:50 GMT-0500 (EST)] DEBUG Received offline event

bnied commented 10 years ago

using hubot 2.7.1 and hubot-xmpp 0.1.9.

bnied commented 10 years ago

Is there an update on this? Am I doing something wrong (i.e., is there another config var I missed to enable restarts on disconnects)?

markstory commented 10 years ago

I'm not sure, the bots I run reconnect properly.

bnied commented 10 years ago

Hrm. Are there any known issues with the versions of either hubot-xmpp or hubot that I listed?

bnied commented 10 years ago

Update: just tested reverting hubot-xmpp back to 0.1.8; that version reconnects after the offline event is received.

markstory commented 10 years ago

Thanks for looking into that I can take a look at what has changed between those versions.

markstory commented 10 years ago

I'm still not sure why hubot is disconnecting so fast. I thought it might be related to timeouts. But I don't think it is. I configured my ejabberd setup to use pings via mod_ping, and hubot-xmpp was able to keep the connection active via pings.

$ bin/hubot -a xmpp -n ernie -l'!'
[Mon Feb 24 2014 21:01:11 GMT-0500 (EST)] DEBUG Loading adapter xmpp
Cannot load StringPrep-0.1.0 bindings. You may need to `npm install node-stringprep'
[Mon Feb 24 2014 21:01:11 GMT-0500 (EST)] INFO { username: 'ernie@localhost',
  password: '********',
  host: 'localhost',
  port: undefined,
  rooms: [ { jid: 'bot_test@conference.localhost', password: false } ],
  keepaliveInterval: 30000,
  legacySSL: undefined,
  preferredSaslMechanism: undefined,
  disallowTLS: undefined }
[Mon Feb 24 2014 21:01:12 GMT-0500 (EST)] INFO Hubot XMPP client online
[Mon Feb 24 2014 21:01:12 GMT-0500 (EST)] INFO Hubot XMPP sent initial presence
[Mon Feb 24 2014 21:01:12 GMT-0500 (EST)] DEBUG Joining bot_test@conference.localhost/ernie
... script loading elided ...
[Mon Feb 24 2014 21:01:17 GMT-0500 (EST)] DEBUG Available received from bot_test@conference.localhost/Mark Story in room bot_test@conference.localhost and private chat jid is mark@localhost/Marks-MacBook-Pro
[Mon Feb 24 2014 21:01:22 GMT-0500 (EST)] DEBUG [received iq] <iq from="localhost" to="ernie@localhost/34795478661393293672140195" id="1590131927" type="get" xmlns:stream="http://etherx.jabber.org/streams"><ping xmlns="urn:xmpp:ping"/></iq>
[Mon Feb 24 2014 21:01:22 GMT-0500 (EST)] DEBUG [sending pong] <iq to="localhost" from="ernie@localhost/34795478661393293672140195" type="result" id="1590131927"/>
[Mon Feb 24 2014 21:01:32 GMT-0500 (EST)] DEBUG [received iq] <iq from="localhost" to="ernie@localhost/34795478661393293672140195" id="1109444778" type="get" xmlns:stream="http://etherx.jabber.org/streams"><ping xmlns="urn:xmpp:ping"/></iq>
[Mon Feb 24 2014 21:01:32 GMT-0500 (EST)] DEBUG [sending pong] <iq to="localhost" from="ernie@localhost/34795478661393293672140195" type="result" id="1109444778"/>
[Mon Feb 24 2014 21:01:42 GMT-0500 (EST)] DEBUG [received iq] <iq from="localhost" to="ernie@localhost/34795478661393293672140195" id="3143568283" type="get" xmlns:stream="http://etherx.jabber.org/streams"><ping xmlns="urn:xmpp:ping"/></iq>
[Mon Feb 24 2014 21:01:42 GMT-0500 (EST)] DEBUG [sending pong] <iq to="localhost" from="ernie@localhost/34795478661393293672140195" type="result" id="3143568283"/>
bnied commented 10 years ago

Our Hubot also keeps the connection alive via pings; however, every so often, Slack resets their gateway and all clients are expected to reconnect. This is where Hubot gets the offline message and never reconnects.

markstory commented 10 years ago

Interesting, the offline event handling hasn't changed since v0.1.8, I wonder how it worked previously.

Jaglag commented 10 years ago

i going to try 0.1.8 now too

Jaglag commented 10 years ago

ok i let him running now over 20hours and he dont die and stay alive!!! awesome with 0.1.8 the weird thing i dont see a single keepalive ping message in my debuglog! maybe the server kick us cause of spamming ping to him?

markstory commented 10 years ago

Well one change in 0.1.9 was that empty packets are not sent, and instead the library uses the connection setKeepAlive options. Since those seem to be problematic I'll revert those changes.

chrisjaure commented 10 years ago

I can confirm that my bot receives an offline event just a few minutes after connecting with 0.1.10, but the issue does not occur in 1.1.8. As far as I know, we're using the Openfire server.

markstory commented 10 years ago

Are you using SSL or TLS? I was wrong about the empty packet changes in 0.1.9, those were in 0.1.10. The main changes in 0.1.9 were adding the disallowTLS option, and nicknames to the bot. It seems unlikely that nicknames would cause timeout issues, so it might be related to TLS.

chrisjaure commented 10 years ago

Unfortunately I don't know much about jabber, but in my chat client, I'm connecting to port 5223 with SSL. If I specify the same port for hubot-xmpp it just hangs so I'm using port 5222 which works fine in 0.1.8. I haven't changed any TLS settings.

bnied commented 10 years ago

So, update: on 0.1.8, the server does attempt to reconnect. However, it seems to completely forget its authentication information:

Reconnect in 0 [Wed Feb 26 2014 03:45:30 GMT-0500 (EST)] ERROR XMPP authentication failure undefined Reconnect in 0 [Wed Feb 26 2014 03:46:30 GMT-0500 (EST)] ERROR XMPP authentication failure undefined Reconnect in 0 [Wed Feb 26 2014 03:47:30 GMT-0500 (EST)] ERROR XMPP authentication failure undefined Reconnect in 0 [Wed Feb 26 2014 03:48:30 GMT-0500 (EST)] ERROR XMPP authentication failure undefined Reconnect in 0 [Wed Feb 26 2014 03:49:30 GMT-0500 (EST)] ERROR XMPP authentication failure undefined Reconnect in 0 [Wed Feb 26 2014 03:50:30 GMT-0500 (EST)] ERROR XMPP authentication failure undefined Reconnect in 0

It looks like it loses all of its info, which is strange. Any thoughts?

markstory commented 10 years ago

Not really, I'm only really familiar with ejabberd which doesn't trigger these same issues.

Jaglag commented 10 years ago

just small update i have him running now on 1.8 for 2 days without any problem, only jid is undefined messages with every user someone know how to fix that?

bnied commented 10 years ago

@markstory can you confirm what version of node-xmpp you're running?

markstory commented 10 years ago

I was using 0.1.10 of hubot-xmpp which uses node-xmpp 0.10.0.

I did do some additional testing with ejabberd. When restarting ejabberd with ejabberdctl restart hubot would not correctly reconnect. From what I tracked down in the node-xmpp code a socket error (such as ECONNRESET which I was getting) will not initiate a reconnect. instead the connection will just be closed/ended.

I'm going to handle this case in hubot-xmpp and attempt to manually reconnect the bot on the 'end' event.

bnied commented 10 years ago

Cool, thanks. For now, I've made a fork of hubot-xmpp that terminates the node process on an offline event (we're running our bot via supervisord), so that fixes the issue from my end.

Once your have the code written on your side, I'll switch it back to upstream to test it out.

markstory commented 10 years ago

@bnied I have a pull request option for the changes I think will solve the issue.

bnied commented 10 years ago

@markstory looks good. I like that it terminates if it can't connect, so that you get better visibility into a connection issue.

Let me know when it's committed and I'll give it a spin.

bnied commented 10 years ago

So, I finally got a chance to use this with our dev Hubot. Some weirdness:

markstory commented 10 years ago

Thanks for giving it a spin, I'll get those duplicated messages in the logs fixed up.

markstory commented 10 years ago

I've added some code to remove the bound event listeners, that seems to solve the problem as I've not seen duplicated log messages.