hipchat / hubot-hipchat

HipChat adapter for GitHub's Hubot
https://www.hipchat.com/help/page/bots
MIT License
666 stars 252 forks source link

Exit when system-shutdown recevied #162

Closed sorah closed 10 years ago

sorah commented 10 years ago

Recently our hubot received system-shutdown error and hubot process stopped working.

[Sat Feb 08 2014 10:10:16 GMT+0900 (JST)] ERROR 'system-shutdown', 'No error text sent by HipChat, see        http://xmpp.org/rfcs/rfc6120.html#streams-error-conditions        for error condition descriptions.', { name: 'stream:error',

I guess hubot-hipchat should handle system-shutdown to exit its process like this: https://github.com/markstory/hubot-xmpp/blob/master/src/xmpp.coffee#L58-L62

Because this behavior is useful to monitor from outside (using Zabbix, Nagios, etc.)

Thoughts?

powdahound commented 10 years ago

Not sure why it says "No error text sent by HipChat" - we definitely send a message with that condition.

Either way, it's a normal condition that's sent when we're rebooting services. The correct hubot-hipchat behavior would be to wait a random amount of seconds (5-20) and then reconnect.

holic commented 10 years ago

Just saw this as well:

2014-03-08T19:49:45.819199+00:00 app[web.1]: [Sun Mar 09 2014 08:49:45 GMT+1300 (NZDT)] INFO Disconnected from hipchat.com
2014-03-08T19:49:45.822364+00:00 app[web.1]: [Sun Mar 09 2014 08:49:45 GMT+1300 (NZDT)] ERROR 'system-shutdown', 'No error text sent by HipChat, see        http://xmpp.org/rfcs/rfc6120.html#streams-error-conditions        for error condition descriptions.', { name: 'stream:error',
2014-03-08T19:49:45.822364+00:00 app[web.1]:   parent: null,
2014-03-08T19:49:45.822364+00:00 app[web.1]:   attrs: { 'xmlns:stream': 'http://etherx.jabber.org/streams' },
2014-03-08T19:49:45.822364+00:00 app[web.1]:   children:
2014-03-08T19:49:45.822364+00:00 app[web.1]:    [ { name: 'system-shutdown',
2014-03-08T19:49:45.822364+00:00 app[web.1]:        parent: [Circular],
2014-03-08T19:49:45.822364+00:00 app[web.1]:        attrs: { xmlns: 'urn:ietf:params:xml:ns:xmpp-streams' },
2014-03-08T19:49:45.822364+00:00 app[web.1]:        children: [] } ] }
wormzer commented 10 years ago

Same here about 4 minutes later and we're seeing no attempt to reconnect (but with "ERROR, null, null" in the debug out):

Mar  8 19:53:43 ip-10-244-36-143 [hubot]: [Sat Mar 08 2014 11:53:43 GMT-0800 (PST)] INFO Disconnected from hipchat.com
Mar  8 19:53:43 ip-10-244-36-143 [hubot]: [Sat Mar 08 2014 11:53:43 GMT-0800 (PST)] ERROR null, null, { name: 'stream:error',
Mar  8 19:53:43 ip-10-244-36-143 [hubot]:   parent: null,
Mar  8 19:53:43 ip-10-244-36-143 [hubot]:   attrs: { 'xmlns:stream': 'http://etherx.jabber.org/streams' },
Mar  8 19:53:43 ip-10-244-36-143 [hubot]:   children:
Mar  8 19:53:43 ip-10-244-36-143 [hubot]:    [ { name: 'system-shutdown',
Mar  8 19:53:43 ip-10-244-36-143 [hubot]:        parent: [Circular],
Mar  8 19:53:43 ip-10-244-36-143 [hubot]:        attrs: { xmlns: 'urn:ietf:params:xml:ns:xmpp-streams' },
Mar  8 19:53:43 ip-10-244-36-143 [hubot]:        children: [] } ] }
minglecm commented 10 years ago

Same issue happened for us:

2014-04-08T00:07:53.461831+00:00 app[web.1]: [Tue Apr 08 2014 00:07:53 GMT+0000 (UTC)] INFO Disconnected from hipchat.com
2014-04-08T00:07:53.464019+00:00 app[web.1]: [Tue Apr 08 2014 00:07:53 GMT+0000 (UTC)] ERROR 'system-shutdown', 'No error text sent by HipChat, see        http://xmpp.org/rfcs/rfc6120.html#streams-error-conditions        for error condition descriptions.', { name: 'stream:error',
2014-04-08T00:07:53.464019+00:00 app[web.1]:   parent: null,
2014-04-08T00:07:53.464019+00:00 app[web.1]:   attrs: { 'xmlns:stream': 'http://etherx.jabber.org/streams' },
2014-04-08T00:07:53.464019+00:00 app[web.1]:   children:
2014-04-08T00:07:53.464019+00:00 app[web.1]:    [ { name: 'system-shutdown',
2014-04-08T00:07:53.464019+00:00 app[web.1]:        parent: [Circular],
2014-04-08T00:07:53.464019+00:00 app[web.1]:        attrs: { xmlns: 'urn:ietf:params:xml:ns:xmpp-streams' },
2014-04-08T00:07:53.464019+00:00 app[web.1]:        children: [] } ] }

I'm going to try and make a PR that attempts to reconnect a max amount of times (probably a max of 5 times every 20 seconds or something before giving up).

ralish commented 10 years ago

Confirm we're seeing the same. Output from moments ago:

[Thu May 08 2014 06:00:32 GMT+1000 (EST)] INFO Disconnected from hipchat.com
[Thu May 08 2014 06:00:32 GMT+1000 (EST)] ERROR null, null, { name: 'stream:error',
  parent: null,
  attrs: { 'xmlns:stream': 'http://etherx.jabber.org/streams' },
  children:
   [ { name: 'system-shutdown',
       parent: [Circular],
       attrs: { xmlns: 'urn:ietf:params:xml:ns:xmpp-streams' },
       children: [] } ] }

No attempt at reconnecting is made (the bot must be manually restarted).

AgentO3 commented 10 years ago

This is the second time today we are getting this error. I'm looking for some way to have Hubot process to automatically exit when it gets this or other similar stream disconnection messages. Any ideas how I can do that?

canadianveggie commented 10 years ago

+1. We saw the same dropped connection at Wed May 07 2014 20:00:32 (UTC).

stanhu commented 10 years ago

+1

thomaschaaf commented 10 years ago

Wed May 07 2014 19:48:25 GMT+0000 (UTC)] INFO Disconnected from hipchat.com [Wed May 07 2014 19:48:25 GMT+0000 (UTC)] ERROR null, null, { parent: null, name: 'stream:error', children: [ { parent: [Circular], name: 'system-shutdown', children: [ [length]: 0 ], attrs: { xmlns: 'urn:ietf:params:xml:ns:xmpp-streams' } }, [length]: 1 ], attrs: { 'xmlns:stream': 'http://etherx.jabber.org/streams' } } [Wed May 07 2014 19:55:53 GMT+0000 (UTC)] INFO Connecting HipChat adapter... [Wed May 07 2014 19:55:55 GMT+0000 (UTC)] INFO Connected to hipchat.com as @Hugo [Wed May 07 2014 19:55:55 GMT+0000 (UTC)] WARNING The HUBOT_AUTH_ADMIN environment variable not set [Wed May 07 2014 19:55:55 GMT+0000 (UTC)] INFO Joining 1xxx@conf.hipchat.com [Wed May 07 2014 19:55:55 GMT+0000 (UTC)] INFO /opt/hubot/scripts/tracker.coffee is using deprecated documentation syntax [Wed May 07 2014 19:55:58 GMT+0000 (UTC)] INFO Data for brain retrieved from Redis [Wed May 07 2014 20:20:47 GMT+0000 (UTC)] INFO Disconnected from hipchat.com [Wed May 07 2014 20:20:47 GMT+0000 (UTC)] ERROR null, null, { parent: null, name: 'stream:error', children: [ { parent: [Circular], name: 'system-shutdown', children: [ [length]: 0 ], attrs: { xmlns: 'urn:ietf:params:xml:ns:xmpp-streams' } }, [length]: 1 ], attrs: { 'xmlns:stream': 'http://etherx.jabber.org/streams' } }

despairblue commented 10 years ago

+1

bayleedev commented 10 years ago

:+1: fixes are better than monit processes...

buddylindsey commented 10 years ago

how are you all getting error's? I can't seem to get an error when hubot disconnects. It just does. I have HUBOT_LOG_LEVEL set to debug, still no error.

steeef commented 10 years ago

Typically I see Hubot disconnect for no reason given at all. Hubot seems to think it's still connected, but it's noticeably absent from any rooms, with offline status. I'd say the "system-shutdown" error gets logged maybe one out of every 20 times.

buddylindsey commented 10 years ago

Okay, that is what I am seeing. Just wanted to be sure it wasn't a separate issue, so I fall into the category of this bug as well.

powdahound commented 10 years ago

There should be a disconnect event fired by node-xmpp at some level which could be watched and used to trigger a reconnection (after a brief wait). If someone has time to investigate and submit a PR we'd be happy to merge that in.

buddylindsey commented 10 years ago

So here (https://github.com/hipchat/hubot-hipchat/blob/master/src/connector.coffee#L85) are you saying there should be something like:

@jabber.on "disconnect", bind(onDisconnect, @)

Then the onDisconnect does something like:

onDisconnect = ->
  # call onOnline to reconnect(?)

This is supposition on my part. I have not messed enough with node or xmpp to have any idea.

stanhu commented 10 years ago

I just installed ejabberd and pointed the HipChat adapter to the server. If I shut down ejabberd, the current version of the HipChat adapter never detects that the server has gone away.

With the pull request I added, a few more events get handled:

offline is emitted by Client (https://github.com/node-xmpp/node-xmpp-client/blob/v0.1.4/index.js#L127) close is emitted by Session, which Client inherits (https://github.com/node-xmpp/node-xmpp-client/blob/v0.1.4/lib/session.js#L149)

Note that the latest version of xmpp-client has a reconnect option that could handle the reconnection transparently. Upgrading the dependencies might be another way to get this auto-reconnection feature.

AgentO3 commented 10 years ago

I think reconnecting is also a good option.

powdahound commented 10 years ago

Fixed in #174