StackStorm / hubot-stackstorm

Hubot plugin for integration with StackStorm event-driven infrastructure automation platform.
Apache License 2.0
49 stars 39 forks source link

st2chatops is not reconnecting when auth token expires #157

Closed nmaludy closed 5 years ago

nmaludy commented 6 years ago

After a period of time st2chatops auth token expires and an error is seen in the logs:

$ journalctl -u st2chatops | grep ERROR
Apr 29 03:34:24 stackstorm.domain.tld hubot[108611]: [Sun Apr 29 2018 03:34:24 GMT-0400 (EDT)] ERROR Stream error: Event { type: 'error' }
Apr 29 03:34:26 stackstorm.domain.tld hubot[108611]: [Sun Apr 29 2018 03:34:26 GMT-0400 (EDT)] ERROR Stream error: Event { type: 'error', status: 401 }

After this, the bot responds with ACKs but any chatops.post_messages are not echo'd into ChatOps because it is no longer able to listen on the stream.

For more details and explanation please see the original issue here: https://github.com/StackStorm/st2/issues/4119

nmaludy commented 6 years ago

It looks like the line of code that's printing that to the logs is here: https://github.com/StackStorm/hubot-stackstorm/blob/998141c83c251fe9eee1cb0eeb2e9c21e377691d/scripts/stackstorm.js#L382

Maybe there needs to be some retry logic in there to handle a 401 and re-auth?

NikonNLG commented 6 years ago

@nmaludy why you don't generate api key and use it without expiration? st2 apikey and export ST2_API_KEY should help you

shreyas-shinde commented 6 years ago

I have seen that error even with ST2_API_KEY . In this case there is not even "status: 401" [Mon Jun 25 2018 09:54:50 GMT+0000 (UTC)] INFO 46 commands are loaded [Mon Jun 25 2018 09:55:01 GMT+0000 (UTC)] ERROR Stream error: Event { type: 'error' } [Mon Jun 25 2018 09:55:32 GMT+0000 (UTC)] ERROR Stream error: Event { type: 'error' } [Mon Jun 25 2018 09:55:45 GMT+0000 (UTC)] ERROR Stream error: Event { type: 'error' } [Mon Jun 25 2018 09:56:10 GMT+0000 (UTC)] ERROR Stream error: Event { type: 'error' } [Mon Jun 25 2018 09:56:50 GMT+0000 (UTC)] INFO Loading commands....

I see above error but it still works and no idea about reason behind above message. But after some time (around 5 days) it stops to respond. Only ack is responded. st2 is fully functional as i checked from web. So issue is with st2chatops.

arm4b commented 5 years ago

I've set the token_ttl in st2.conf to some short time and couldn't 100% reproduce the bug with st2 v3.0.1.

I've found that sometimes token re-request failed, however bot was successfully able to recover itself after a couple of token retries (happens every 2mins by default):

[Thu Jun 20 2019 19:52:47 GMT+0000 (Coordinated Universal Time)] INFO Requesting a token...
[Thu Jun 20 2019 19:52:47 GMT+0000 (Coordinated Universal Time)] INFO Loading commands....
[Thu Jun 20 2019 19:52:47 GMT+0000 (Coordinated Universal Time)] INFO Token received. Expiring 2019-06-20T19:54:47.308223Z
[Thu Jun 20 2019 19:52:47 GMT+0000 (Coordinated Universal Time)] ERROR Failed to retrieve commands from "https://localhost/api": Unauthorized - Token has expired.
[Thu Jun 20 2019 19:54:47 GMT+0000 (Coordinated Universal Time)] INFO Loading commands....
[Thu Jun 20 2019 19:54:47 GMT+0000 (Coordinated Universal Time)] INFO Requesting a token...
[Thu Jun 20 2019 19:54:47 GMT+0000 (Coordinated Universal Time)] ERROR Failed to retrieve commands from "https://localhost/api": Unauthorized - Token has expired.
[Thu Jun 20 2019 19:54:47 GMT+0000 (Coordinated Universal Time)] INFO Token received. Expiring 2019-06-20T19:56:47.316771Z
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] INFO Loading commands....
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] INFO Requesting a token...
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: pack get <pack> - Get information about installed StackStorm pack.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: pack get {{ pack }} - Get information about installed StackStorm pack.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: pack install <pack_name|git_url>[,<pack_name|git_url>] - Install/upgrade StackStorm packs.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: pack install {{ packs }} - Install/upgrade StackStorm packs.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: pack search <query> - Search for packs in StackStorm Exchange and other directories.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: pack search {{ query }} - Search for packs in StackStorm Exchange and other directories.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: pack show <pack> - Show information about the pack from StackStorm Exchange.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: pack show {{ pack }} - Show information about the pack from StackStorm Exchange.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 list {{ limit=10 }} actions - List available StackStorm actions.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 list {{ limit=10 }} actions from {{ pack }} - List available StackStorm actions.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 get execution {{ id }} - Retrieve details for a single execution.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 show execution {{ id }} - Retrieve details for a single execution.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 executions get {{ id }} - Retrieve details for a single execution.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 executions show {{ id }} - Retrieve details for a single execution.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 list {{ limit=10 }} executions - List available StackStorm executions.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 list {{ limit=10 }} executions with status {{ status }} - List available StackStorm executions.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 list {{ limit=10 }} executions for action {{ action }} - List available StackStorm executions.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 re-run execution {{ id }} - Re-run an action execution.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 executions re-run {{ id }} - Re-run an action execution.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 respond to inquiry {{ id }} with {{ response }} - Respond to an Inquiry
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 list {{ limit=10 }} rules - List available StackStorm rules.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 list {{ limit=10 }} rules from {{ pack }} - List available StackStorm rules.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 list {{ limit=10 }} sensors - List available StackStorm sensors.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] DEBUG Added command: st2 list {{ limit=10 }} sensors from {{ pack }} - List available StackStorm sensors.
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] INFO 20 commands are loaded
[Thu Jun 20 2019 19:56:47 GMT+0000 (Coordinated Universal Time)] INFO Token received. Expiring 2019-06-20T19:58:47.329763Z

There is a diff in my log, you can find there that I didn't experience Stream errors as it shows in original issue report.

arm4b commented 5 years ago

Could reproduce it now with the following logic: