hubot-archive / hubot-pager-me

PagerDuty integration for Hubot
https://www.npmjs.com/package/hubot-pager-me
MIT License
73 stars 92 forks source link

"who's on call" broke in hubot-pager-me 2.1.14 #92

Closed augieschwer closed 8 years ago

augieschwer commented 8 years ago

Prior to hubot-pager-me 2.1.14 the sub-command "who's on call" returned the appropriate list of users and schedules, starting at 2.1.14 and beyond an empty set is returned.

➜  sbo-slack-hubot git:(augie-dev) grep pager package.json
    "hubot-pager-me": "2.1.13",
sbo-slack-hubot> sbo-slack-hubot who's on call
sbo-slack-hubot> [Thu Jul 14 2016 13:37:20 GMT-0700 (PDT)] DEBUG Message 'sbo-slack-hubot who's on call' matched regex //^\s*[@]?sbo\-slack\-hubot[:,]?\s*(?:who(?:’s|'s|s| is|se)? (?:on call|oncall|on-call)(?: (?:for )?(.*?)(?:\?|$))?)/i/; listener.options = { id: null }
[Thu Jul 14 2016 13:37:20 GMT-0700 (PDT)] DEBUG Executing listener callback for Message 'sbo-slack-hubot who's on call'
* S is on call for 24/7 - https://safari.pagerduty.com/schedules#PP6
* R is on call for Client Services Day - https://foo.pagerduty.com/schedules#PAU
* C is on call for Escalation Backup - https://foo.pagerduty.com/schedules#PN1
* A is on call for A - Business Hours - https://foo.pagerduty.com/schedules#P81
* M is on call for M - Business Hours - https://foo.pagerduty.com/schedules#PZD
* CG is on call for Engineer Triage - https://foo.pagerduty.com/schedules#P2P
* C is on call for C - Business Hours - https://foo.pagerduty.com/schedules#P60
➜  sbo-slack-hubot git:(augie-dev) ✗ grep pager package.json
    "hubot-pager-me": "2.1.14",
sbo-slack-hubot> sbo-slack-hubot who's on call
sbo-slack-hubot> [Thu Jul 14 2016 13:40:46 GMT-0700 (PDT)] DEBUG Message 'sbo-slack-hubot who's on call' matched regex //^\s*[@]?sbo\-slack\-hubot[:,]?\s*(?:who(?:’s|'s|s| is|se)? (?:on call|oncall|on-call)(?: (?:for )?(.*?)(?:\?|$))?)/i/; listener.options = { id: null }
[Thu Jul 14 2016 13:40:46 GMT-0700 (PDT)] DEBUG Executing listener callback for Message 'sbo-slack-hubot who's on call'
abdelazer commented 8 years ago

Here's more of a stack trace:

2016-06-13T16:58:43.826176+00:00 app[web.1]: [Mon Jun 13 2016 09:58:43 GMT-0700 (PDT)] ERROR TypeError: Cannot read property 'length' of undefined
2016-06-13T16:58:43.826209+00:00 app[web.1]:   at SlackBot.send (/app/node_modules/hubot-slack/src/slack.coffee:226:7, <js>:275:16)
2016-06-13T16:58:43.826212+00:00 app[web.1]:   at runAdapterSend (/app/node_modules/hubot/src/response.coffee:82:34, <js>:87:50)
2016-06-13T16:58:43.826213+00:00 app[web.1]:   at allDone (/app/node_modules/hubot/src/middleware.coffee:44:37, <js>:34:16)
2016-06-13T16:58:43.826213+00:00 app[web.1]:   at /app/node_modules/hubot/node_modules/async/lib/async.js:274:13
2016-06-13T16:58:43.826214+00:00 app[web.1]:   at Object.async.eachSeries (/app/node_modules/hubot/node_modules/async/lib/async.js:142:20)
2016-06-13T16:58:43.826214+00:00 app[web.1]:   at Object.async.reduce (/app/node_modules/hubot/node_modules/async/lib/async.js:268:15)
2016-06-13T16:58:43.826215+00:00 app[web.1]:   at /app/node_modules/hubot/src/middleware.coffee:49:7, <js>:37:22
2016-06-13T16:58:43.826216+00:00 app[web.1]:   at process._tickCallback (node.js:458:13)
2016-06-13T16:58:43.826216+00:00 app[web.1]: 

...which lead me to wonder about whether msg got lost (in a variable scope sense) in all of the rewrites of the who's-on-call handler in recent releases, although I couldn't figure out where exactly.

awiddersheim commented 8 years ago

Strange. Everything is working fine for me in 2.1.16. Though I just noticed @stephenyeargin never bumped the version in packages.json when he did some pull requests.

The only change in 2.1.14 was https://github.com/hubot-scripts/hubot-pager-me/commit/c6f15fa713b627f262710d02240c5b70800d1e61 (by me) as far as I can tell.

There were a few more fixes that might help in 2.1.16 that might be worth trying if you haven't already. Might be something specific to Slack. Things are working for me in the shell, Campfire and Hipchat adapters.

abdelazer commented 8 years ago

We tried 2.1.14, 2.1.15, and 2.1.16 this morning and they all had the same result (no response).

awiddersheim commented 8 years ago

That sucks. Like I said I'm not able to replicate this in other adapters and I don't have access to a Slack channel right now where I can spit PagerDuty output to. Maybe try playing with the code some in the area of the commit I linked to earlier and see where things are falling over.

abdelazer commented 8 years ago

We poked a bit on 2.1.16 and my sense is that the async.map() without results (the exact change you mentioned) is the culprit. I tried but failed to get it back to the same async.map schedules, renderSchedule, (err, results) -> sequence as is used elsewhere.

awiddersheim commented 8 years ago

You should be able to do something like a console.log(messages) before the message gets sent which should send output to the Hubot logs and you can let us know what array has.

Basically, things start off with an empty array that should eventually get populated by this bit of code.

Although looking at your error message I'm wondering if things are failing here. Doing that extra debugging should help us see that though.

awiddersheim commented 8 years ago

Maybe we should just go back to how things worked before https://github.com/hubot-scripts/hubot-pager-me/commit/ef1d84bd6078a6cb9134ac8733c61f9babd3984e. Not really sure what this was trying to solve. The commit message leads me to believe they wanted the schedules to be sent all in one message instead of individual messages but that seems to be how it worked before as well so it's pretty confusing. Here is a diff you can try:

--- a/src/scripts/pagerduty.coffee
+++ b/src/scripts/pagerduty.coffee
@@ -578,30 +578,28 @@ module.exports = (robot) ->

     scheduleName = msg.match[3] or msg.match[4]

-    messages = []
     renderSchedule = (s, cb) ->
       withCurrentOncall msg, s, (username, schedule) ->
-        messages.push("* #{username} is on call for #{schedule.name} - https://#{pagerduty.subdomain}.pagerduty.com/schedules
-        cb null
+        cb null, "* #{username} is on call for #{schedule.name} - https://#{pagerduty.subdomain}.pagerduty.com/schedules##{sc

     if scheduleName?
       withScheduleMatching msg, scheduleName, (s) ->
-        renderSchedule s, (err) ->
+        renderSchedule s, (err, result) ->
           if err?
             robot.emit 'error'
             return
-          msg.send messages.join("\n")
+          msg.send result
     else
       pagerduty.getSchedules (err, schedules) ->
         if err?
           robot.emit 'error', err, msg
           return
         if schedules.length > 0
-          async.map schedules, renderSchedule, (err) ->
+          async.map schedules, renderSchedule, (err, results) ->
             if err?
               robot.emit 'error', err, msg
               return
-            msg.send messages.join("\n")
+            msg.send results.join("\n")
         else
           msg.send 'No schedules found!'
augieschwer commented 8 years ago

I put some debug code in and you can see in the output that async.map is calling renderSchedule 11 times for each element of schedules and then just falls off the face of the earth. It’s like that call to async.map is just a blackhole. Like it might as well just say return.

  # who is on call?
  robot.respond /who(?:’s|'s|s| is|se)? (?:on call|oncall|on-call)(?: (?:for )?((["'])([^]*?)\2|(.*?))(?:\?|$))?$/i, (msg) ->
    if pagerduty.missingEnvironmentForApi(msg)
      return

    scheduleName = msg.match[3] or msg.match[4]

    messages = []
    renderSchedule = (s, cb) ->
      robot.logger.debug 'renderSchedule'
      withCurrentOncall msg, s, (username, schedule) ->
        robot.logger.debug "Pushing message: * #{username} is on call for #{schedule.name} - https://#{pagerduty.subdomain}.pagerduty.com/schedules##{schedule.id}"
        messages.push("* #{username} is on call for #{schedule.name} - https://#{pagerduty.subdomain}.pagerduty.com/schedules##{schedule.id}")
        cb null

    if scheduleName?
      robot.logger.debug "scheduleName: #{schedule.name}"
      withScheduleMatching msg, scheduleName, (s) ->
        renderSchedule s, (err) ->
          if err?
            robot.emit 'error'
            return
          msg.send messages.join("\n")
    else
      robot.logger.debug 'pagerduty.getSchedules'
      pagerduty.getSchedules (err, schedules) ->
        if err?
          robot.emit 'error', err, msg
          return
        if schedules.length > 0
          robot.logger.debug "Found #{schedules.length} schedules"
          async.map schedules, renderSchedule, (err) ->
            if err?
              robot.emit 'error', err, msg
              return
            else
              robot.logger.debug "Sending #{messages}"
              msg.send messages.join("\n")
        else
          msg.send 'No schedules found!'
sbo-slack-hubot> [Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG Message 'sbo-slack-hubot who's on call' matched regex //^\s*[@]?sbo\-slack\-hubot[:,]?\s*(?:who(?:’s|'s|s| is|se)? (?:on call|oncall|on-call)(?: (?:for )?((["'])([^]*?)\2|(.*?))(?:\?|$))?$)/i/; listener.options = { id: null }
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG Executing listener callback for Message 'sbo-slack-hubot who's on call'
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG pagerduty.getSchedules
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG Found 11 schedules
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG Pushing message: * D is on call for D - Business Hours - https://foo.pagerduty.com/schedules#P04
[Fri Jul 15 2016 07:07:26 GMT-0700 (PDT)] DEBUG Pushing message: * S is on call for 24/7 - https://foo.pagerduty.com/schedules#PP6
[Fri Jul 15 2016 07:07:27 GMT-0700 (PDT)] DEBUG Pushing message: * R is on call for Client Services Day - https://foo.pagerduty.com/schedules#PAU
[Fri Jul 15 2016 07:07:27 GMT-0700 (PDT)] DEBUG Pushing message: * C is on call for Escalation Backup - https://foo.pagerduty.com/schedules#PN1
stephenyeargin commented 8 years ago

Strange. Everything is working fine for me in 2.1.16. Though I just noticed @stephenyeargin never bumped the version in packages.json when he did some pull requests.

Hmm. Looks like the grunt task failed on a push of the commit/tag that created 2.1.16. That's up to date now. Should not affect the code, however -- the only missing commit was the one to bump the JSON. Presumably, what is published to npm is up to speed.

awiddersheim commented 8 years ago

@stephenyeargin thanks for getting that fixed. @augieschwer what happens if you apply that diff I posted? That should bring the code back inline with how things were working prior to 2.1.14. If that is broken then that would be pretty strange as well.

augieschwer commented 8 years ago

@awiddersheim The diff didn't help unfortunately. But I think I found where the bug is, although I don't quite understand why it's a bug. In withCurrentOncallUser the last condition is not met for all our schedules if I give it some dummy data it at least returns messages.

 withCurrentOncallUser = (msg, schedule, cb) ->
    oneHour = moment().add(1, 'hours').format()
    now = moment().format()

    scheduleId = schedule.id
    if (schedule instanceof Array && schedule[0])
      scheduleId = schedule[0].id
    unless scheduleId
      msg.send "Unable to retrieve the schedule. Use 'pager schedules' to list all schedules."
      return

    query = {
      since: now,
      until: oneHour,
      overflow: 'true'
    }
    pagerduty.get "/schedules/#{scheduleId}/entries", query, (err, json) ->
      if err?
        robot.emit 'error', err, msg
        return
      if json.entries and json.entries.length > 0
        cb(json.entries[0].user, schedule)
      else
        robot.logger.debug "No callback performed. json.entries.length: #{json.entries.length}"
        cb('foo', schedule)
sbo-slack-hubot> sbo-slack-hubot who's on call
sbo-slack-hubot> [Fri Jul 15 2016 10:27:47 GMT-0700 (PDT)] DEBUG Message 'sbo-slack-hubot who's on call' matched regex //^\s*[@]?sbo\-slack\-hubot[:,]?\s*(?:who(?:’s|'s|s| is|se)? (?:on call|oncall|on-call)(?: (?:for )?((["'])([^]*?)\2|(.*?))(?:\?|$))?$)/i/; listener.options = { id: null }
[Fri Jul 15 2016 10:27:47 GMT-0700 (PDT)] DEBUG Executing listener callback for Message 'sbo-slack-hubot who's on call'
[Fri Jul 15 2016 10:27:47 GMT-0700 (PDT)] DEBUG pagerduty.getSchedules
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG Found 11 schedules
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG renderSchedule
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG No callback performed. json.entries.length: 0
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG Pushing message: * undefined is on call for A - Business Hours - https://foo.pagerduty.com/schedulesP81
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG No callback performed. json.entries.length: 0
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG Pushing message: * undefined is on call for Client Services - Non WorkHours - https://foo.pagerduty.com/schedulesP1L
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG No callback performed. json.entries.length: 0
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG Pushing message: * undefined is on call for D - Business Hours - https://foo.pagerduty.com/schedulesP04
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG No callback performed. json.entries.length: 0
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG Pushing message: * undefined is on call for C - Business Hours - https://foo.pagerduty.com/schedulesP60
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG Pushing message: * S is on call for 24/7 - https://foo.pagerduty.com/schedulesPP6
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG Pushing message: * C  is on call for Escalation Backup - https://foo.pagerduty.com/schedulesPN1
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG No callback performed. json.entries.length: 0
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG Pushing message: * undefined is on call for M - Business Hours - https://foo.pagerduty.com/schedulesPZD
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG Pushing message: * CG is on call for Engineer Triage - https://foo.pagerduty.com/schedulesP2P
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG No callback performed. json.entries.length: 0
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG Pushing message: * undefined is on call for S - Business Hours - https://foo.pagerduty.com/schedulesPH6
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG No callback performed. json.entries.length: 0
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG Pushing message: * undefined is on call for J - Business Hours - https://foo.pagerduty.com/schedulesPH3
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG Pushing message: * R is on call for Client Services Day - https://foo.pagerduty.com/schedulesPAU
[Fri Jul 15 2016 10:27:48 GMT-0700 (PDT)] DEBUG Sending * undefined is on call for A - Business Hours - https://foo.pagerduty.com/schedulesP81,* undefined is on call for Client Services - Non WorkHours - https://foo.pagerduty.com/schedulesP1L,* undefined is on call for D - Business Hours - https://foo.pagerduty.com/schedulesP04,* undefined is on call for C - Business Hours - https://foo.pagerduty.com/schedulesP60,* S is on call for 24/7 - https://foo.pagerduty.com/schedulesPP6,* C  is on call for Escalation Backup - https://foo.pagerduty.com/schedulesPN1,* undefined is on call for M - Business Hours - https://foo.pagerduty.com/schedulesPZD,* CG is on call for Engineer Triage - https://foo.pagerduty.com/schedulesP2P,* undefined is on call for S - Business Hours - https://foo.pagerduty.com/schedulesPH6,* undefined is on call for J - Business Hours - https://foo.pagerduty.com/schedulesPH3,* R is on call for Client Services Day - https://foo.pagerduty.com/schedulesPAU
* undefined is on call for A - Business Hours - https://foo.pagerduty.com/schedulesP81
* undefined is on call for Client Services - Non WorkHours - https://foo.pagerduty.com/schedulesP1L
* undefined is on call for D - Business Hours - https://foo.pagerduty.com/schedulesP04
* undefined is on call for C - Business Hours - https://foo.pagerduty.com/schedulesP60
* S is on call for 24/7 - https://foo.pagerduty.com/schedulesPP6
* C  is on call for Escalation Backup - https://foo.pagerduty.com/schedulesPN1
* undefined is on call for M - Business Hours - https://foo.pagerduty.com/schedulesPZD
* CG is on call for Engineer Triage - https://foo.pagerduty.com/schedulesP2P
* undefined is on call for S - Business Hours - https://foo.pagerduty.com/schedulesPH6
* undefined is on call for J - Business Hours - https://foo.pagerduty.com/schedulesPH3
* R is on call for Client Services Day - https://foo.pagerduty.com/schedulesPAU
awiddersheim commented 8 years ago

@augieschwer cool. Looks like you figured it out. Was just about to say no code changes happened in that area that I can tell.

augieschwer commented 8 years ago

The PR handles schedules without a user tied to it by returning null up the chain of call backs, and logging a single debug message to record that case.

I'm still not clear why this code seems to make async.map happy where the other code did not.

But the output looks how we expect it now.

sbo-slack-hubot> sbo-slack-hubot who's on call
sbo-slack-hubot> [Fri Jul 15 2016 12:52:57 GMT-0700 (PDT)] DEBUG Message 'sbo-slack-hubot who's on call' matched regex //^\s*[@]?sbo\-slack\-hubot[:,]?\s*(?:who(?:’s|'s|s| is|se)? (?:on call|oncall|on-call)(?: (?:for )?((["'])([^]*?)\2|(.*?))(?:\?|$))?$)/i/; listener.options = { id: null }
[Fri Jul 15 2016 12:52:57 GMT-0700 (PDT)] DEBUG Executing listener callback for Message 'sbo-slack-hubot who's on call'
[Fri Jul 15 2016 12:52:57 GMT-0700 (PDT)] DEBUG No user for schedule C - Business Hours
[Fri Jul 15 2016 12:52:57 GMT-0700 (PDT)] DEBUG No user for schedule Client Services - Non WorkHours
[Fri Jul 15 2016 12:52:57 GMT-0700 (PDT)] DEBUG No user for schedule D - Business Hours
[Fri Jul 15 2016 12:52:57 GMT-0700 (PDT)] DEBUG No user for schedule J - Business Hours
[Fri Jul 15 2016 12:52:57 GMT-0700 (PDT)] DEBUG No user for schedule M - Business Hours
[Fri Jul 15 2016 12:52:57 GMT-0700 (PDT)] DEBUG No user for schedule A - Business Hours
[Fri Jul 15 2016 12:52:57 GMT-0700 (PDT)] DEBUG No user for schedule S - Business Hours
* S is on call for 24/7 - https://foo.pagerduty.com/schedulesPP6
* C is on call for Escalation Backup - https://foo.pagerduty.com/schedulesPN1
* R is on call for Client Services Day - https://foo.pagerduty.com/schedulesPAU
* CG is on call for Engineer Triage - https://foo.pagerduty.com/schedulesP2P

Does anyone else have schedules without users attached to them? We use these for being on-call during certain times of day, but not all day.

stephenyeargin commented 8 years ago

Fixed with v2.1.17, closed via #93

awiddersheim commented 8 years ago

@augieschwer nope. All of our schedules have someone assigned.