sensu-plugins / sensu-plugins-mailer

This plugin is an email handler for Sensu.
http://sensu-plugins.io
MIT License

contact based routing only accepts 1 email address #55

Closed devney closed 7 years ago

devney commented 7 years ago

When routing alerts to specific users, SOMETIMES it uses the address configured in the check. (I got 3 of these in a row!) More often it sends alerts to the email address configured as default in the handler definition.

Contacts:

{
  "contacts": {
    "sensu-alerts": {
      "email": {
        "to": "sensu-alerts@domain.com"
      }
    },
    "userA": {
      "email": {
        "to": "userA@domain.com"
      }
    },
    "userB": {
      "email": {
        "to": "userB@domain.com"
      }
    }
  }
}

Handler definition:

{
  "mailer": {
    "admin_gui": "http://10.1.0.158:4000/",
    "delivery_method": "smtp",
    "mail_from": "sensu-alerts@domain.com",
    "mail_to": "sensu-alerts@domain.com",
    "smtp_address": "localhost",
    "smtp_port": "25",
    "smtp_domain": "domain.com"
  },

  "handlers": {
    "mailer": {
      "type": "pipe",
      "command": "handler-mailer.rb",
      "severities": ["critical"]
    }
  }
}

Check definition:

{
  "checks": {
    "swapinout": {
      "command": "check-swapins 400",
      "subscribers": [
        "memory"
      ],
      "interval": 10,
      "occurrences": 6,
      "refresh": 300,
      "handlers": [ "default", "mailer" ],
      "contact": [ "userB", "userA" ]
    }
  }
}

Note that the check definition specifies contacts userA and userB, but most of the time email is still sent to the default address specified in the handler definition. Three times it was sent to the appropriate address for userA.

HOWEVER, in a different check configured identically, it is consistently sending to userA.

{
  "checks": {
    "disk-cs1": {
      "command": "check-disk-usage.rb -i /tmp,/mnt/2000a,/mnt/2000b,/mnt/2000c,/mnt/2000d,/run,/gfs -c 85 -x tmpfs,debugfs,tracefs",
      "subscribers": [
        "disk-cs1"
      ],
      "interval": 60,
      "occurrences": 3,
      "refresh": 300,
      "handlers": [ "default", "mailer" ],
      "contact": [ "userA" ]
    }
  }
}

UPDATE: The difference is whether 1 contact or multiple contacts are specified. From the documentation: "in a check definition, you can specify a contact or an array of contacts which should be notified by e-mail"

When an array of contacts is specified, it sends to the default address ONLY, and to none of the specified contacts.

When only 1 contact is specified, it sends to that 1 contact.

UPDATE 2: No, that's not it either.

devney commented 7 years ago

This is probably related to PR https://github.com/sensu-plugins/sensu-plugins-mailer/pull/44

stevenviola commented 7 years ago

@matthewdevney It sounds like you're describing two different things here. If I understand correctly, the two issues are:

Regarding the e-mails always being sent to the mail_to address, that was unchanged with #44, and contacts defined in checks are additive to the address in the mail_to, so you should expect the mail_to address to receive all e-mails (unless there is a mail_to in the client config)

As for the issue sending e-mails to multiple clients, I don't see anything wrong with the JSON you posted, and the only thing I can think is maybe if you're running in a cluster, not all the plugins have been updated, which is why it's only sending some of the time. Of the e-mails that did get to userA, can you check if userB's e-mail is specified in the to: field?

The only other guess could be SNMP, as I don't use that. Maybe that's stripping things or something odd along those lines. I've been running the contact routing changes since November and all of my clients have two contacts listed, and I've never seen anything like what you're describing.

If you have debug logging turned on, can you post the logs with the message handling event and then the following line with the message handler extension output. Also what version of sensu-server are you running?

devney commented 7 years ago

Version:

/opt/sensu/embedded/bin/ruby /opt/sensu/bin/sensu-server --version
0.29.0

However, the installed package says:

dpkg -l | grep sensu
ii sensu 0.28.5-2 amd64

No, emails are not always sent to the e-mail address in the mail_to field of the mailer config. In the 3 example emails that went to userA, they went ONLY to userA. Not to userB, not to default address. UserB's email is not in the to: field.
In later examples, emails were sent to both userA and the default address. I have not yet found any correlation between changes made and these emails.
I am NOT running a cluster. Only a single server which runs sensu-server and all related services, and sends mail to an MTA on localhost.
NOT using snmp at all in any way/shape/form.

I have turned on debug logging and should be able to get that 'handling event' and 'handler extension output' to you early next week.

stevenviola commented 7 years ago

That is strange. Based on your configs you posted, the expected behavior would be the following e-mails should be sent for the associated alerts:

swapinout: sensu-alerts@domain.com, userA@domain.com, userB@domain.com
disk-cs1: sensu-alerts@domain.com, userA@domain.com
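
The expected recipients listed above follow from the additive routing described earlier in this thread: the default mail_to (or a client-level mail_to, if present) plus every contact named on the check or the client. A minimal Python sketch of that logic, assuming the posted configs; it is an illustration only, not the handler's actual Ruby code, and the function name expected_recipients and the event layout are hypothetical:

```python
def expected_recipients(settings, event, handler="mailer"):
    """Return the addresses an event should reach, per the behavior described in this thread."""
    # Default address: a client-level mail_to takes the place of the handler's mail_to.
    default_to = event.get("client", {}).get("mail_to") or settings[handler]["mail_to"]
    recipients = [default_to]

    # Contacts are additive and may be set on the check, the client, or both.
    for section in ("check", "client"):
        contact = event.get(section, {}).get("contact", [])
        # "contact" may be a single name or an array of names.
        names = [contact] if isinstance(contact, str) else list(contact)
        for name in names:
            address = settings.get("contacts", {}).get(name, {}).get("email", {}).get("to")
            if address and address not in recipients:
                recipients.append(address)
    return recipients


# Settings and events reduced from the configs posted in this issue.
settings = {
    "mailer": {"mail_to": "sensu-alerts@domain.com"},
    "contacts": {
        "sensu-alerts": {"email": {"to": "sensu-alerts@domain.com"}},
        "userA": {"email": {"to": "userA@domain.com"}},
        "userB": {"email": {"to": "userB@domain.com"}},
    },
}
swapinout = {"client": {}, "check": {"name": "swapinout", "contact": ["userB", "userA"]}}
disk_cs1 = {"client": {}, "check": {"name": "disk-cs1", "contact": ["userA"]}}

print(expected_recipients(settings, swapinout))
print(expected_recipients(settings, disk_cs1))
```

Comparing this list against the to: field of a delivered email would show which of the three sources (default, check contact, client contact) is being dropped.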

The only thing different from my setup is I define the contacts in the client definition, not in the checks themselves, but the changes made iterate through both the check and client, so it shouldn't make a difference. It will be clear with the "handling event" line from the logs how the event is structured to see if there's any obvious issues with the parsing of the contacts, but if that were the issue, I would figure it would be an all or nothing type situation.

Looking forward to seeing the logs when you have them. Thanks

majormoses commented 7 years ago

@stevenviola as a maintainer running around in my free time doing what I can, I would like to thank you for helping troubleshoot this.

devney commented 7 years ago

My logs do not have any lines containing the string 'handler extension output'. Logs go back 7 days. /etc/default/sensu contains 'LOG_LEVEL=debug'. Please advise how to troubleshoot further. Thanks.

stevenviola commented 7 years ago

Are there other lines with "level":"debug"? Maybe somehow debug logging wasn't enabled. The logs should be in /var/log/sensu/sensu-server.log.
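
One way to answer that question is to count the "level" field across the log. A minimal Python sketch, assuming each log line is a JSON object like the one format used by sensu-server; the function name count_log_levels is hypothetical and the second sample line is invented for illustration:

```python
import json
from collections import Counter


def count_log_levels(lines):
    """Count the "level" field across JSON-formatted log lines, skipping unparsable ones."""
    levels = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a JSON log line
        if isinstance(entry, dict) and "level" in entry:
            levels[entry["level"]] += 1
    return levels


sample = [
    '{"timestamp":"2017-07-12T21:46:44.579439+0000","level":"warn","message":"config file applied changes"}',
    # Hypothetical line, included only so the sample has more than one level.
    '{"timestamp":"2017-07-12T21:46:45.000000+0000","level":"info","message":"example info line"}',
    "not json",
]
print(count_log_levels(sample))
```

To run it against the real file, pass an open file handle for /var/log/sensu/sensu-server.log instead of the sample list. A count of zero for "debug" would suggest that debug logging is not actually in effect.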

majormoses commented 7 years ago

@matthewdevney when did you upgrade to 0.28/0.29 ? That snippet regarding versions is alarming, could we try isolating the issue on an older sensu setup? Can you replicate this in multiple environments?

Also, can you run apt-cache policy sensu and give us the output? Something seems very wrong here; I am just not sure what yet from what has been posted. I suppose it's possible to install 0.28 through package management and then install the sensu gem with a different version. If that were true, then the block below would be evaluated and would report the version of the gem installed? https://github.com/sensu/sensu/blob/v0.28.5/lib/sensu/constants.rb#L2 Can you validate you don't have the sensu 0.29 gem installed with a 0.28 package installed? Try running this to validate: /opt/sensu/embedded/bin/gem list | grep sensu

@cwjohnston this is where it pulls this version from, right? https://github.com/sensu/sensu/blob/v0.28.5/lib/sensu/constants.rb#L4 If you have any thoughts on this, I would appreciate it.

devney commented 7 years ago

That does appear to be the case! I have sensu-0.29.0 installed from gem and 0.28.5-2 installed from apt. I don't recall upgrading at all.

I will attempt to remove the apt version (0.28.5-2) and see if the problem persists.

sensu:
  Installed: 0.28.5-2
  Candidate: 0.29.0-11
  Version table:
     0.29.0-11 0
        500 https://sensu.global.ssl.fastly.net/apt/ jessie/main amd64 Packages
     0.29.0-10 0
        500 https://sensu.global.ssl.fastly.net/apt/ jessie/main amd64 Packages
     0.29.0-7 0
        500 https://sensu.global.ssl.fastly.net/apt/ jessie/main amd64 Packages
 *** 0.28.5-2 0
        500 https://sensu.global.ssl.fastly.net/apt/ jessie/main amd64 Packages
        100 /var/lib/dpkg/status
     0.28.4-1 0
        500 https://sensu.global.ssl.fastly.net/apt/ jessie/main amd64 Packages
     0.28.3-1 0
        500 https://sensu.global.ssl.fastly.net/apt/ jessie/main amd64 Packages
     0.28.2-1 0
        500 https://sensu.global.ssl.fastly.net/apt/ jessie/main amd64 Packages

The version table goes on like that back to 0.20.0-1.
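
The mismatch found here (package 0.28.5-2, gem 0.29.0) can be checked mechanically by comparing the two command outputs. A minimal Python sketch, assuming the text of apt-cache policy sensu and /opt/sensu/embedded/bin/gem list has already been captured; the helper names are hypothetical, and the gem list sample is reconstructed from the versions reported in this thread rather than copied from real output:

```python
import re


def installed_package_version(apt_policy_output):
    """Extract the 'Installed:' version from `apt-cache policy` output."""
    match = re.search(r"^\s*Installed:\s*(\S+)", apt_policy_output, re.MULTILINE)
    return match.group(1) if match else None


def gem_versions(gem_list_output, gem_name="sensu"):
    """Extract the versions listed for one gem in `gem list` output."""
    pattern = r"^" + re.escape(gem_name) + r" \(([^)]*)\)"
    match = re.search(pattern, gem_list_output, re.MULTILINE)
    return [v.strip() for v in match.group(1).split(",")] if match else []


def upstream(version):
    """Drop the Debian revision suffix, e.g. '0.28.5-2' -> '0.28.5'."""
    return version.rsplit("-", 1)[0]


apt_output = """sensu:
  Installed: 0.28.5-2
  Candidate: 0.29.0-11
"""
# Hypothetical `gem list` output matching the versions reported in this thread.
gem_output = "sensu (0.29.0)\nsensu-plugins-mailer (1.2.0)\n"

package = installed_package_version(apt_output)
gems = gem_versions(gem_output)
if package and upstream(package) not in gems:
    print(f"mismatch: package {package}, gem {gems}")
```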

majormoses commented 7 years ago

@matthewdevney I'd recommend the other way around: remove the gem, and if you want to do an upgrade (I don't recommend it right now), then do it via apt.

devney commented 7 years ago

Well both apt and gem versions are now 0.29.0. When I removed the sensu gem sensu-server wouldn't start. I should mention that I don't have ruby installed systemwide; the sensu gem is installed in sensu (apt version)'s embedded ruby.

I will monitor for recurrence of the suspect behavior.

majormoses commented 7 years ago

Do you still need help? Otherwise I will close this in a few days, as I notice https://github.com/matthewdevney no longer exists, so I doubt they will even get this message.

devney commented 7 years ago

At this point only the default user configured in the handler definition is getting emails. Not userA, not userB. Neither of those addresses is in the to: field.

majormoses commented 7 years ago

@devney ah, you are back. I assume you changed your GitHub username and it took a while to update this on your previous comments.

OK will leave this open for now but I don't see anything that I can do atm since it all looks good and I do not use this handler.

stevenviola commented 7 years ago

@devney were you able to get the logs from the debug output after sorting out the version issues? Also, can you confirm you have the right version of the mailer after you reinstalled? The command should be: /opt/sensu/embedded/bin/gem list sensu-plugins-mailer

devney commented 7 years ago

sensu-plugins-mailer (1.2.0)

I am still not seeing the string 'handler extension output' in any logs. Please confirm the proper way to turn on debug logging. Thanks!

stevenviola commented 7 years ago

You need to set LOG_LEVEL=debug in /etc/default/sensu and then restart the sensu-server. Check /var/log/sensu/sensu-server.log to confirm that you're seeing debug log messages. Any debug logs for the swapinout and disk-cs1 checks will also be useful to make sure that the check definition is being processed properly by the server.

Also, there should be lines with the message "config file applied changes" upon boot. If you could provide the log line that has the contacts JSON, then we can also confirm that it is being loaded properly as well.

devney commented 7 years ago

Confirmed LOG_LEVEL=debug is in /etc/default/sensu . Still not seeing 'handler extension output' in logs.

sensu-server.log.1.gz:{"timestamp":"2017-07-12T21:46:44.579439+0000","level":"warn","message":"config file applied changes","file":"/etc/sensu/conf.d/contacts.json","changes":{"contacts":[null,{"sensu-alerts":{"email":{"to":"sensu-alerts@domain"}},"devney":{"email":{"to":"devney@domain"}},"farshid":{"email":{"to":"farshid@domain"}}}]}}
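
The log line above can be checked against the names used in a check's contact field. A minimal Python sketch, assuming the "changes" value is an [old, new] pair as the logged [null, {...}] suggests; the function names and the contact names passed to missing_contacts are hypothetical:

```python
import json


def contacts_from_log_line(line):
    """Return {contact name: address} from a 'config file applied changes' log line."""
    # Drop any grep-style "filename:" prefix before the JSON object.
    entry = json.loads(line[line.index("{"):])
    change = entry.get("changes", {}).get("contacts")
    if not change:
        return {}
    # Assumption: the logged change is [old value, new value]; use the new value.
    new_value = change[1] or {}
    return {name: cfg.get("email", {}).get("to") for name, cfg in new_value.items()}


def missing_contacts(loaded, names):
    """Return the contact names that are referenced but were not loaded."""
    return [name for name in names if name not in loaded]


log_line = (
    'sensu-server.log.1.gz:{"timestamp":"2017-07-12T21:46:44.579439+0000",'
    '"level":"warn","message":"config file applied changes",'
    '"file":"/etc/sensu/conf.d/contacts.json","changes":{"contacts":[null,'
    '{"sensu-alerts":{"email":{"to":"sensu-alerts@domain"}},'
    '"devney":{"email":{"to":"devney@domain"}},'
    '"farshid":{"email":{"to":"farshid@domain"}}}]}}'
)

loaded = contacts_from_log_line(log_line)
print(loaded)
# Illustrative names only; substitute the values from the check's "contact" field.
print(missing_contacts(loaded, ["devney", "nobody"]))
```

Any name reported as missing is one the handler could not resolve to an address from the loaded contacts.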

I'm going to close this ticket because it just plain doesn't work and we're not making any progress troubleshooting. Need to find a different solution.

majormoses commented 7 years ago

@devney sorry to hear that, any chance you are coming to the sensu summit? I would like to try to do a live troubleshooting session with you. Unfortunately there is not much for us to go with given what is currently in the logs...

Seljuke commented 5 years ago

I just noticed that the Sensu documentation here writes "contacts", with a plural "s". So maybe it is just a misspelling.