sensu / sensu-chef

Sensu Chef cookbook.
https://supermarket.chef.io/cookbooks/sensu
Apache License 2.0

Question: How to add check filtration helper methods/property in sensu_check resource. #545

Closed mayankgtal closed 7 years ago

mayankgtal commented 7 years ago

I'm curious how other people use the sensu_check resource to configure different checks on the different kinds of servers they apply to.

Scenario: We have 200 checks defined in a recipe. When the recipe runs, it configures all checks on every server where it is run. To prevent this, I have created a helper library that responds with whether the current resource/check should be created on that server or not. To do so, the library uses a hash of {[check_name_lists] , [applicable_servers_lists]} and the host_name of the server where the recipe is running.

    sensu_check "apache_recovery" do
      command "sh /tmp/apache_recovery.sh"
      handlers ["apache"]
      subscribers ["#{node['wrapper_cookbook_name']['instance']["hostname"]}"]
      standalone false
      publish false
      action :create
    end

Hash of applicable checks

default['cookbook_name']['checks'] = [
  {
    'checks' => ['apache', 'apache-recovery'],
    'applicable_servers' => ['httpd', 'apache-server'],
    'short_name' => 'httpd',
    'active' => true
  },
  {}
]

Issue: The library cannot respond correctly because I'm unable to get the name of the currently executing resource, i.e. apache_recovery, from the sensu_check resource. Once I can get the resource name in the library function for every defined sensu_check resource, I can easily determine during the chef-client run whether the current check applies to that server.
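For illustration, the lookup described above could be sketched in plain Ruby, outside of Chef. The names `CHECK_GROUPS`, `allowed_checks_for` and `check_applicable?` are hypothetical, and matching a server pattern as a substring of the hostname is an assumption rather than anything the cookbook provides. The sketch spells the check name `apache_recovery` so that it matches the resource name exactly.

```ruby
# Hypothetical data mirroring the attribute layout shown above.
CHECK_GROUPS = [
  {
    'checks' => %w[apache apache_recovery],
    'applicable_servers' => %w[httpd apache-server],
    'short_name' => 'httpd',
    'active' => true
  },
  {} # empty placeholder entries are skipped because they are not active
].freeze

# Returns the names of all checks whose group is active and lists a server
# pattern that occurs in the hostname (substring matching is an assumption).
def allowed_checks_for(hostname, groups = CHECK_GROUPS)
  groups
    .select { |group| group['active'] }
    .select { |group| Array(group['applicable_servers']).any? { |s| hostname.include?(s) } }
    .flat_map { |group| Array(group['checks']) }
    .uniq
end

# Guard-style predicate: should this check be created on this host?
def check_applicable?(check_name, hostname, groups = CHECK_GROUPS)
  allowed_checks_for(hostname, groups).include?(check_name.to_s)
end
```

With a predicate like this, the remaining problem is only how to hand the check name to it from each sensu_check resource.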

Please let me know how other community members manage checks with this cool community cookbook within their wrapper cookbooks.

Evesy commented 7 years ago

We have our sensu config split out into a couple of different cookbooks. (For reference we only use subscription checks, no standalones).

The Sensu server cookbook is run on the Sensu server nodes. It configures the Sensu server/API and its relevant config, all check definitions, handlers, extensions etc.

We then have the client cookbook, which is pulled into every single server's run list. It configures the Sensu client and its relevant config, and installs all plugin gems, as well as any custom scripts we have.

The decision was made to roll out all the plugins to every server (e.g. a redis server will still have elasticsearch gems) for easier maintainability. It also gives some confidence that anyone who wants to use an existing check will already have the required dependencies on their client.

Our current strategy probably has many flaws though, and it does not account for standalone checks or subscription checks using safe mode.

mayankgtal commented 7 years ago

@Evesy, thanks for sharing your implementation strategy. From this I assume you have plenty of config file templates dedicated to different types of servers, and that the same check (if most servers require it) may end up in many config file templates, which complicates things when a small change is required in that check's configuration.

To simplify this, I'm trying to inject one property (valid in the code below) into the sensu_check resource by overriding its resources and providers.

sensu_check "apache_recovery" do
  command "sh /tmp/apache_recovery.sh"
  handlers ["apache"]
  subscribers ["#{node['wrapper_cookbook_name']['instance']["hostname"]}"]
  standalone false
  publish false
  valid isCheckValid?
  action :create
end

The check above would be created only if isCheckValid? returns true. The check validation logic would be abstracted into either libraries or providers. Since the provider can gather resource state information from the resource_collection, we can implement this functionality.

I believe default filtering logic could be implemented in the upstream sensu cookbook, and other users could then override this method with their own check filtering logic.

majormoses commented 7 years ago

Rather than having just mycookbook['checks'] as a single object with all your checks, we decided to have each check be defined something like this:

default['cc']['sensu']['checks']['load']['command'] = 'check-load.rb -p -w 2.0,2.5,2.75 -c 3.5,3.25,3.0'
default['cc']['sensu']['checks']['load']['handlers'] = ['pagerduty']
default['cc']['sensu']['checks']['load']['subscribers'] = ['base']
default['cc']['sensu']['checks']['load']['interval'] = 15
default['cc']['sensu']['checks']['load']['additional'] = {
  'pager_team' => 'non_urgent',
  'notification' => 'Load is unusually high. Load number shown is per core.',
  'occurrences' => 24,
  'subdue' => {
    'days' => {
      'all' => [
        {
          'begin' => '6PM PST',
          'end' => '10AM PST'
        }
      ]
    }
  }
}

And then our recipe for generating the checks looks something like this:

node['cc']['sensu']['checks'].each do |name, check|
  sensu_check name do
    if check['command']
      if node['cc']['sensu']['checks'][name]['custom']
        command "#{node['sensu']['directory']}/plugins/" + check['command']
      else
        command check['command']
      end
    else
      Chef::Log.error "Unable to find a command for check: #{name}"
    end
    if check['handlers']
      handlers check['handlers']
    else
      Chef::Log.info "Warning there are no handlers defined for check: #{name}"
    end
    if check['subscribers']
      subscribers check['subscribers']
    else
      Chef::Log.error "Unable to find a subscriber for check: #{name}"
    end
    if check['interval']
      interval check['interval']
    else
      Chef::Log.info "No defined interval for check: #{name}, using default of #{node['cc']['sensu']['check_interval']}"
      interval node['cc']['sensu']['check_interval']
    end

    publish check['publish'] unless check['publish'].nil?

    if check['additional']
      additional check['additional']
    else
      Chef::Log.error "No defined additional for check: #{name}, this includes occurrences, notification message, etc."
    end
    # use `unless nil?` instead of `if` because it's a boolean value (false must still be applied)
    standalone check['standalone'] unless check['standalone'].nil?
    if check['low_flap_threshold'] && check['high_flap_threshold']
      if check['high_flap_threshold'] > check['low_flap_threshold']
        low_flap_threshold check['low_flap_threshold']
        high_flap_threshold check['high_flap_threshold']
      end
    end

    only_if do
      # Generate all checks on sensu server, base checks, or checks that are in the intersection of roles and subscribers.
      (node['roles'].include? 'sensu_server')\
      || (check['subscribers'].include? 'base')\
      || (!node['cc']['sensu']['roles'].nil?\
      && !(check['subscribers'] & node['cc']['sensu']['roles']).empty?)
    end
  end
end

One of the benefits of this is that you can more easily override things per environment (env cookbooks, roles, environments, etc.). This approach also means that checks are installed only on machines that are meant to be subscribers rather than on every machine, which saves time during convergence.
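The only_if condition in the recipe above can be read as a three-way rule. Here is a minimal sketch of it as a standalone Ruby method, assuming the roles and subscribers are passed in as plain arrays; the method name `generate_check?` and its parameters are hypothetical. Unlike the recipe, the sketch treats a missing subscribers list as empty instead of raising.

```ruby
# Decides whether a check definition should be generated on a node.
# node_roles:  roles from the node's run list (node['roles'] in the recipe)
# sensu_roles: subscription roles of the node (node['cc']['sensu']['roles']), may be nil
def generate_check?(check, node_roles, sensu_roles)
  subscribers = Array(check['subscribers'])

  return true if node_roles.include?('sensu_server') # the server gets every check
  return true if subscribers.include?('base')        # base checks go everywhere

  # Otherwise the node's roles and the check's subscribers must overlap.
  !(subscribers & Array(sensu_roles)).empty?
end

redis_check = { 'subscribers' => ['redis'] }
generate_check?(redis_check, ['app'], %w[redis web]) # overlap on 'redis'
generate_check?(redis_check, ['app'], ['web'])       # no overlap
```

Keeping the rule in one method makes it possible to exercise it without a Chef run.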

mayankgtal commented 7 years ago

@majormoses This is a really stable and flexible attribute-driven implementation. Developers just need to add the appropriate check attributes, and everything else is configured by the conditional statements in the check recipe.

FYI, my implementation strategy is based on just one guard condition, i.e.

class Chef::Recipe::SensuHelper
    class << self
        def isCheckValid?(resource=nil)
            # Abort the run when the caller did not pass a resource name.
            Chef::Application.fatal!("No resource name parameter is passed during function call") if resource.nil?
            rn = getCurrentResourceName(resource)
            get_allowed_checks_on_current_server.include?("#{rn}")
        end
    end
end

    sensu_check "apache_recovery" do |check|
      command "sh /tmp/apache_recovery.sh"
      handlers ["apache"]
      subscribers ["#{node['wrapper_cookbook_name']['instance']["hostname"]}"]
      standalone false
      publish false
      action :create
      only_if {SensuHelper.isCheckValid?("#{check}")}
    end

Not much of a wrapper, just abstracted helper methods. There are definitely a few attributes that need to be overridden for different environments and roles, and those are handled in the usual way.
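As a rough sketch of what such a guard helper might do internally, here is a plain-Ruby stand-in that does not depend on Chef. It assumes the string handed to the guard is either a bare check name or the `type[name]` form that Chef uses when printing a resource. The module `SensuHelperSketch` and both of its methods are hypothetical, and the list of allowed checks is passed in explicitly rather than looked up from the node.

```ruby
# Plain-Ruby stand-in for the helper above; nothing here depends on Chef.
module SensuHelperSketch
  # Hypothetical: extracts "apache_recovery" from "sensu_check[apache_recovery]".
  # A bare name such as "apache_recovery" is returned unchanged.
  def self.resource_name_from(resource)
    text = resource.to_s
    match = text.match(/\A[^\[\]]+\[(.+)\]\z/)
    match ? match[1] : text
  end

  # allowed: names of the checks permitted on the current server.
  def self.check_valid?(resource, allowed)
    raise ArgumentError, 'No resource name parameter was passed' if resource.nil?

    allowed.include?(resource_name_from(resource))
  end
end

SensuHelperSketch.check_valid?('sensu_check[apache_recovery]', %w[apache apache_recovery])
```

Raising on a missing argument mirrors the fatal error in the helper above, while keeping the sketch testable outside a chef-client run.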