Purple-Devs / health_check

Simple health check of Rails app for use with uptime checking sites like newrelic and pingdom
MIT License

on_failure callbacks not working as expected #144

Open adrys-lab opened 6 months ago

adrys-lab commented 6 months ago

Hello,

We would like better visibility in our Datadog logs into when a health check fails and why it failed.

To that end, we added an on_failure callback that logs failures at error level, but its output never appears in our logs.

Additionally, the only log entries we see are the 'health_check failed' message configured in config.failure 🤔 And even though we set config.log_level = :error, that message is still logged at info level 🤔


Would you mind helping us find the root cause?

gems:

health_check (3.1.0)
ruby (3.1.4)
puma (6.3.1)
rails (7.1.3)

health_check initializer:

# frozen_string_literal: true

require File.expand_path('../../app/lib/redis_client_factory', __dir__)

HealthCheck.setup do |config|
  config.redis_url = RedisClientFactory::REDIS_URL

  config.success = 'Site checked'
  config.failure = 'health_check failed'

  config.include_error_in_response_body = true
  config.log_level = :error

  config.http_status_for_error_text = 503   # Service Unavailable
  config.http_status_for_error_object = 503 # Service Unavailable

  config.standard_checks = %w(
    database
    migrations
    cache
    sidekiq-redis
    shutting-down
    elasticsearch
    site
  )

  config.full_checks = %w(
    database
    migrations
    cache
    sidekiq-redis
    shutting-down
    elasticsearch
    site
  )

  config.add_custom_check('shutting-down') do
    PumaClusterOverride::WorkerStatus.shutting_down? ? 'Shutting down' : ''
  end

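  config.on_success do
    # Drop the Datadog trace for successful checks; active_trace can be nil
    # when no trace is active, so use safe navigation.
    Datadog::Tracing.active_trace&.reject!
  end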

  config.on_failure do |checks, msg|
    Rails.logger.error do
      {
        event: "healthcheck",
        message: "Healthcheck Failure",
        code: 503,
        args: {
          message: "Failed to perform health checks",
          healthcheck_message: msg,
          checks: checks
        },
        domain: 'web_internals'
      }
    end
  end
end
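
One way to narrow this down is to run the logging block from on_failure in isolation and confirm that it emits an ERROR-level line on its own. The following is a minimal sketch, not a diagnosis: it uses Ruby's standard Logger as a stand-in for Rails.logger, the failure_payload helper and the sample arguments are hypothetical, and health_check itself is not involved.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper mirroring the hash built in the on_failure callback above.
def failure_payload(checks, msg)
  {
    event: 'healthcheck',
    message: 'Healthcheck Failure',
    code: 503,
    args: {
      message: 'Failed to perform health checks',
      healthcheck_message: msg,
      checks: checks
    },
    domain: 'web_internals'
  }
end

# Stand-in for Rails.logger: a stdlib Logger writing to an in-memory buffer.
log_output = StringIO.new
logger = Logger.new(log_output)
logger.level = Logger::ERROR

# Same call shape as in the initializer (block form of Logger#error).
# Sample arguments are made up for illustration.
logger.error { failure_payload('database', 'health_check failed') }

# An info-level message is filtered out at this logger level.
logger.info { 'health_check failed' }

puts log_output.string
```

If the block logs as expected here but nothing appears when a check fails in the application, the callback is probably not being invoked, which points at the gem or its configuration rather than at the logging code.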

ArmandoAssuncao commented 4 months ago

I opened a PR that fixes this: https://github.com/Purple-Devs/health_check/pull/146