openresty / lua-resty-upstream-healthcheck

Health Checker for Nginx Upstream Servers in Pure Lua

Log messages for multiple upstreams with same servers, but different hostnames #70

Open ecc256 opened 4 years ago

ecc256 commented 4 years ago

Guys, I’ve followed the module's recommendation for multiple upstreams. The error log has messages like:

[error] 193#193: *118495174 [lua] healthcheck.lua:53: errlog(): healthcheck: failed to receive status line from 10.0.0.1:80: timeout, context: ngx.timer
[error] 190#190: *118495179 [lua] healthcheck.lua:53: errlog(): healthcheck: failed to receive status line from 10.0.0.5:80: timeout, context: ngx.timer

How do I tell which upstream they belong to?

The full setup is described here; a snippet is below:

upstream one.abc.com_80 {
                server 10.0.0.1:80;
                server 10.0.0.2:80;
                ...
                server 10.0.0.8:80;
}

upstream two.abc.com_80 {
                server 10.0.0.1:80;
                server 10.0.0.2:80;
                ...
                server 10.0.0.8:80;
}

init_worker.lua

local servers = { 
    "one.abc.com", "two.abc.com", ...
}

local hc = require "resty.upstream.healthcheck"

local function checker(upstream, server_name)
    local ok, err = hc.spawn_checker{
        shm = "healthcheck",  -- defined by "lua_shared_dict"
        upstream = upstream,
        type = "http",

        http_req = "GET /HealthCheck/Health.ashx HTTP/1.0\r\nHost: " .. server_name .. "\r\n\r\n", -- raw HTTP request for checking
        interval = 2000, -- run the check cycle every 2 sec
        timeout = 1000, -- 1 sec is the timeout for network operations
        fall = 3, -- # of successive failures before turning a peer down
        rise = 2, -- # of successive successes before turning a peer up
        valid_statuses = {200}, -- a list of valid HTTP status codes
        concurrency = 10, -- concurrency level for test requests
    }

    if not ok then
        ngx.log(ngx.ERR, "failed to spawn health checker: ", err)
    end

    return ok
end

local function main()
    for _, server in ipairs(servers) do
        checker(server .. "_80", server)
    end
end

main()

ecc256 commented 4 years ago

And a follow up question:

[warn] 188#188: *118566393 [lua] healthcheck.lua:49: warn(): healthcheck: peer 10.0.0.5:80 is turned down after 3 failure(s), context: ngx.timer

What does the message mean exactly? Is 10.0.0.5:80 turned down for all upstreams, or only for the single upstream whose health check failed?

spacewander commented 4 years ago

Failures are counted per key, and the key is created by gen_peer_key from peer.name, the upstream, the peer type, and more. So I think the turn down is for the single upstream.
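
To illustrate why a key that includes the upstream name keeps the state separate, here is a minimal sketch in plain Lua. The `peer_key` and `record_failure` helpers and the `fails` table are hypothetical stand-ins for illustration only; they are not the library's actual `gen_peer_key` or its shared-dict storage, and the peer id `4` is made up.

```lua
-- Hypothetical stand-in for the library's key generation: the key
-- combines the upstream name, the peer type and the peer id.
local function peer_key(upstream, is_backup, id)
    return upstream .. (is_backup and ":b" or ":p") .. id
end

-- Plain table standing in for the shared dict that holds the counters.
local fails = {}

-- Increment and return the failure count for one peer in one upstream.
local function record_failure(upstream, is_backup, id)
    local key = peer_key(upstream, is_backup, id)
    fails[key] = (fails[key] or 0) + 1
    return fails[key]
end

-- The same server (made-up peer id 4) fails three times,
-- but only while being checked as part of the first upstream.
for _ = 1, 3 do
    record_failure("one.abc.com_80", false, 4)
end

local one_count = fails[peer_key("one.abc.com_80", false, 4)] or 0
local two_count = fails[peer_key("two.abc.com_80", false, 4)] or 0
print(one_count, two_count)
```

Because the upstream name is part of the key, the counter for the same address under two.abc.com_80 is untouched, which matches the per-upstream behaviour described above.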

The error message doesn't contain the upstream info. Can you submit a PR to improve it?
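
As a rough idea of what such an improvement could look like, the sketch below builds a log message that names the upstream next to the peer. The `peer_error` helper and the exact message layout are assumptions made for illustration; this is not the format used by the library or by any actual PR.

```lua
-- Hypothetical helper: format a health check error so that it names
-- both the peer and the upstream the check was running for.
local function peer_error(upstream, peer_name, what, err)
    return string.format("healthcheck: %s from %s (upstream %s): %s",
                         what, peer_name, upstream, err)
end

local msg = peer_error("one.abc.com_80", "10.0.0.1:80",
                       "failed to receive status line", "timeout")
print(msg)
```

With the upstream in the message, two identical server addresses in different upstreams can be told apart directly in the error log.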

ecc256 commented 4 years ago

> So I think the turn down is for the single upstream.

Yep, it’s for the single upstream; status_page() confirms it.

local hc = require "resty.upstream.healthcheck"
ngx.print(hc.status_page())
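
To check a specific peer across upstreams, the status text can be scanned with a few lines of plain Lua. This is only a sketch: it assumes the page layout shown in the module's README (an "Upstream NAME" line followed by indented "ADDRESS UP" or "ADDRESS DOWN" lines), the `upstreams_with_peer_down` helper is hypothetical, and the sample text is made up rather than real output.

```lua
-- Made-up sample in the assumed layout; not real output.
local sample = [[
Upstream one.abc.com_80
    Primary Peers
        10.0.0.1:80 UP
        10.0.0.5:80 DOWN
    Backup Peers

Upstream two.abc.com_80
    Primary Peers
        10.0.0.1:80 UP
        10.0.0.5:80 UP
    Backup Peers
]]

-- Return the names of the upstreams in which `peer` is marked DOWN.
local function upstreams_with_peer_down(page, peer)
    local down = {}
    local current
    for line in page:gmatch("[^\n]+") do
        local name = line:match("^Upstream%s+(%S+)")
        if name then
            -- Remember which upstream the following peer lines belong to.
            current = name
        else
            -- Peer lines look like "<address> <STATE>" with an uppercase state.
            local p, state = line:match("^%s+(%S+)%s+(%u+)%s*$")
            if p == peer and state == "DOWN" and current then
                down[#down + 1] = current
            end
        end
    end
    return down
end

local result = upstreams_with_peer_down(sample, "10.0.0.5:80")
print(table.concat(result, ", "))
```

In a real setup the `sample` string would be replaced by the return value of hc.status_page().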

> The error message doesn't contain the upstream info. Can you submit a PR to improve it?

Yep, just did.