vozlt / nginx-module-vts

Nginx virtual host traffic status module
BSD 2-Clause "Simplified" License

upstream status not correct #258

Closed FrancisAston closed 1 year ago

FrancisAston commented 1 year ago

nginx version: openresty/1.19.9.1 nginx-module-vts-0.2.1/

The server is already down, but the status page still shows it as up.

[screenshot: nginx_upstream_check_module status page]

[screenshot: nginx-module-vts status page]

u5surf commented 1 year ago

@FrancisAston Hi, thanks for reporting! Can you try changing the display format to json once? The state is driven by upstream_server_node.down, which is updated in the code below, but that path might not be called when the format is html.

https://github.com/vozlt/nginx-module-vts/blob/5b3812e7566f778b943a2953ab6737f20a695c57/src/ngx_http_vhost_traffic_status_display_json.c#L603-L616

I'm not yet sure whether this is a bug or not, but it's worth digging into. I'm willing to investigate it.
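For reference, switching the display to JSON only takes the display_format directive (a minimal sketch; the /status/ path and hostname are just examples, matching the config shown later in this thread):

    location /status/ {
        vhost_traffic_status_display;
        vhost_traffic_status_display_format json;
    }

Then something like curl -s http://localhost/status/ should return the upstreamZones data as JSON.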

FrancisAston commented 1 year ago

The 127.0.0.1:8540 server is down, but the response JSON data still says "down": false. I'm not a developer, please help. Thanks a lot.

"ali-oss": [
  {
    "server": "127.0.0.1:8540",
    "requestCounter": 0,
    "inBytes": 0,
    "outBytes": 0,
    "responses": { "1xx": 0, "2xx": 0, "3xx": 0, "4xx": 0, "5xx": 0 },
    "requestMsecCounter": 0,
    "requestMsec": 0,
    "requestMsecs": { "times": [], "msecs": [] },
    "requestBuckets": { "msecs": [], "counters": [] },
    "responseMsecCounter": 0,
    "responseMsec": 0,
    "responseMsecs": { "times": [], "msecs": [] },
    "responseBuckets": { "msecs": [], "counters": [] },
    "weight": 1,
    "maxFails": 3,
    "failTimeout": 5,
    "backup": false,
    "down": false,
    "overCounts": { "maxIntegerSize": 18446744073709551615, "requestCounter": 0, "inBytes": 0, "outBytes": 0, "1xx": 0, "2xx": 0, "3xx": 0, "4xx": 0, "5xx": 0, "requestMsecCounter": 0, "responseMsecCounter": 0 }
  }
],
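A quick way to pull just that flag out of the status JSON (assuming the module is exposed at /status/ and the upstream group is named ali-oss, as above) would be:

    % curl -s http://localhost/status/ | jq '.upstreamZones."ali-oss"[0].down'
    false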

u5surf commented 1 year ago

@FrancisAston Is the module below the one you are using together with the vts module? https://github.com/yaoweibin/nginx_upstream_check_module We would also appreciate it if you could share an nginx.conf that reproduces this behavior.

u5surf commented 1 year ago

@FrancisAston I've investigated and I think I've found the cause of this issue. Can you check whether the zone directive is active in the upstream block of your nginx.conf? e.g.

http {
...
    vhost_traffic_status_zone;
    upstream web {
            zone backend 64k; # <- here
            check interval=5000 rise=1 fall=3 timeout=4000;
            server 127.0.0.1:9000;
    }
    server {
        listen       8000;
        location / {
            proxy_pass http://web;
        }
    }
    server {
        listen       80;
        location /status/ {
            vhost_traffic_status_display;
            vhost_traffic_status_display_format json; #prometheus;
        }
    }
}

cf. http://nginx.org/en/docs/http/ngx_http_upstream_module.html#zone

If the directive is inactive, the upstream alive-status check I referred to in the #258 (comment) above can never pass, because execution jumps out before reaching it. https://github.com/vozlt/nginx-module-vts/blob/5b3812e7566f778b943a2953ab6737f20a695c57/src/ngx_http_vhost_traffic_status_display_json.c#L573-L575
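Concretely, "inactive" just means the upstream block has no zone directive, e.g. (a minimal illustration based on the config above):

    upstream web {
        # no "zone" directive, so the upstream group has no shared memory zone,
        # uscf->shm_zone stays NULL, and "down" stays false in the vts output
        check interval=5000 rise=1 fall=3 timeout=4000;
        server 127.0.0.1:9000;
    }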

As a test, after activating the zone directive, the down status correctly returns true when the upstream is dead.

result

% curl -s http://localhost/status/ | jq '.upstreamZones.web[0].down'
true
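For comparison, without the zone directive the same query keeps returning false (the behavior reported in this issue) even though the backend is dead:

    % curl -s http://localhost/status/ | jq '.upstreamZones.web[0].down'
    false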

gdb info:

Breakpoint 1, ngx_http_vhost_traffic_status_display_set_upstream_group (r=0x558bb9bb7180, buf=0x558bb9c094fe "") at ../nginx-module-vts/src/ngx_http_vhost_traffic_status_display_json.c:573
573                 if (uscf->shm_zone == NULL) {
(gdb) p uscf->shm_zone
$1 = (ngx_shm_zone_t *) 0x558bb9bc4320
(gdb) c

FrancisAston commented 1 year ago

http {
...
    vhost_traffic_status_zone;
    upstream web {
            zone backend 64k; # <- here
            check interval=5000 rise=1 fall=3 timeout=4000;
            server 127.0.0.1:9000;
    }

I added zone backend 64k; and reloaded nginx. It works. Thanks a lot.
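For completeness, the config test and reload that pick up the change (assuming the stock nginx binary is on the PATH; an openresty install may use its own binary or service script instead):

    % nginx -t && nginx -s reload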