Open kworkbee opened 9 months ago
Can you retrieve the health check information?
curl -i http://127.0.0.1:9090/v1/healthcheck
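If it helps, here is a minimal sketch for filtering the output of that endpoint. It assumes the Control API returns a JSON array in which each entry has a `name` and a `nodes` list, and each node has `ip`, `port`, and `status` fields; that shape is an assumption and may differ between APISIX versions. The helper names `fetch_report` and `unhealthy_nodes` and the sample data are hypothetical.

```python
import json
import urllib.request


def fetch_report(url="http://127.0.0.1:9090/v1/healthcheck"):
    """Fetch and decode the health check report from the Control API."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def unhealthy_nodes(report):
    """Return (checker name, ip, port, status) for every node whose status
    is anything other than "healthy" (assumed response shape, see above)."""
    found = []
    for checker in report:
        for node in checker.get("nodes", []):
            if node.get("status") != "healthy":
                found.append(
                    (checker.get("name"), node.get("ip"),
                     node.get("port"), node.get("status"))
                )
    return found


# Hypothetical sample shaped like the assumed response; not real output.
sample = json.loads("""
[
  {
    "name": "/apisix/upstreams/example",
    "type": "https",
    "nodes": [
      {"ip": "10.0.0.1", "port": 443, "status": "healthy"},
      {"ip": "10.0.0.2", "port": 443, "status": "unhealthy"}
    ]
  }
]
""")

for name, ip, port, status in unhealthy_nodes(sample):
    print(f"{name}: {ip}:{port} is {status}")
```

On a live instance, `unhealthy_nodes(fetch_report())` would apply the same filter to the real response.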
Does this bug exist even if you don't use the ingress controller?
@hanqingwu The node that should be Unhealthy is marked Healthy.
The log shows the following (failed SSL handshake):
2024/03/11 06:46:31 [error] 50#50: *4510567 [lua] healthcheck.lua:1383: log(): [healthcheck] (upstream#/apisix/upstreams/23eb23c7) failed SSL handshake with 'X.X.X.X (X.X.X.X:443)', using server name (sni) 'svc02.corp.com': 19: self-signed certificate in certificate chain, context: ngx.timer, client: X.X.X.X, server: 0.0.0.0:9080
@shreemaan-abhishek The same symptom appears even when the ingress controller is not deployed.
@kworkbee please share repro steps for apisix.
@shreemaan-abhishek I would like to apply it in the following form.
APISIX is installed in the tools cluster with the Helm chart, and the ApisixRoute / ApisixUpstream objects are deployed as written in the description above.
I want to route traffic 50:50 between the clusters, and when one cluster fails, shift its weight to the remaining cluster.
However, despite the upstream health check setting, the upstream that is currently returning 503 is not automatically excluded.
The relevant entries from the APISIX log are as follows.
2024/03/18 11:46:03 [error] 49#49: *14808 [lua] healthcheck.lua:1383: log(): [healthcheck] (upstream#/apisix/upstreams/32eb11c7) failed SSL handshake with 'X.X.X.X (X.X.X.X:443)', using server name (sni) 'svc01.corp.com': 19: self-signed certificate in certificate chain, context: ngx.timer, client: X.X.X.X, server: 0.0.0.0:9080
2024/03/18 11:46:06 [warn] 49#49: *14846 [lua] balancer.lua:82: fetch_health_nodes(): failed to get health check target status, addr: X.X.X.X:443, host: nil, err: target not found, client: X.X.X.X, server: _, request: "POST /feature-flags/flagd.evaluation.v1.Service/ResolveBoolean HTTP/1.1", host: "kubernetes.corp.com"
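For anyone sifting through many such entries, here is a minimal sketch that pulls the upstream ID, SNI, and error out of a "failed SSL handshake" line. The regular expression is written against the format of the lines quoted in this thread only; other log formats are not covered, and the `parse_handshake_failure` helper is hypothetical.

```python
import re

# Matches the healthcheck "failed SSL handshake" entries quoted in this thread.
HANDSHAKE_FAILURE = re.compile(
    r"\(upstream#(?P<upstream>[^)]+)\) failed SSL handshake with "
    r"'(?P<target>[^']+)', using server name \(sni\) '(?P<sni>[^']+)': "
    r"(?P<code>\d+): (?P<reason>[^,]+),"
)


def parse_handshake_failure(line):
    """Return a dict of the matched fields, or None if the line does not match."""
    match = HANDSHAKE_FAILURE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["code"] = int(fields["code"])
    return fields


line = (
    "2024/03/18 11:46:03 [error] 49#49: *14808 [lua] healthcheck.lua:1383: "
    "log(): [healthcheck] (upstream#/apisix/upstreams/32eb11c7) failed SSL "
    "handshake with 'X.X.X.X (X.X.X.X:443)', using server name (sni) "
    "'svc01.corp.com': 19: self-signed certificate in certificate chain, "
    "context: ngx.timer, client: X.X.X.X, server: 0.0.0.0:9080"
)

parsed = parse_handshake_failure(line)
print(parsed)
```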
19: self-signed certificate in certificate chain
Does that matter?
@shreemaan-abhishek Can you please take a look?
Current Behavior
Same as apache/apisix-ingress-controller#2176.
The unhealthy external service continues to receive traffic instead of being excluded from the routing targets.
Expected Behavior
Two external services (each with an ALB in front of it) are configured as upstream nodes. When the configured health check detects 5XX errors from a node, that node should be temporarily excluded from routing.
Error Logs
No response
Steps to Reproduce
To reproduce the issue, one service is deployed and the other is not (only the ALBs are set up).
Environment
APISIX Ingress controller version (run apisix-ingress-controller version --long)
Kubernetes cluster version (run kubectl version)
OS version if running APISIX Ingress controller in a bare-metal environment (run uname -a)
Runs on an AWS EKS Cluster (Kubernetes v1.25). Uses APISIX Helm Chart (1.11.0, App 3.8.0).