Closed: psrin7 closed this issue 6 years ago.
Envoy expects a 200 status code in the response for the health check to succeed. See Architecture Overview: Health checking.
Your example curl is returning a 301.
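To expand on that: the HTTP health checker treats only a 200 as healthy by default, so a 301 (for example an HTTP-to-HTTPS redirect) counts as a failure. A minimal health check stanza for a cluster might look like the following sketch; the path and timing values here are illustrative, not taken from the original report, and field names follow the v2 API current at the time of this issue:

```yaml
health_checks:
  - timeout: 1s
    interval: 5s
    unhealthy_threshold: 3
    healthy_threshold: 2
    http_health_check:
      path: /health   # this path must return 200, not a redirect
```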
I have also tested with health check returning 200. It made no difference.
What intrigues me is that the health check event shows the address as 0.0.0.0 and the port as 0; it isn't using what is defined in the cluster's socket_address. It also reports failure_type as NETWORK, which isn't very helpful. Thanks.
Here is the health check event log:
{"health_checker_type":"HTTP","host":{"socket_address":{"protocol":"TCP","address":"0.0.0.0","resolver_name":"","ipv4_compat":false,"port_value":0}},"cluster_name":"sample-cluster","eject_unhealthy_event":{"failure_type":"NETWORK"}}
@psrin7, can you try this config? Hook it up to a service that serves a /health URI:
node:
  id: nodexxx
  cluster: dc1
admin:
  access_log_path: /tmp/admin_access.log
  address:
    socket_address: { address: 0.0.0.0, port_value: 8001 }
static_resources:
  listeners:
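The snippet above cuts off at static_resources. For anyone trying to reproduce it, a minimal continuation along the same lines might look like the sketch below; the listener address, upstream IP, and port are placeholders I've made up, not values from the original comment, and it uses the v2-era config format from this thread:

```yaml
static_resources:
  listeners:
    - name: listener_0
      address:
        socket_address: { address: 0.0.0.0, port_value: 10000 }
      filter_chains:
        - filters:
            - name: envoy.http_connection_manager
              config:
                stat_prefix: ingress_http
                route_config:
                  name: local_route
                  virtual_hosts:
                    - name: local_service
                      domains: ["*"]
                      routes:
                        - match: { prefix: "/" }
                          route: { cluster: sample-cluster }
                http_filters:
                  - name: envoy.router
  clusters:
    - name: sample-cluster
      connect_timeout: 5s
      type: STATIC               # STATIC requires a literal IP address
      lb_policy: ROUND_ROBIN
      hosts:
        - socket_address: { address: 10.0.0.5, port_value: 8080 }
      health_checks:
        - timeout: 1s
          interval: 5s
          unhealthy_threshold: 3
          healthy_threshold: 2
          http_health_check:
            path: /health
```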
@vgomprakash - Thanks for your input. The above configuration works. The caveat is that the cluster discovery type is STATIC, so you need to provide an IP address. But I need to use DNS. Since I am using DNS for the host addresses, I have to use STRICT_DNS or LOGICAL_DNS. When I change the type to STRICT_DNS, active health checking works. But it isn't working with LOGICAL_DNS as the type.
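For anyone hitting the same problem, the working variant would be a cluster along these lines, where STRICT_DNS resolves the DNS name to the full set of addresses and health checks each resolved host individually. The hostname, port, and health check values below are placeholders for illustration:

```yaml
clusters:
  - name: sample-cluster
    connect_timeout: 5s
    type: STRICT_DNS             # resolves each DNS result to a real host
    lb_policy: ROUND_ROBIN
    hosts:
      - socket_address: { address: some-domain.com, port_value: 80 }
    health_checks:
      - timeout: 1s
        interval: 5s
        unhealthy_threshold: 3
        healthy_threshold: 2
        http_health_check:
          path: /health
```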
I can confirm that switching from logical_dns to strict_dns resulted in my active health checks working.
The behaviour I observed is that Envoy does not send any health check packets at all when set to logical_dns. I suspect this is because it resolves the hosts as 0.0.0.0, which is what shows up in the stats endpoint on the Envoy admin UI and in the debug logs. This may be a symptom of what's going on.
Active healthcheck for cluster is failing
Active healthcheck is failing even though I am getting a response from the upstream. Not sure if I am missing anything in the configuration.
Admin and Stats Output:
cluster.sample-cluster.bind_errors: 0
cluster.sample-cluster.external.upstream_rq_200: 11
cluster.sample-cluster.external.upstream_rq_2xx: 11
cluster.sample-cluster.external.upstream_rq_301: 1
cluster.sample-cluster.external.upstream_rq_3xx: 1
cluster.sample-cluster.health_check.attempt: 12
cluster.sample-cluster.health_check.failure: 12
cluster.sample-cluster.health_check.healthy: 0
cluster.sample-cluster.health_check.network_failure: 12
cluster.sample-cluster.health_check.passive_failure: 0
cluster.sample-cluster.health_check.success: 0
cluster.sample-cluster.health_check.verify_cluster: 0
cluster.sample-cluster.lb_healthy_panic: 12
cluster.sample-cluster.lb_local_cluster_not_ok: 0
cluster.sample-cluster.lb_recalculate_zone_structures: 0
cluster.sample-cluster.lb_subsets_active: 0
cluster.sample-cluster.lb_subsets_created: 0
cluster.sample-cluster.lb_subsets_fallback: 0
cluster.sample-cluster.lb_subsets_removed: 0
cluster.sample-cluster.lb_subsets_selected: 0
cluster.sample-cluster.lb_zone_cluster_too_small: 0
cluster.sample-cluster.lb_zone_no_capacity_left: 0
cluster.sample-cluster.lb_zone_number_differs: 0
cluster.sample-cluster.lb_zone_routing_all_directly: 0
cluster.sample-cluster.lb_zone_routing_cross_zone: 0
cluster.sample-cluster.lb_zone_routing_sampled: 0
cluster.sample-cluster.max_host_weight: 0
cluster.sample-cluster.membership_change: 1
cluster.sample-cluster.membership_healthy: 0
cluster.sample-cluster.membership_total: 1
Config:
Logs:
[2018-07-19 20:54:09.457][19][info][main] source/server/server.cc:183] initializing epoch 0 (hot restart version=10.200.16384.127.options=capacity=16384, num_slots=8209 hash=228984379728933363 size=2654312)
[2018-07-19 20:54:09.457][19][info][main] source/server/server.cc:185] statically linked extensions:
[2018-07-19 20:54:09.457][19][info][main] source/server/server.cc:187] access_loggers: envoy.file_access_log,envoy.http_grpc_access_log
[2018-07-19 20:54:09.457][19][info][main] source/server/server.cc:190] filters.http: envoy.buffer,envoy.cors,envoy.ext_authz,envoy.fault,envoy.filters.http.header_to_metadata,envoy.filters.http.jwt_authn,envoy.filters.http.rbac,envoy.grpc_http1_bridge,envoy.grpc_json_transcoder,envoy.grpc_web,envoy.gzip,envoy.health_check,envoy.http_dynamo_filter,envoy.ip_tagging,envoy.lua,envoy.rate_limit,envoy.router,envoy.squash
[2018-07-19 20:54:09.457][19][info][main] source/server/server.cc:193] filters.listener: envoy.listener.original_dst,envoy.listener.proxy_protocol,envoy.listener.tls_inspector
[2018-07-19 20:54:09.457][19][info][main] source/server/server.cc:196] filters.network: envoy.client_ssl_auth,envoy.echo,envoy.ext_authz,envoy.filters.network.thrift_proxy,envoy.http_connection_manager,envoy.mongo_proxy,envoy.ratelimit,envoy.redis_proxy,envoy.tcp_proxy
[2018-07-19 20:54:09.458][19][info][main] source/server/server.cc:198] stat_sinks: envoy.dog_statsd,envoy.metrics_service,envoy.stat_sinks.hystrix,envoy.statsd
[2018-07-19 20:54:09.458][19][info][main] source/server/server.cc:200] tracers: envoy.dynamic.ot,envoy.lightstep,envoy.zipkin
[2018-07-19 20:54:09.458][19][info][main] source/server/server.cc:203] transport_sockets.downstream: envoy.transport_sockets.capture,raw_buffer,tls
[2018-07-19 20:54:09.458][19][info][main] source/server/server.cc:206] transport_sockets.upstream: envoy.transport_sockets.capture,raw_buffer,tls
[2018-07-19 20:54:09.463][19][debug][main] source/server/server.cc:234] admin address: 0.0.0.0:9901
[2018-07-19 20:54:09.464][19][info][config] source/server/configuration_impl.cc:50] loading 0 static secret(s)
[2018-07-19 20:54:09.465][22][debug][grpc] source/common/grpc/google_async_client_impl.cc:39] completionThread running
[2018-07-19 20:54:09.466][19][debug][upstream] source/common/upstream/cluster_manager_impl.cc:707] adding TLS initial cluster sample-cluster
[2018-07-19 20:54:09.466][19][debug][upstream] source/common/upstream/logical_dns_cluster.cc:70] starting async DNS resolution for some-domain.com
[2018-07-19 20:54:09.466][19][debug][upstream] source/common/network/dns_impl.cc:147] Setting DNS resolution timer for 5000 milliseconds
[2018-07-19 20:54:09.466][19][debug][upstream] source/common/upstream/cluster_manager_impl.cc:61] cm init: adding: cluster=sample-cluster primary=1 secondary=0
[2018-07-19 20:54:09.466][19][info][config] source/server/configuration_impl.cc:60] loading 1 listener(s)
[2018-07-19 20:54:09.466][19][debug][config] source/server/configuration_impl.cc:62] listener #0:
[2018-07-19 20:54:09.466][19][debug][config] source/server/listener_manager_impl.cc:528] begin add/update listener: name=listener_0 hash=16491985507912357005
[2018-07-19 20:54:09.466][19][debug][config] source/server/listener_manager_impl.cc:38] filter #0:
[2018-07-19 20:54:09.466][19][debug][config] source/server/listener_manager_impl.cc:39] name: envoy.http_connection_manager
[2018-07-19 20:54:09.466][19][debug][config] source/server/listener_manager_impl.cc:42] config: {"http_filters":[{"name":"envoy.router"}],"route_config":{"virtual_hosts":[{"name":"local_service","domains":["*"],"routes":[{"match":{"prefix":"/"},"route":{"cluster":"sample-cluster"}}]}],"name":"local_route"},"stat_prefix":"ingress_http","codec_type":null}
[2018-07-19 20:54:09.468][19][debug][config] source/extensions/filters/network/http_connection_manager/config.cc:279] http filter #0
[2018-07-19 20:54:09.468][19][debug][config] source/extensions/filters/network/http_connection_manager/config.cc:280] name: envoy.router
[2018-07-19 20:54:09.468][19][debug][config] source/extensions/filters/network/http_connection_manager/config.cc:284] config: {}
[2018-07-19 20:54:09.468][19][debug][config] source/server/listener_manager_impl.cc:414] add active listener: name=listener_0, hash=16491985507912357005, address=0.0.0.0:10000
[2018-07-19 20:54:09.468][19][info][config] source/server/configuration_impl.cc:94] loading tracing configuration
[2018-07-19 20:54:09.468][19][info][config] source/server/configuration_impl.cc:116] loading stats sink configuration
[2018-07-19 20:54:09.468][19][info][main] source/server/server.cc:410] starting main dispatch loop
[2018-07-19 20:54:09.468][19][debug][upstream] source/common/upstream/logical_dns_cluster.cc:78] async DNS resolution complete for some-domain.com
[2018-07-19 20:54:09.468][19][debug][client] source/common/http/codec_client.cc:25] [C0] connecting
[2018-07-19 20:54:09.468][19][debug][connection] source/common/network/connection_impl.cc:570] [C0] connecting to 0.0.0.0:0
[2018-07-19 20:54:09.468][19][debug][connection] source/common/network/connection_impl.cc:579] [C0] connection in progress
[2018-07-19 20:54:09.472][19][debug][connection] source/common/network/connection_impl.cc:475] [C0] delayed connection error: 111
[2018-07-19 20:54:09.472][19][debug][connection] source/common/network/connection_impl.cc:133] [C0] closing socket: 0
[2018-07-19 20:54:09.472][19][debug][client] source/common/http/codec_client.cc:81] [C0] disconnect. resetting 1 pending requests
[2018-07-19 20:54:09.472][19][debug][client] source/common/http/codec_client.cc:104] [C0] request reset
[2018-07-19 20:54:09.472][19][debug][hc] source/common/upstream/health_checker_impl.cc:170] [C0] connection/stream error health_flags=healthy
[2018-07-19 20:54:09.472][19][debug][upstream] source/common/upstream/cluster_manager_impl.cc:844] membership update for TLS cluster sample-cluster
{"health_checker_type":"HTTP","host":{"socket_address":{"protocol":"TCP","address":"0.0.0.0","resolver_name":"","ipv4_compat":false,"port_value":0}},"cluster_name":"sample-cluster","eject_unhealthy_event":{"failure_type":"NETWORK"}}
[2018-07-19 20:54:09.472][19][debug][upstream] source/common/upstream/cluster_manager_impl.cc:89] cm init: init complete: cluster=sample-cluster primary=0 secondary=0
curl -v http://some-domain.com:80
About to connect() to some-domain.com port 80 (#0)
Trying 148.x.x.x...
Connected to some-domain.com (148.x.x.x) port 80 (#0)
Connection #0 to host some-domain.com left intact