hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

Ingress-gateway CrashLoopBackOff on K8S cluster with existing proxy-defaults #8549

Closed tejnar closed 4 years ago

tejnar commented 4 years ago


Overview of the Issue

I have a Consul client running on a K8S cluster and the Consul Server running outside of K8S. A proxy-defaults entry already exists in the Consul configuration. When an ingress-gateway pod dies or is moved to a different node, it goes into CrashLoopBackOff.

I have to delete the proxy-defaults and service-router configuration entries to make the ingress-gateway work again.

Reproduction Steps

Steps to reproduce this issue:

  1. Have Consul Server outside of K8S.
  2. Install consul-client on K8S with connect enabled.
  3. Create a Consul proxy-defaults configuration file, proxy-defaults.json:

         {
           "Kind": "proxy-defaults",
           "Name": "global",
           "Config": {
             "envoy_prometheus_bind_addr": "0.0.0.0:1999"
           },
           "MeshGateway": {},
           "Expose": {},
           "CreateIndex": 92272683,
           "ModifyIndex": 92272683
         }

     Then write the file above to Consul:

         consul config write proxy-defaults.json

  4. Delete the consul-ingress-gateway pod in the K8S cluster.
  5. Check the pod status:

         kubectl get pods
         consul-consul-adh-acp-ingress-gateway-76f9b49dcc-hcjlk   1/2   CrashLoopBackOff   4   2m59s
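The last reproduction step can be checked without reading the output by eye. The following is a minimal sketch, assuming the default `kubectl get pods` column order (NAME, READY, STATUS, RESTARTS, AGE); the function name `crash_looping_pods` is hypothetical and not part of any Consul or Kubernetes tooling.

```python
def crash_looping_pods(kubectl_output):
    """Return the names of pods whose STATUS column is CrashLoopBackOff.

    Assumes the default `kubectl get pods` columns:
    NAME READY STATUS RESTARTS AGE
    """
    names = []
    for line in kubectl_output.splitlines():
        fields = line.split()
        # fields[0] is the pod name, fields[2] is the status column.
        if len(fields) >= 3 and fields[2] == "CrashLoopBackOff":
            names.append(fields[0])
    return names
```

Feeding it the captured text of `kubectl get pods` from the step above would be expected to return the ingress-gateway pod name while the problem is present and an empty list otherwise.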

Consul info for both Client and Server

Server: Consul v1.8.0

$ consul info
agent:
    check_monitors = 2
    check_ttls = 0
    checks = 4
    services = 3
build:
    prerelease =
    revision = 3111cb8c
    version = 1.8.0
consul:
    acl = disabled
    bootstrap = false
    known_datacenters = 1
    leader = false
    leader_addr = XX.XX.XX.XX:8300
    server = true
raft:
    applied_index = 93606715
    commit_index = 93606715
    fsm_pending = 0
    last_contact = 62.466122ms
    last_log_index = 93606715
    last_log_term = 135
    last_snapshot_index = 93592126
    last_snapshot_term = 135
    latest_configuration = [{Suffrage:Voter ID:f5525b44-080f-460b-7328-6ff89825a44b Address:XX.XX.XX.XX:8300} {Suffrage:Voter ID:a817ba56-b909-0208-bbb9-1e2a6e1c0143 Address:XX.XX.XX.XX:8300} {Suffrage:Voter ID:8ec6f40d-c8a1-3c38-89a9-0c377047dee7 Address:XX.XX.XX.XX:8300}]
    latest_configuration_index = 0
    num_peers = 2
    protocol_version = 3
    protocol_version_max = 3
    protocol_version_min = 0
    snapshot_version_max = 1
    snapshot_version_min = 0
    state = Follower
    term = 135
runtime:
    arch = amd64
    cpu_count = 8
    goroutines = 1046
    max_procs = 8
    os = linux
    version = go1.14.4
serf_lan:
    coordinate_resets = 0
    encrypted = true
    event_queue = 0
    event_time = 62
    failed = 9
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 6374
    members = 228
    query_queue = 0
    query_time = 19
serf_wan:
    coordinate_resets = 0
    encrypted = true
    event_queue = 0
    event_time = 1
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 65
    members = 3
    query_queue = 0
    query_time = 1

Client:

$ consul info
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 13
    services = 11
build:
    prerelease =
    revision = 3111cb8c
    version = 1.8.0
consul:
    acl = disabled
    known_servers = 3
    server = false
runtime:
    arch = amd64
    cpu_count = 4
    goroutines = 2650
    max_procs = 4
    os = linux
    version = go1.14.4
serf_lan:
    coordinate_resets = 0
    encrypted = true
    event_queue = 0
    event_time = 62
    failed = 9
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 6374
    members = 228
    query_queue = 0
    query_time = 19

K8S

kubectl version

Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.3", GitCommit:"b3cbbae08ec52a7fc73d334838e18d17e8512749", GitTreeState:"clean", BuildDate:"2019-11-13T11:23:11Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.3", GitCommit:"b3cbbae08ec52a7fc73d334838e18d17e8512749", GitTreeState:"clean", BuildDate:"2019-11-13T11:13:49Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}

Log Fragments

Pod Logs:

kubectl -n monitoring logs -f consul-consul-adh-acp-ingress-gateway-76f9b49dcc-hcjlk -c ingress-gateway
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:255] initializing epoch 0 (hot restart version=disabled)
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:257] statically linked extensions:
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.filters.udp_listener: envoy.filters.udp.dns_filter, envoy.filters.udp_listener.udp_proxy
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.filters.network: envoy.client_ssl_auth, envoy.echo, envoy.ext_authz, envoy.filters.network.client_ssl_auth, envoy.filters.network.direct_response, envoy.filters.network.dubbo_proxy, envoy.filters.network.echo, envoy.filters.network.ext_authz, envoy.filters.network.http_connection_manager, envoy.filters.network.kafka_broker, envoy.filters.network.local_ratelimit, envoy.filters.network.mongo_proxy, envoy.filters.network.mysql_proxy, envoy.filters.network.ratelimit, envoy.filters.network.rbac, envoy.filters.network.redis_proxy, envoy.filters.network.sni_cluster, envoy.filters.network.tcp_proxy, envoy.filters.network.thrift_proxy, envoy.filters.network.zookeeper_proxy, envoy.http_connection_manager, envoy.mongo_proxy, envoy.ratelimit, envoy.redis_proxy, envoy.tcp_proxy
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.thrift_proxy.filters: envoy.filters.thrift.rate_limit, envoy.filters.thrift.router
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.transport_sockets.downstream: envoy.transport_sockets.alts, envoy.transport_sockets.raw_buffer, envoy.transport_sockets.tap, envoy.transport_sockets.tls, raw_buffer, tls
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.stats_sinks: envoy.dog_statsd, envoy.metrics_service, envoy.stat_sinks.dog_statsd, envoy.stat_sinks.hystrix, envoy.stat_sinks.metrics_service, envoy.stat_sinks.statsd, envoy.statsd
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.tracers: envoy.dynamic.ot, envoy.lightstep, envoy.tracers.datadog, envoy.tracers.dynamic_ot, envoy.tracers.lightstep, envoy.tracers.opencensus, envoy.tracers.xray, envoy.tracers.zipkin, envoy.zipkin
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.filters.http: envoy.buffer, envoy.cors, envoy.csrf, envoy.ext_authz, envoy.fault, envoy.filters.http.adaptive_concurrency, envoy.filters.http.aws_lambda, envoy.filters.http.aws_request_signing, envoy.filters.http.buffer, envoy.filters.http.cache, envoy.filters.http.cors, envoy.filters.http.csrf, envoy.filters.http.dynamic_forward_proxy, envoy.filters.http.dynamo, envoy.filters.http.ext_authz, envoy.filters.http.fault, envoy.filters.http.grpc_http1_bridge, envoy.filters.http.grpc_http1_reverse_bridge, envoy.filters.http.grpc_json_transcoder, envoy.filters.http.grpc_stats, envoy.filters.http.grpc_web, envoy.filters.http.gzip, envoy.filters.http.header_to_metadata, envoy.filters.http.health_check, envoy.filters.http.ip_tagging, envoy.filters.http.jwt_authn, envoy.filters.http.lua, envoy.filters.http.on_demand, envoy.filters.http.original_src, envoy.filters.http.ratelimit, envoy.filters.http.rbac, envoy.filters.http.router, envoy.filters.http.squash, envoy.filters.http.tap, envoy.grpc_http1_bridge, envoy.grpc_json_transcoder, envoy.grpc_web, envoy.gzip, envoy.health_check, envoy.http_dynamo_filter, envoy.ip_tagging, envoy.lua, envoy.rate_limit, envoy.router, envoy.squash
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.udp_listeners: raw_udp_listener
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.retry_priorities: envoy.retry_priorities.previous_priorities
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] http_cache_factory: envoy.extensions.http.cache.simple
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.access_loggers: envoy.access_loggers.file, envoy.access_loggers.http_grpc, envoy.access_loggers.tcp_grpc, envoy.file_access_log, envoy.http_grpc_access_log, envoy.tcp_grpc_access_log
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.clusters: envoy.cluster.eds, envoy.cluster.logical_dns, envoy.cluster.original_dst, envoy.cluster.static, envoy.cluster.strict_dns, envoy.clusters.aggregate, envoy.clusters.dynamic_forward_proxy, envoy.clusters.redis
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.dubbo_proxy.filters: envoy.filters.dubbo.router
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.transport_sockets.upstream: envoy.transport_sockets.alts, envoy.transport_sockets.raw_buffer, envoy.transport_sockets.tap, envoy.transport_sockets.tls, raw_buffer, tls
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.dubbo_proxy.route_matchers: default
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.dubbo_proxy.serializers: dubbo.hessian2
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.health_checkers: envoy.health_checkers.redis
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.thrift_proxy.protocols: auto, binary, binary/non-strict, compact, twitter
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.dubbo_proxy.protocols: dubbo
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.grpc_credentials: envoy.grpc_credentials.aws_iam, envoy.grpc_credentials.default, envoy.grpc_credentials.file_based_metadata
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.thrift_proxy.transports: auto, framed, header, unframed
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.resource_monitors: envoy.resource_monitors.fixed_heap, envoy.resource_monitors.injected_resource
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.filters.listener: envoy.filters.listener.http_inspector, envoy.filters.listener.original_dst, envoy.filters.listener.original_src, envoy.filters.listener.proxy_protocol, envoy.filters.listener.tls_inspector, envoy.listener.http_inspector, envoy.listener.original_dst, envoy.listener.original_src, envoy.listener.proxy_protocol, envoy.listener.tls_inspector
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.resolvers: envoy.ip
[2020-08-23 01:14:40.213][1][info][main] [source/server/server.cc:259] envoy.retry_host_predicates: envoy.retry_host_predicates.omit_canary_hosts, envoy.retry_host_predicates.omit_host_metadata, envoy.retry_host_predicates.previous_hosts
[2020-08-23 01:14:40.714][1][warning][misc] [source/common/protobuf/utility.cc:198] Using deprecated option 'envoy.api.v2.listener.Filter.config' from file listener_components.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2020-08-23 01:14:40.714][1][warning][misc] [source/common/protobuf/utility.cc:198] Using deprecated option 'envoy.api.v2.listener.Filter.config' from file listener_components.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2020-08-23 01:14:40.714][1][warning][misc] [source/common/protobuf/utility.cc:198] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cluster.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2020-08-23 01:14:40.714][1][warning][misc] [source/common/protobuf/utility.cc:198] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cluster.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2020-08-23 01:14:40.714][1][warning][misc] [source/common/protobuf/utility.cc:198] Using deprecated option 'envoy.api.v2.Cluster.hosts' from file cluster.proto. This configuration will be removed from Envoy soon. Please see https://www.envoyproxy.io/docs/envoy/latest/intro/deprecated for details.
[2020-08-23 01:14:40.714][1][info][main] [source/server/server.cc:340] admin address: 127.0.0.1:19000
[2020-08-23 01:14:40.715][1][info][main] [source/server/server.cc:459] runtime: layers:

LifeCycle-sidecar Log:

kubectl -n monitoring logs -f consul-consul-adh-acp-ingress-gateway-76f9b49dcc-hcjlk -c lifecycle-sidecar
2020-08-23T01:14:16.711Z [INFO] Command configuration: service-config=/consul/service/service.hcl consul-binary=/consul-bin/consul sync-period=10s log-level=info
2020-08-23T01:14:22.013Z [INFO] successfully synced service: output="Registered service: adh-acp-ingress-gateway"
2020-08-23T01:14:37.410Z [INFO] successfully synced service: output="Registered service: adh-acp-ingress-gateway"
2020-08-23T01:14:51.813Z [INFO] successfully synced service: output="Registered service: adh-acp-ingress-gateway"
2020-08-23T01:15:07.012Z [INFO] successfully synced service: output="Registered service: adh-acp-ingress-gateway"
2020-08-23T01:15:22.210Z [INFO] successfully synced service: output="Registered service: adh-acp-ingress-gateway"
2020-08-23T01:15:37.510Z [INFO] successfully synced service: output="Registered service: adh-acp-ingress-gateway"
2020-08-23T01:15:52.610Z [INFO] successfully synced service: output="Registered service: adh-acp-ingress-gateway"

blake commented 4 years ago

Hi @tejnar,

This bug was fixed in Consul 1.8.1 with https://github.com/hashicorp/consul/pull/8371.

I recommend upgrading to the most recent Consul release, 1.8.3, which contains additional bug fixes.
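To see whether a given agent already includes the fix, the version in the build section of `consul info` can be compared against 1.8.1. This is a minimal sketch, assuming the `consul info` text layout shown earlier in this issue; the helper names `consul_version` and `includes_fix` are hypothetical, and the 1.8.1 threshold comes only from the statement above.

```python
import re

# First release reported in this thread to contain the fix.
FIXED_IN = (1, 8, 1)


def consul_version(info_text):
    """Return the agent version from `consul info` text as a tuple of ints, or None."""
    # Only look after "build:" so the Go runtime version is not picked up first.
    build = info_text.partition("build:")[2]
    match = re.search(r"\bversion\s*=\s*(\d+)\.(\d+)\.(\d+)", build)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def includes_fix(info_text):
    """True if the reported version is at least FIXED_IN."""
    version = consul_version(info_text)
    return version is not None and version >= FIXED_IN
```

Run against the server and client output in this report, both of which show version 1.8.0, the check would come back false.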

tejnar commented 4 years ago

Thanks Blake, for the quick reply.