hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io
Other
28.31k stars 4.42k forks source link

Consul Connect: Sidecar not registering in envoy #6275

Closed xsreality closed 5 years ago

xsreality commented 5 years ago

Issue Description

I am trying to follow the guide for Consul Connect with Envoy and Docker (https://learn.hashicorp.com/consul/developer-mesh/connect-envoy) but in a modified way to make the services run on loopback interface.

My setup is using a single EC2 instance with Ubuntu 18. I created a dummy interface on IP 169.254.1.1 that looks as below:

dummy0: flags=195<UP,BROADCAST,RUNNING,NOARP>  mtu 1500
        inet 169.254.1.1  netmask 255.255.255.255  broadcast 0.0.0.0
        inet6 fe80::9c85:faff:fe93:a5d4  prefixlen 64  scopeid 0x20<link>
        ether 9e:85:fa:93:a5:d4  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 106  bytes 7420 (7.4 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

I have setup dnsmasq to forward queries on 169.254.1.1 port 53 to point to Consul DNS.

cat /etc/dnsmasq.d/consul.conf
server=/consul/169.254.1.1#8600
listen-address=127.0.0.1
listen-address=169.254.1.1

I start single consul server docker container with below command:

docker run --name=consul-server --restart=always -h broker1.consul.demo --net=host -v /home/ubuntu/consul-demo/envoy_demo.hcl:/etc/consul/envoy_demo.hcl consul:1.5.3 agent -server -config-file /etc/consul/envoy_demo.hcl -grpc-port 8502 -client 169.254.1.1 -bind 172.31.9.39 -advertise 172.31.9.39 -bootstrap-expect 1 -ui -log-level debug

The envoy_demo.hcl looks as below:

services {
  name = "client"
  port = 8080
  connect {
    sidecar_service {
      proxy {
        local_service_address = "169.254.1.1"
        upstreams {
          destination_name = "echo"
          local_bind_port = 9191
        }
      }
    }
  }
}
services {
  name = "echo"
  port = 9090
  connect {
    sidecar_service {
      proxy {
        local_service_address = "169.254.1.1"
      }
    }
  }
}

The echo service is started with below docker command. Note the service port 9090 is bound to dummy IP, this is the desired goal to keep the setup secure:

docker run --rm -d --dns 169.254.1.1 --name echo-service --network bridge -p 169.254.1.1:9090:9090 abrarov/tcp-echo --port 9090

The sidecar service for echo is started with below docker command:

docker run --rm -d --dns 169.254.1.1 --name echo-proxy --network host consul-envoy -sidecar-for echo -http-addr http://169.254.1.1:8500 -grpc-addr 169.254.1.1:8502 -- -l trace

This sidecar service is stuck trying to contact consul gRPC server and never gets a response.

Sidecar proxy trace logs:

``` [2019-08-03 18:38:26.583][6][info][main] source/server/server.cc:202] initializing epoch 0 (hot restart version=disabled) [2019-08-03 18:38:26.583][6][info][main] source/server/server.cc:204] statically linked extensions: [2019-08-03 18:38:26.583][6][info][main] source/server/server.cc:206] access_loggers: envoy.file_access_log,envoy.http_grpc_access_log [2019-08-03 18:38:26.583][6][info][main] source/server/server.cc:209] filters.http: envoy.buffer,envoy.cors,envoy.ext_authz,envoy.fault,envoy.filters.http.header_to_metadata,envoy.filters.http.jwt_authn,envoy.filters.http.rbac,envoy.grpc_http1_bridge,envoy.grpc_json_transcoder,envoy.grpc_web,envoy.gzip,envoy.health_check,envoy.http_dynamo_filter,envoy.ip_tagging,envoy.lua,envoy.rate_limit,envoy.router,envoy.squash [2019-08-03 18:38:26.583][6][info][main] source/server/server.cc:212] filters.listener: envoy.listener.original_dst,envoy.listener.proxy_protocol,envoy.listener.tls_inspector [2019-08-03 18:38:26.583][6][info][main] source/server/server.cc:215] filters.network: envoy.client_ssl_auth,envoy.echo,envoy.ext_authz,envoy.filters.network.rbac,envoy.filters.network.sni_cluster,envoy.filters.network.thrift_proxy,envoy.http_connection_manager,envoy.mongo_proxy,envoy.ratelimit,envoy.redis_proxy,envoy.tcp_proxy [2019-08-03 18:38:26.583][6][info][main] source/server/server.cc:217] stat_sinks: envoy.dog_statsd,envoy.metrics_service,envoy.stat_sinks.hystrix,envoy.statsd [2019-08-03 18:38:26.583][6][info][main] source/server/server.cc:219] tracers: envoy.dynamic.ot,envoy.lightstep,envoy.zipkin [2019-08-03 18:38:26.583][6][info][main] source/server/server.cc:222] transport_sockets.downstream: envoy.transport_sockets.capture,raw_buffer,tls [2019-08-03 18:38:26.583][6][info][main] source/server/server.cc:225] transport_sockets.upstream: envoy.transport_sockets.capture,raw_buffer,tls [2019-08-03 18:38:26.613][6][info][main] source/server/server.cc:264] admin address: 127.0.0.1:19000 [2019-08-03 18:38:26.613][6][info][config] source/server/configuration_impl.cc:50] loading 0 static secret(s) [2019-08-03 18:38:26.613][6][info][config] source/server/configuration_impl.cc:56] loading 1 cluster(s) [2019-08-03 18:38:26.614][29][debug][grpc] source/common/grpc/google_async_client_impl.cc:41] completionThread running [2019-08-03 18:38:26.614][6][debug][upstream] source/common/upstream/cluster_manager_impl.cc:815] adding TLS initial cluster local_agent [2019-08-03 18:38:26.614][6][trace][upstream] source/common/upstream/upstream_impl.cc:787] Local locality: [2019-08-03 18:38:26.614][6][debug][upstream] source/common/upstream/upstream_impl.cc:580] initializing secondary cluster local_agent completed [2019-08-03 18:38:26.614][6][debug][upstream] source/common/upstream/cluster_manager_impl.cc:952] membership update for TLS cluster local_agent [2019-08-03 18:38:26.614][6][debug][upstream] source/common/upstream/cluster_manager_impl.cc:90] cm init: init complete: cluster=local_agent primary=0 secondary=0 [2019-08-03 18:38:26.614][6][debug][upstream] source/common/upstream/cluster_manager_impl.cc:62] cm init: adding: cluster=local_agent primary=0 secondary=0 [2019-08-03 18:38:26.614][6][info][upstream] source/common/upstream/cluster_manager_impl.cc:131] cm init: initializing cds [2019-08-03 18:38:26.614][6][debug][upstream] source/common/config/grpc_mux_impl.cc:108] gRPC mux subscribe for type.googleapis.com/envoy.api.v2.Cluster [2019-08-03 18:38:26.614][6][debug][upstream] source/common/config/grpc_mux_impl.cc:56] No stream available to sendDiscoveryRequest for type.googleapis.com/envoy.api.v2.Cluster [2019-08-03 18:38:26.615][6][debug][upstream] source/common/config/grpc_mux_impl.cc:40] Establishing new gRPC bidi stream for rpc StreamAggregatedResources(stream .envoy.api.v2.DiscoveryRequest) returns (stream .envoy.api.v2.DiscoveryResponse); [2019-08-03 18:38:26.616][6][debug][router] source/common/router/router.cc:252] [C0][S1683339255971901603] cluster 'local_agent' match for URL '/envoy.service.discovery.v2.AggregatedDiscoveryService/StreamAggregatedResources' [2019-08-03 18:38:26.616][6][debug][router] source/common/router/router.cc:303] [C0][S1683339255971901603] router decoding headers: ':method', 'POST' ':path', '/envoy.service.discovery.v2.AggregatedDiscoveryService/StreamAggregatedResources' ':authority', 'local_agent' ':scheme', 'http' 'te', 'trailers' 'content-type', 'application/grpc' 'x-consul-token', '' 'x-envoy-internal', 'true' 'x-forwarded-for', '172.31.9.39' [2019-08-03 18:38:26.616][6][debug][client] source/common/http/codec_client.cc:25] [C0] connecting [2019-08-03 18:38:26.616][6][debug][connection] source/common/network/connection_impl.cc:632] [C0] connecting to 169.254.1.1:8502 [2019-08-03 18:38:26.616][6][debug][connection] source/common/network/connection_impl.cc:641] [C0] connection in progress [2019-08-03 18:38:26.616][6][debug][http2] source/common/http/http2/codec_impl.cc:632] [C0] setting stream-level initial window size to 268435456 [2019-08-03 18:38:26.616][6][debug][http2] source/common/http/http2/codec_impl.cc:654] [C0] updating connection-level initial window size to 268435456 [2019-08-03 18:38:26.616][6][debug][pool] source/common/http/http2/conn_pool.cc:97] [C0] creating stream [2019-08-03 18:38:26.616][6][debug][router] source/common/router/router.cc:981] [C0][S1683339255971901603] pool ready [2019-08-03 18:38:26.616][6][trace][http2] source/common/http/http2/codec_impl.cc:503] [C0] send data: bytes=24 [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/connection_impl.cc:374] [C0] writing 24 bytes, end_stream false [2019-08-03 18:38:26.616][6][trace][http2] source/common/http/http2/codec_impl.cc:503] [C0] send data: bytes=21 [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/connection_impl.cc:374] [C0] writing 21 bytes, end_stream false [2019-08-03 18:38:26.616][6][trace][http2] source/common/http/http2/codec_impl.cc:457] [C0] sent frame type=4 [2019-08-03 18:38:26.616][6][trace][http2] source/common/http/http2/codec_impl.cc:503] [C0] send data: bytes=13 [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/connection_impl.cc:374] [C0] writing 13 bytes, end_stream false [2019-08-03 18:38:26.616][6][trace][http2] source/common/http/http2/codec_impl.cc:457] [C0] sent frame type=8 [2019-08-03 18:38:26.616][6][trace][http2] source/common/http/http2/codec_impl.cc:503] [C0] send data: bytes=157 [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/connection_impl.cc:374] [C0] writing 157 bytes, end_stream false [2019-08-03 18:38:26.616][6][trace][http2] source/common/http/http2/codec_impl.cc:457] [C0] sent frame type=1 [2019-08-03 18:38:26.616][6][trace][upstream] source/common/config/grpc_mux_impl.cc:85] Sending DiscoveryRequest for type.googleapis.com/envoy.api.v2.Cluster: node { id: "echo-sidecar-proxy" cluster: "echo" build_version: "5d25f466c3410c0dfa735d7d4358beb76b2da507/1.8.0/Clean/RELEASE" } type_url: "type.googleapis.com/envoy.api.v2.Cluster" [2019-08-03 18:38:26.616][6][trace][router] source/common/router/router.cc:881] [C0][S1683339255971901603] proxying 137 bytes [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/connection_impl.cc:374] [C0] writing 146 bytes, end_stream false [2019-08-03 18:38:26.616][6][trace][http2] source/common/http/http2/codec_impl.cc:457] [C0] sent frame type=0 [2019-08-03 18:38:26.616][6][info][config] source/server/configuration_impl.cc:61] loading 0 listener(s) [2019-08-03 18:38:26.616][6][info][config] source/server/configuration_impl.cc:95] loading tracing configuration [2019-08-03 18:38:26.616][6][info][config] source/server/configuration_impl.cc:113] loading stats sink configuration [2019-08-03 18:38:26.616][6][info][main] source/server/server.cc:452] starting main dispatch loop [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/connection_impl.cc:437] [C0] socket event: 3 [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/connection_impl.cc:505] [C0] write ready [2019-08-03 18:38:26.616][6][debug][connection] source/common/network/connection_impl.cc:514] [C0] connected [2019-08-03 18:38:26.616][6][debug][client] source/common/http/codec_client.cc:63] [C0] connected [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/connection_impl.cc:505] [C0] write ready [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/raw_buffer_socket.cc:62] [C0] write returns: 361 [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/connection_impl.cc:475] [C0] read ready [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/raw_buffer_socket.cc:21] [C0] read returns: 15 [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/raw_buffer_socket.cc:21] [C0] read returns: -1 [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/raw_buffer_socket.cc:29] [C0] read error: 11 [2019-08-03 18:38:26.616][6][trace][http2] source/common/http/http2/codec_impl.cc:289] [C0] dispatching 15 bytes [2019-08-03 18:38:26.616][6][trace][http2] source/common/http/http2/codec_impl.cc:347] [C0] recv frame type=4 [2019-08-03 18:38:26.616][6][trace][http2] source/common/http/http2/codec_impl.cc:304] [C0] dispatched 15 bytes [2019-08-03 18:38:26.616][6][trace][http2] source/common/http/http2/codec_impl.cc:503] [C0] send data: bytes=9 [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/connection_impl.cc:374] [C0] writing 9 bytes, end_stream false [2019-08-03 18:38:26.616][6][trace][http2] source/common/http/http2/codec_impl.cc:457] [C0] sent frame type=4 [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/connection_impl.cc:437] [C0] socket event: 2 [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/connection_impl.cc:505] [C0] write ready [2019-08-03 18:38:26.616][6][trace][connection] source/common/network/raw_buffer_socket.cc:62] [C0] write returns: 9 [2019-08-03 18:38:26.617][6][trace][connection] source/common/network/connection_impl.cc:437] [C0] socket event: 3 [2019-08-03 18:38:26.617][6][trace][connection] source/common/network/connection_impl.cc:505] [C0] write ready [2019-08-03 18:38:26.617][6][trace][connection] source/common/network/connection_impl.cc:475] [C0] read ready [2019-08-03 18:38:26.617][6][trace][connection] source/common/network/raw_buffer_socket.cc:21] [C0] read returns: 9 [2019-08-03 18:38:26.617][6][trace][connection] source/common/network/raw_buffer_socket.cc:21] [C0] read returns: -1 [2019-08-03 18:38:26.617][6][trace][connection] source/common/network/raw_buffer_socket.cc:29] [C0] read error: 11 [2019-08-03 18:38:26.617][6][trace][http2] source/common/http/http2/codec_impl.cc:289] [C0] dispatching 9 bytes [2019-08-03 18:38:26.617][6][trace][http2] source/common/http/http2/codec_impl.cc:347] [C0] recv frame type=4 [2019-08-03 18:38:26.617][6][trace][http2] source/common/http/http2/codec_impl.cc:304] [C0] dispatched 9 bytes [2019-08-03 18:38:26.618][6][trace][connection] source/common/network/connection_impl.cc:437] [C0] socket event: 3 [2019-08-03 18:38:26.618][6][trace][connection] source/common/network/connection_impl.cc:505] [C0] write ready [2019-08-03 18:38:26.618][6][trace][connection] source/common/network/connection_impl.cc:475] [C0] read ready [2019-08-03 18:38:26.618][6][trace][connection] source/common/network/raw_buffer_socket.cc:21] [C0] read returns: 30 [2019-08-03 18:38:26.618][6][trace][connection] source/common/network/raw_buffer_socket.cc:21] [C0] read returns: -1 [2019-08-03 18:38:26.618][6][trace][connection] source/common/network/raw_buffer_socket.cc:29] [C0] read error: 11 [2019-08-03 18:38:26.618][6][trace][http2] source/common/http/http2/codec_impl.cc:289] [C0] dispatching 30 bytes [2019-08-03 18:38:26.618][6][trace][http2] source/common/http/http2/codec_impl.cc:347] [C0] recv frame type=8 [2019-08-03 18:38:26.618][6][trace][http2] source/common/http/http2/codec_impl.cc:347] [C0] recv frame type=6 [2019-08-03 18:38:26.618][6][trace][http2] source/common/http/http2/codec_impl.cc:304] [C0] dispatched 30 bytes [2019-08-03 18:38:26.618][6][trace][http2] source/common/http/http2/codec_impl.cc:503] [C0] send data: bytes=17 [2019-08-03 18:38:26.618][6][trace][connection] source/common/network/connection_impl.cc:374] [C0] writing 17 bytes, end_stream false [2019-08-03 18:38:26.618][6][trace][http2] source/common/http/http2/codec_impl.cc:457] [C0] sent frame type=6 [2019-08-03 18:38:26.618][6][trace][connection] source/common/network/connection_impl.cc:437] [C0] socket event: 2 [2019-08-03 18:38:26.618][6][trace][connection] source/common/network/connection_impl.cc:505] [C0] write ready [2019-08-03 18:38:26.618][6][trace][connection] source/common/network/raw_buffer_socket.cc:62] [C0] write returns: 17 [2019-08-03 18:38:31.617][6][debug][main] source/server/server.cc:139] flushing stats ```

I verified that consul grpc server is reachable and consul dns is resolving:

docker exec -ti echo-proxy bash
root@ip-172-31-9-39:/# nc -vz 169.254.1.1 8502
ip-172-31-9-39 [169.254.1.1] 8502 (?) open
root@ip-172-31-9-39:/# dig echo.service.consul

; <<>> DiG 9.10.3-P4-Ubuntu <<>> echo.service.consul
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46137
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 2

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;echo.service.consul.       IN  A

;; ANSWER SECTION:
echo.service.consul.    0   IN  A   172.31.9.39

;; ADDITIONAL SECTION:
echo.service.consul.    0   IN  TXT "consul-network-segment="

;; Query time: 0 msec
;; SERVER: 169.254.1.1#53(169.254.1.1)
;; WHEN: Sat Aug 03 18:49:27 UTC 2019
;; MSG SIZE  rcvd: 100

root@ip-172-31-9-39:/#

Envoy config dump of the proxy has no dynamic clusters/CA certificate details.

Config dump

``` { "configs": [ { "@type": "type.googleapis.com/envoy.admin.v2alpha.BootstrapConfigDump", "bootstrap": { "node": { "id": "echo-sidecar-proxy", "cluster": "echo", "build_version": "5d25f466c3410c0dfa735d7d4358beb76b2da507/1.8.0/Clean/RELEASE" }, "static_resources": { "clusters": [ { "name": "local_agent", "connect_timeout": "1s", "hosts": [ { "socket_address": { "address": "169.254.1.1", "port_value": 8502 } } ], "http2_protocol_options": {} } ] }, "dynamic_resources": { "lds_config": { "ads": {} }, "cds_config": { "ads": {} }, "ads_config": { "api_type": "GRPC", "grpc_services": [ { "envoy_grpc": { "cluster_name": "local_agent" }, "initial_metadata": [ { "key": "x-consul-token" } ] } ] } }, "admin": { "access_log_path": "/dev/null", "address": { "socket_address": { "address": "127.0.0.1", "port_value": 19000 } } }, "stats_config": { "stats_tags": [ { "tag_name": "local_cluster", "fixed_value": "echo" } ], "use_all_default_tags": true } }, "last_updated": "2019-08-03T18:38:26.612Z" }, { "@type": "type.googleapis.com/envoy.admin.v2alpha.ClustersConfigDump", "static_clusters": [ { "cluster": { "name": "local_agent", "connect_timeout": "1s", "hosts": [ { "socket_address": { "address": "169.254.1.1", "port_value": 8502 } } ], "http2_protocol_options": {} }, "last_updated": "2019-08-03T18:38:26.614Z" } ] }, { "@type": "type.googleapis.com/envoy.admin.v2alpha.ListenersConfigDump" } ] } ```

Consul UI shows sidecar health check failing for port :21000 which is expected as the listener is not yet started.

image

Sidecar health check:

image

Consul Server Info

``` agent: check_monitors = 0 check_ttls = 0 checks = 4 services = 4 build: prerelease = revision = a42ded47 version = 1.5.3 consul: acl = disabled bootstrap = true known_datacenters = 1 leader = true leader_addr = 172.31.9.39:8300 server = true raft: applied_index = 180 commit_index = 180 fsm_pending = 0 last_contact = 0 last_log_index = 180 last_log_term = 3 last_snapshot_index = 0 last_snapshot_term = 0 latest_configuration = [{Suffrage:Voter ID:b6c7781a-5d26-d704-5eb4-e404302f98d3 Address:172.31.9.39:8300}] latest_configuration_index = 1 num_peers = 0 protocol_version = 3 protocol_version_max = 3 protocol_version_min = 0 snapshot_version_max = 1 snapshot_version_min = 0 state = Leader term = 3 runtime: arch = amd64 cpu_count = 2 goroutines = 986 max_procs = 2 os = linux version = go1.12.1 serf_lan: coordinate_resets = 0 encrypted = false event_queue = 1 event_time = 3 failed = 0 health_score = 0 intent_queue = 0 left = 0 member_time = 1 members = 1 query_queue = 0 query_time = 1 serf_wan: coordinate_resets = 0 encrypted = false event_queue = 0 event_time = 1 failed = 0 health_score = 0 intent_queue = 0 left = 0 member_time = 1 members = 1 query_queue = 0 query_time = 1 ```

envoy.json

```json { "admin": { "access_log_path": "/dev/null", "address": { "socket_address": { "address": "127.0.0.1", "port_value": 19000 } } }, "node": { "cluster": "echo", "id": "echo-sidecar-proxy" }, "static_resources": { "clusters": [ { "name": "local_agent", "connect_timeout": "1s", "type": "STATIC", "http2_protocol_options": {}, "hosts": [ { "socket_address": { "address": "169.254.1.1", "port_value": 8502 } } ] } ] }, "stats_config": { "stats_tags": [ { "tag_name": "local_cluster", "fixed_value": "echo" } ], "use_all_default_tags": true }, "dynamic_resources": { "lds_config": { "ads": {} }, "cds_config": { "ads": {} }, "ads_config": { "api_type": "GRPC", "grpc_services": { "initial_metadata": [ { "key": "x-consul-token", "value": "" } ], "envoy_grpc": { "cluster_name": "local_agent" } } } } } ```

xsreality commented 5 years ago

Turns out I forgot to enable connect! The guide runs Consul in dev mode which automatically enables connect but when not using dev mode by default it is disabled.

With connect enabled, issue is resolved.