I am currently getting a TLS issue when trying to communicate between two of my services (router -> initial-context) while using sidecars/transparent proxy. The request is being made from the router to the initial context service using the following URL: http://initial-context.service.consul/graph.
Relevant log output from the initial context service's consul-dataplane
```txt
2024-04-14T18:45:42.598Z+00:00 [debug] envoy.conn_handler(24) [Tags: "ConnectionId":"20791"] new connection from 10.244.1.241:47238
2024-04-14T18:45:42.599Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"20791"] remote address:10.244.1.241:47238,TLS_error:|268435612:SSL routines:OPENSSL_internal:HTTP_REQUEST:TLS_error_end
2024-04-14T18:45:42.599Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"20791"] closing socket: 0
2024-04-14T18:45:42.599Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"20791"] remote address:10.244.1.241:47238,TLS_error:|268435612:SSL routines:OPENSSL_internal:HTTP_REQUEST:TLS_error_end:TLS_error_end
2024-04-14T18:45:42.599Z+00:00 [debug] envoy.conn_handler(24) [Tags: "ConnectionId":"20791"] adding to cleanup list
2024-04-14T18:45:46.272Z+00:00 [debug] envoy.main(15) flushing stats
```
So far, everything I've tried has not worked, and when I search Google for the error I find remarkably few results.
How can I fix this to get the mesh (with TLS verification) working?
consul validate
```txt
consul validate /consul/config
"autopilot.disable_upgrade_migration" is a Consul Enterprise configuration and will have no effect
BootstrapExpect is set to 1; this is the same as Bootstrap mode.
bootstrap = true: do not enable unless necessary
if auto_encrypt.allow_tls is turned on, tls.internal_rpc.verify_incoming should be enabled (either explicitly or via tls.defaults.verify_incoming). It is necessary to turn it off during a migration to TLS, but it should definitely be turned on afterwards.
Configuration is valid!
```
consul-k8s status
```txt
consul-k8s status
==> Consul Status Summary
Name Namespace Status Chart Version AppVersion Revision Last Updated
consul consul deployed 1.4.1 1.18.1 6 2024/04/12 22:41:42 UTC
==> Config:
(see config below in Helm Configuration)
==> Status Of Helm Hooks:
consul-gossip-encryption-autogenerate ServiceAccount: Succeeded
consul-tls-init ServiceAccount: Succeeded
consul-gossip-encryption-autogenerate Role: Succeeded
consul-tls-init Role: Succeeded
consul-gossip-encryption-autogenerate RoleBinding: Succeeded
consul-tls-init RoleBinding: Succeeded
consul-gossip-encryption-autogenerate Job: Succeeded
consul-tls-init Job: Succeeded
Consul servers healthy 1/1
```
consul-k8s troubleshoot upstreams (router)
```txt
consul-k8s troubleshoot upstreams -pod router-1234abcd -n my-namespace
==> Upstreams (explicit upstreams only) (0)
==> Upstream IPs (transparent proxy only) (6)
IPs Virtual Cluster Names
10.245.103.70, 240.0.0.3 true service1.default.(...).consul
10.245.187.44, 240.0.0.5 true service2.default.(...).consul
10.245.228.123, 240.0.0.2 true initial-context.default.(...).consul
10.245.45.217, 240.0.0.6 true service3.default.(...).consul
10.245.6.148, 240.0.0.1 true service4.default.(...).consul
10.245.63.230, 240.0.0.4 true service5.default.(...).consul
If you cannot find the upstream address or cluster for a transparent proxy upstream:
-> Check intentions: Transparent proxy upstreams are configured based on intentions. Make sure you have configured intentions to allow traffic to your upstream.
-> To check that the right cluster is being dialed, run a DNS lookup for the upstream you are dialing. For example, run `dig backend.svc.consul` to return the IP address for the `backend` service. If the address you get from that is missing from the upstream IPs, it means that your proxy may be misconfigured.
```
consul-k8s troubleshoot proxy (router --> initial-context)
```txt
consul-k8s troubleshoot proxy -upstream-ip 10.245.228.123 -pod router-1234abcd -n my-namespace
==> Validation
✓ Certificates are valid
✓ Envoy has 0 rejected configurations
✓ Envoy has detected 0 connection failure(s)
✓ Listener for upstream "10.245.228.123" found
✓ Cluster "initial-context.default.(...).consul" for upstream "10.245.228.123" found
✓ Healthy endpoints for cluster "initial-context.default.(...).consul" for upstream "10.245.228.123" found
✓ Upstream resources are valid
```
If there are more command outputs you would like to see, let me know and I will add them here.
Relevant logs from router (consul-dataplane)
I have verified that 10.244.0.66 is the initial-context pod IP.
```txt
2024-04-15T02:44:21.237Z [DEBUG] consul-dataplane.dns-proxy.udp: dns messaged received from consul: length=151
2024-04-15T02:44:21.248Z [DEBUG] consul-dataplane.dns-proxy.udp: dns messaged received from consul: length=151
2024-04-15T02:44:21.259Z [DEBUG] consul-dataplane.dns-proxy.udp: dns messaged received from consul: length=141
2024-04-15T02:44:21.270Z [DEBUG] consul-dataplane.dns-proxy.udp: dns messaged received from consul: length=141
2024-04-15T02:44:21.281Z [DEBUG] consul-dataplane.dns-proxy.udp: dns messaged received from consul: length=137
2024-04-15T02:44:21.335Z [DEBUG] consul-dataplane.dns-proxy.udp: dns messaged received from consul: length=137
2024-04-15T02:44:21.338Z [DEBUG] consul-dataplane.dns-proxy.udp: dns messaged received from consul: length=64
2024-04-15T02:44:21.339Z+00:00 [debug] envoy.filter(24) original_dst: set destination to 10.244.0.66:80
2024-04-15T02:44:21.339Z+00:00 [debug] envoy.filter(24) [Tags: "ConnectionId":"17831"] new tcp proxy session
2024-04-15T02:44:21.339Z+00:00 [debug] envoy.filter(24) [Tags: "ConnectionId":"17831"] Creating connection to cluster original-destination
2024-04-15T02:44:21.340Z+00:00 [debug] envoy.upstream(24) transport socket match, socket default selected for host with address 10.244.0.66:80
2024-04-15T02:44:21.340Z+00:00 [debug] envoy.upstream(24) Created host original-destination10.244.0.66:80 10.244.0.66:80.
2024-04-15T02:44:21.340Z+00:00 [debug] envoy.misc(24) Allocating TCP conn pool
2024-04-15T02:44:21.340Z+00:00 [debug] envoy.upstream(15) addHost() adding original-destination10.244.0.66:80 10.244.0.66:80.
2024-04-15T02:44:21.340Z+00:00 [debug] envoy.upstream(15) membership update for TLS cluster original-destination added 1 removed 0
2024-04-15T02:44:21.340Z+00:00 [debug] envoy.upstream(15) re-creating local LB for TLS cluster original-destination
2024-04-15T02:44:21.340Z+00:00 [debug] envoy.upstream(23) membership update for TLS cluster original-destination added 1 removed 0
2024-04-15T02:44:21.340Z+00:00 [debug] envoy.upstream(23) re-creating local LB for TLS cluster original-destination
2024-04-15T02:44:21.340Z+00:00 [debug] envoy.pool(24) trying to create new connection
2024-04-15T02:44:21.340Z+00:00 [debug] envoy.pool(24) creating a new connection (connecting=0)
2024-04-15T02:44:21.341Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"17832"] connecting to 10.244.0.66:80
2024-04-15T02:44:21.341Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"17832"] connection in progress
2024-04-15T02:44:21.341Z+00:00 [debug] envoy.conn_handler(24) [Tags: "ConnectionId":"17831"] new connection from 10.244.1.241:52984
2024-04-15T02:44:21.341Z+00:00 [debug] envoy.upstream(24) membership update for TLS cluster original-destination added 1 removed 0
2024-04-15T02:44:21.342Z+00:00 [debug] envoy.upstream(24) re-creating local LB for TLS cluster original-destination
2024-04-15T02:44:21.345Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"17832"] connected
2024-04-15T02:44:21.345Z+00:00 [debug] envoy.pool(24) [Tags: "ConnectionId":"17832"] attaching to next stream
2024-04-15T02:44:21.345Z+00:00 [debug] envoy.pool(24) [Tags: "ConnectionId":"17832"] creating stream
2024-04-15T02:44:21.345Z+00:00 [debug] envoy.router(24) Attached upstream connection [C17832] to downstream connection [C17831]
2024-04-15T02:44:21.346Z+00:00 [debug] envoy.filter(24) [Tags: "ConnectionId":"17831"] TCP:onUpstreamEvent(), requestedServerName:
2024-04-15T02:44:21.349Z+00:00 [debug] envoy.router(23) [Tags: "ConnectionId":"17816","StreamId":"11384318054642815300"] upstream headers complete: end_stream=false
2024-04-15T02:44:21.349Z+00:00 [debug] envoy.http(23) [Tags: "ConnectionId":"17816","StreamId":"11384318054642815300"] encoding headers via codec (end_stream=false):
':status', '200'
'content-type', 'application/json'
'vary', 'origin'
'content-encoding', 'gzip'
'access-control-allow-origin', '*'
'date', 'Mon, 15 Apr 2024 02:44:21 GMT'
'x-envoy-upstream-service-time', '151'
'server', 'envoy'
2024-04-15T02:44:21.349Z+00:00 [debug] envoy.client(23) [Tags: "ConnectionId":"17817"] response complete
2024-04-15T02:44:21.349Z+00:00 [debug] envoy.http(23) [Tags: "ConnectionId":"17816","StreamId":"11384318054642815300"] Codec completed encoding stream.
2024-04-15T02:44:21.349Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"17831"] remote close
2024-04-15T02:44:21.349Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"17831"] closing socket: 0
2024-04-15T02:44:21.350Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"17832"] closing data_to_write=0 type=0
2024-04-15T02:44:21.350Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"17832"] closing socket: 1
2024-04-15T02:44:21.350Z+00:00 [debug] envoy.pool(24) [Tags: "ConnectionId":"17832"] client disconnected, failure reason:
2024-04-15T02:44:21.350Z+00:00 [debug] envoy.pool(24) invoking idle callbacks - is_draining_for_deletion_=false
2024-04-15T02:44:21.350Z+00:00 [debug] envoy.pool(24) [Tags: "ConnectionId":"17832"] destroying stream: 0 remaining
2024-04-15T02:44:21.350Z+00:00 [debug] envoy.pool(24) invoking idle callbacks - is_draining_for_deletion_=false
```
Relevant logs from initial-context (consul-dataplane)
I have verified that 10.244.1.241 is the router pod IP.
```txt
2024-04-15T02:50:25.545Z+00:00 [debug] envoy.conn_handler(23) [Tags: "ConnectionId":"24941"] new connection from 10.244.1.241:52194
2024-04-15T02:50:25.632Z+00:00 [debug] envoy.connection(23) [Tags: "ConnectionId":"24941"] remote address:10.244.1.241:52194,TLS_error:|268435612:SSL routines:OPENSSL_internal:HTTP_REQUEST:TLS_error_end
2024-04-15T02:50:25.632Z+00:00 [debug] envoy.connection(23) [Tags: "ConnectionId":"24941"] closing socket: 0
2024-04-15T02:50:25.632Z+00:00 [debug] envoy.connection(23) [Tags: "ConnectionId":"24941"] remote address:10.244.1.241:52194,TLS_error:|268435612:SSL routines:OPENSSL_internal:HTTP_REQUEST:TLS_error_end:TLS_error_end
2024-04-15T02:50:25.632Z+00:00 [debug] envoy.conn_handler(23) [Tags: "ConnectionId":"24941"] adding to cleanup list
```
Complete logs from initial-context (consul-connect-inject-init)
I have verified that 10.244.1.234 is the consul-server pod IP.
```txt
2024-04-13T02:18:19.129Z [INFO] consul-server-connection-manager: trying to connect to a Consul server
2024-04-13T02:18:19.231Z [DEBUG] consul-server-connection-manager: Resolved DNS name: name=consul-server.consul.svc ip-addrs=["{10.244.1.234 }"]
2024-04-13T02:18:19.231Z [INFO] consul-server-connection-manager: discovered Consul servers: addresses=[10.244.1.234:8502]
2024-04-13T02:18:19.231Z [INFO] consul-server-connection-manager: current prioritized list of known Consul servers: addresses=[10.244.1.234:8502]
2024-04-13T02:18:19.231Z [DEBUG] consul-server-connection-manager: switching to Consul server: address=10.244.1.234:8502
2024-04-13T02:18:19.828Z [INFO] consul-server-connection-manager: ACL auth method login succeeded: accessorID=559ce7f4-8331-4462-8caa-220cbfecea6d
2024-04-13T02:18:19.925Z [DEBUG] consul-server-connection-manager: feature: supported=true name=DATAPLANE_FEATURES_EDGE_CERTIFICATE_MANAGEMENT
2024-04-13T02:18:19.925Z [DEBUG] consul-server-connection-manager: feature: supported=true name=DATAPLANE_FEATURES_ENVOY_BOOTSTRAP_CONFIGURATION
2024-04-13T02:18:19.925Z [DEBUG] consul-server-connection-manager: feature: supported=false name=DATAPLANE_FEATURES_FIPS
2024-04-13T02:18:19.925Z [DEBUG] consul-server-connection-manager: feature: supported=true name=DATAPLANE_FEATURES_WATCH_SERVERS
2024-04-13T02:18:19.925Z [INFO] consul-server-connection-manager: connected to Consul server: address=10.244.1.234:8502
2024-04-13T02:18:20.036Z [INFO] Registered service has been detected: service=initial-context
2024-04-13T02:18:20.036Z [INFO] Registered service has been detected: service=initial-context-sidecar-proxy
2024-04-13T02:18:20.036Z [INFO] consul-server-connection-manager: stopping
2024-04-13T02:18:20.132Z [INFO] consul-server-connection-manager: ACL auth method logout succeeded
2024-04-13T02:18:20.132Z [DEBUG] consul-server-connection-manager: backoff: retry after=638.791208ms
2024-04-13T02:18:20.132Z [DEBUG] consul-server-connection-manager: aborting: error="context canceled"
2024-04-13T02:18:21.029Z [INFO] Successfully applied traffic redirection rules
2024-04-13T02:18:21.029Z [INFO] Connect initialization completed
2024-04-13T02:18:21.029Z [INFO] consul-server-connection-manager: stopping
```
Complete logs from router (consul-connect-inject-init)
I have verified that 10.244.1.234 is the consul-server pod IP.
```txt
2024-04-13T02:18:12.926Z [INFO] consul-server-connection-manager: trying to connect to a Consul server
2024-04-13T02:18:12.934Z [DEBUG] consul-server-connection-manager: Resolved DNS name: name=consul-server.consul.svc ip-addrs=["{10.244.1.234 }"]
2024-04-13T02:18:12.934Z [INFO] consul-server-connection-manager: discovered Consul servers: addresses=[10.244.1.234:8502]
2024-04-13T02:18:12.934Z [INFO] consul-server-connection-manager: current prioritized list of known Consul servers: addresses=[10.244.1.234:8502]
2024-04-13T02:18:12.934Z [DEBUG] consul-server-connection-manager: switching to Consul server: address=10.244.1.234:8502
2024-04-13T02:18:13.329Z [INFO] consul-server-connection-manager: ACL auth method login succeeded: accessorID=a7565bec-cfd9-c3a7-5b8d-4b5b72210b38
2024-04-13T02:18:13.331Z [DEBUG] consul-server-connection-manager: feature: supported=true name=DATAPLANE_FEATURES_WATCH_SERVERS
2024-04-13T02:18:13.331Z [DEBUG] consul-server-connection-manager: feature: supported=true name=DATAPLANE_FEATURES_EDGE_CERTIFICATE_MANAGEMENT
2024-04-13T02:18:13.331Z [DEBUG] consul-server-connection-manager: feature: supported=true name=DATAPLANE_FEATURES_ENVOY_BOOTSTRAP_CONFIGURATION
2024-04-13T02:18:13.332Z [DEBUG] consul-server-connection-manager: feature: supported=false name=DATAPLANE_FEATURES_FIPS
2024-04-13T02:18:13.332Z [INFO] consul-server-connection-manager: connected to Consul server: address=10.244.1.234:8502
2024-04-13T02:18:13.340Z [INFO] Registered service has been detected: service=router
2024-04-13T02:18:13.341Z [INFO] Registered service has been detected: service=router-sidecar-proxy
2024-04-13T02:18:13.341Z [INFO] consul-server-connection-manager: stopping
2024-04-13T02:18:13.347Z [INFO] consul-server-connection-manager: ACL auth method logout succeeded
2024-04-13T02:18:13.347Z [DEBUG] consul-server-connection-manager: backoff: retry after=383.794392ms
2024-04-13T02:18:13.347Z [DEBUG] consul-server-connection-manager: aborting: error="context canceled"
2024-04-13T02:18:14.726Z [INFO] Successfully applied traffic redirection rules
2024-04-13T02:18:14.925Z [INFO] Connect initialization completed
2024-04-13T02:18:14.925Z [INFO] consul-server-connection-manager: stopping
```
Current understanding and Expected behavior
My understanding is that Consul should automatically create and manage the TLS certs and distribute them to the Envoy sidecars. I expected this to work without any intervention from me beyond configuration.
Environment details
The values.yaml used is included in the Helm Configuration section.
consul-k8s: v1.4.0
consul:
Consul v1.18.1
Revision 98cb473c
Build Date 2024-03-26T21:59:08Z
Protocol 2 spoken by default, understands 2 to 3 (agent will automatically use protocol >2 when speaking to compatible agents)
Kubernetes is DigitalOcean (DOKS) v1.29.1-do.0.
Cilium, CoreDNS, CSI, Hubble, and Konnectivity are installed and managed by DigitalOcean.
I have deployed Traefik as the ingress controller.
All infrastructure has been deployed using Terraform.
@craigbehnke Your application is trying to connect directly to the pod IP (10.244.0.66) instead of the cluster IP for the K8s service, 10.245.228.123.
The traffic from the downstream proxy is being routed through the original-destination cluster, which passes the connection straight through as a plain TCP connection without encryption. The upstream then throws an error because it expects TLS traffic but receives an unencrypted HTTP request from the downstream instead, which is what the HTTP_REQUEST TLS error in the initial-context logs indicates.
There are two potential solutions:
1. Configure your application to connect to the upstream using the cluster IP address from the K8s Service.
2. Enable transparentProxy.dialedDirectly=true in the ServiceDefaults config for the upstream service. This allows downstream pods to reach individual service instances by connecting to the pod IPs instead of the Service address.
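For reference, the ServiceDefaults option might look roughly like the sketch below. It assumes the upstream is registered as `initial-context` and uses `my-namespace` as a placeholder namespace (the thread does not say which namespace the service runs in), so adjust both to match your deployment.

```yaml
# Sketch only: ServiceDefaults for the upstream service.
# metadata.name must match the Consul service name; the namespace is a placeholder.
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceDefaults
metadata:
  name: initial-context
  namespace: my-namespace   # assumption, not taken from the thread
spec:
  transparentProxy:
    # Allow downstream proxies to dial individual instances by pod IP
    dialedDirectly: true
```

With this applied, connections addressed to the pod IP should be matched to the mesh upstream rather than falling through to the original-destination passthrough cluster.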
As an aside, is it better to target the service IP instead of the pod IP? If so, what would I change to get that to work? (I thought that the {svc-name}.service.consul address would resolve to that service-level resource)
@craigbehnke You can either look up the service using the Kubernetes Service DNS name, or the Consul virtual IP address using the <name>.virtual.consul hostname. Either lookup will return an IP that allows Envoy to correctly match the incoming connection to the correct upstream service and route it appropriately.
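As an illustration of the two lookup options, the router could be pointed at the upstream through its configuration. The fragment below is hypothetical: the `INITIAL_CONTEXT_URL` variable name, the `my-namespace` placeholder, and the assumption that the Kubernetes Service listens on port 80 are not from this thread.

```yaml
# Hypothetical fragment of the router Deployment's container spec.
# INITIAL_CONTEXT_URL is an illustrative name; use whatever setting your app reads.
env:
  - name: INITIAL_CONTEXT_URL
    # Option A: Consul virtual IP hostname
    value: "http://initial-context.virtual.consul/graph"
    # Option B: Kubernetes Service DNS name (namespace is a placeholder)
    # value: "http://initial-context.my-namespace.svc.cluster.local/graph"
```

Either hostname should resolve to one of the addresses listed for initial-context in the `consul-k8s troubleshoot upstreams` output above (240.0.0.2 or 10.245.228.123), rather than to a pod IP.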
Helm Configuration
values.yaml
```yaml
global:
  name: consul
  logLevel: "debug"
  recursors:
    - "8.8.8.8"
    - "8.8.4.4"
  metrics:
    enabled: true
    enableAgentMetrics: true
    enableHostMetrics: true
    enableGatewayMetrics: true
    enableTelemetryCollector: false
  gossipEncryption:
    autoGenerate: true
  tls:
    enabled: true
    enableAutoEncrypt: true
    verify: true
  acls:
    manageSystemACLs: true
syncCatalog:
  enabled: true
  default: false
  toConsul: true
  toK8s: false
server:
  replicas: 1
  connect: true
  exposeService:
    enabled: false
client:
  enable: true
dns:
  enabled: true
  enableRedirection: true
ui:
  enabled: true
connectInject:
  enabled: true
  default: true
  aclBindingRuleSelector: ""
  transparentProxy:
    defaultEnabled: true
    defaultOverwriteProbes: true
  apiGateway:
    manageExternalCRDs: true
    manageNonStandardCRDs: false
  sidecarProxy:
    resources:
      requests:
        cpu: 25m
        memory: 60Mi
      limits:
        cpu: 50m
        memory: 60Mi
  namespaceSelector: |
    matchLabels:
      consul-enabled : enabled
  metrics:
    defaultEnabled: true
    defaultEnableMerging: true
    defaultPrometheusScrapePort: 20200
    defaultPrometheusScrapePath: "/metrics"
ingressGateways:
  enabled: false
terminatingGateways:
  defaults:
    resources:
      requests:
        memory: "70Mi"
        cpu: "50m"
      limits:
        memory: "70Mi"
        cpu: "50m"
  enabled: true
  logLevel: "trace"
telemetryCollector:
  enabled: false
```
Relevant logs from router (consul-dataplane)
I have verified that 10.244.0.66 is the initial-context pod IP. ```txt 2024-04-15T02:44:21.237Z [DEBUG] consul-dataplane.dns-proxy.udp: dns messaged received from consul: length=151 2024-04-15T02:44:21.248Z [DEBUG] consul-dataplane.dns-proxy.udp: dns messaged received from consul: length=151 2024-04-15T02:44:21.259Z [DEBUG] consul-dataplane.dns-proxy.udp: dns messaged received from consul: length=141 2024-04-15T02:44:21.270Z [DEBUG] consul-dataplane.dns-proxy.udp: dns messaged received from consul: length=141 2024-04-15T02:44:21.281Z [DEBUG] consul-dataplane.dns-proxy.udp: dns messaged received from consul: length=137 2024-04-15T02:44:21.335Z [DEBUG] consul-dataplane.dns-proxy.udp: dns messaged received from consul: length=137 2024-04-15T02:44:21.338Z [DEBUG] consul-dataplane.dns-proxy.udp: dns messaged received from consul: length=64 2024-04-15T02:44:21.339Z+00:00 [debug] envoy.filter(24) original_dst: set destination to 10.244.0.66:80 2024-04-15T02:44:21.339Z+00:00 [debug] envoy.filter(24) [Tags: "ConnectionId":"17831"] new tcp proxy session 2024-04-15T02:44:21.339Z+00:00 [debug] envoy.filter(24) [Tags: "ConnectionId":"17831"] Creating connection to cluster original-destination 2024-04-15T02:44:21.340Z+00:00 [debug] envoy.upstream(24) transport socket match, socket default selected for host with address 10.244.0.66:80 2024-04-15T02:44:21.340Z+00:00 [debug] envoy.upstream(24) Created host original-destination10.244.0.66:80 10.244.0.66:80. 2024-04-15T02:44:21.340Z+00:00 [debug] envoy.misc(24) Allocating TCP conn pool 2024-04-15T02:44:21.340Z+00:00 [debug] envoy.upstream(15) addHost() adding original-destination10.244.0.66:80 10.244.0.66:80. 
2024-04-15T02:44:21.340Z+00:00 [debug] envoy.upstream(15) membership update for TLS cluster original-destination added 1 removed 0 2024-04-15T02:44:21.340Z+00:00 [debug] envoy.upstream(15) re-creating local LB for TLS cluster original-destination 2024-04-15T02:44:21.340Z+00:00 [debug] envoy.upstream(23) membership update for TLS cluster original-destination added 1 removed 0 2024-04-15T02:44:21.340Z+00:00 [debug] envoy.upstream(23) re-creating local LB for TLS cluster original-destination 2024-04-15T02:44:21.340Z+00:00 [debug] envoy.pool(24) trying to create new connection 2024-04-15T02:44:21.340Z+00:00 [debug] envoy.pool(24) creating a new connection (connecting=0) 2024-04-15T02:44:21.341Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"17832"] connecting to 10.244.0.66:80 2024-04-15T02:44:21.341Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"17832"] connection in progress 2024-04-15T02:44:21.341Z+00:00 [debug] envoy.conn_handler(24) [Tags: "ConnectionId":"17831"] new connection from 10.244.1.241:52984 2024-04-15T02:44:21.341Z+00:00 [debug] envoy.upstream(24) membership update for TLS cluster original-destination added 1 removed 0 2024-04-15T02:44:21.342Z+00:00 [debug] envoy.upstream(24) re-creating local LB for TLS cluster original-destination 2024-04-15T02:44:21.345Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"17832"] connected 2024-04-15T02:44:21.345Z+00:00 [debug] envoy.pool(24) [Tags: "ConnectionId":"17832"] attaching to next stream 2024-04-15T02:44:21.345Z+00:00 [debug] envoy.pool(24) [Tags: "ConnectionId":"17832"] creating stream 2024-04-15T02:44:21.345Z+00:00 [debug] envoy.router(24) Attached upstream connection [C17832] to downstream connection [C17831] 2024-04-15T02:44:21.346Z+00:00 [debug] envoy.filter(24) [Tags: "ConnectionId":"17831"] TCP:onUpstreamEvent(), requestedServerName: 2024-04-15T02:44:21.349Z+00:00 [debug] envoy.router(23) [Tags: "ConnectionId":"17816","StreamId":"11384318054642815300"] upstream headers 
complete: end_stream=false 2024-04-15T02:44:21.349Z+00:00 [debug] envoy.http(23) [Tags: "ConnectionId":"17816","StreamId":"11384318054642815300"] encoding headers via codec (end_stream=false): ':status', '200' 'content-type', 'application/json' 'vary', 'origin' 'content-encoding', 'gzip' 'access-control-allow-origin', '*' 'date', 'Mon, 15 Apr 2024 02:44:21 GMT' 'x-envoy-upstream-service-time', '151' 'server', 'envoy' 2024-04-15T02:44:21.349Z+00:00 [debug] envoy.client(23) [Tags: "ConnectionId":"17817"] response complete 2024-04-15T02:44:21.349Z+00:00 [debug] envoy.http(23) [Tags: "ConnectionId":"17816","StreamId":"11384318054642815300"] Codec completed encoding stream. 2024-04-15T02:44:21.349Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"17831"] remote close 2024-04-15T02:44:21.349Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"17831"] closing socket: 0 2024-04-15T02:44:21.350Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"17832"] closing data_to_write=0 type=0 2024-04-15T02:44:21.350Z+00:00 [debug] envoy.connection(24) [Tags: "ConnectionId":"17832"] closing socket: 1 2024-04-15T02:44:21.350Z+00:00 [debug] envoy.pool(24) [Tags: "ConnectionId":"17832"] client disconnected, failure reason: 2024-04-15T02:44:21.350Z+00:00 [debug] envoy.pool(24) invoking idle callbacks - is_draining_for_deletion_=false 2024-04-15T02:44:21.350Z+00:00 [debug] envoy.pool(24) [Tags: "ConnectionId":"17832"] destroying stream: 0 remaining 2024-04-15T02:44:21.350Z+00:00 [debug] envoy.pool(24) invoking idle callbacks - is_draining_for_deletion_=false ```Relevant logs from initial-context (consul-dataplane)
I have verified that 10.244.1.241 is the router pod IP. ```txt 2024-04-15T02:50:25.545Z+00:00 [debug] envoy.conn_handler(23) [Tags: "ConnectionId":"24941"] new connection from 10.244.1.241:52194 2024-04-15T02:50:25.632Z+00:00 [debug] envoy.connection(23) [Tags: "ConnectionId":"24941"] remote address:10.244.1.241:52194,TLS_error:|268435612:SSL routines:OPENSSL_internal:HTTP_REQUEST:TLS_error_end 2024-04-15T02:50:25.632Z+00:00 [debug] envoy.connection(23) [Tags: "ConnectionId":"24941"] closing socket: 0 2024-04-15T02:50:25.632Z+00:00 [debug] envoy.connection(23) [Tags: "ConnectionId":"24941"] remote address:10.244.1.241:52194,TLS_error:|268435612:SSL routines:OPENSSL_internal:HTTP_REQUEST:TLS_error_end:TLS_error_end 2024-04-15T02:50:25.632Z+00:00 [debug] envoy.conn_handler(23) [Tags: "ConnectionId":"24941"] adding to cleanup list ```Complete logs from initial-context (consul-connect-inject-init)
I have verified that 10.244.1.234 is the consul-server pod IP. ```txt 2024-04-13T02:18:19.129Z [INFO] consul-server-connection-manager: trying to connect to a Consul server 2024-04-13T02:18:19.231Z [DEBUG] consul-server-connection-manager: Resolved DNS name: name=consul-server.consul.svc ip-addrs=["{10.244.1.234 }"] 2024-04-13T02:18:19.231Z [INFO] consul-server-connection-manager: discovered Consul servers: addresses=[10.244.1.234:8502] 2024-04-13T02:18:19.231Z [INFO] consul-server-connection-manager: current prioritized list of known Consul servers: addresses=[10.244.1.234:8502] 2024-04-13T02:18:19.231Z [DEBUG] consul-server-connection-manager: switching to Consul server: address=10.244.1.234:8502 2024-04-13T02:18:19.828Z [INFO] consul-server-connection-manager: ACL auth method login succeeded: accessorID=559ce7f4-8331-4462-8caa-220cbfecea6d 2024-04-13T02:18:19.925Z [DEBUG] consul-server-connection-manager: feature: supported=true name=DATAPLANE_FEATURES_EDGE_CERTIFICATE_MANAGEMENT 2024-04-13T02:18:19.925Z [DEBUG] consul-server-connection-manager: feature: supported=true name=DATAPLANE_FEATURES_ENVOY_BOOTSTRAP_CONFIGURATION 2024-04-13T02:18:19.925Z [DEBUG] consul-server-connection-manager: feature: supported=false name=DATAPLANE_FEATURES_FIPS 2024-04-13T02:18:19.925Z [DEBUG] consul-server-connection-manager: feature: supported=true name=DATAPLANE_FEATURES_WATCH_SERVERS 2024-04-13T02:18:19.925Z [INFO] consul-server-connection-manager: connected to Consul server: address=10.244.1.234:8502 2024-04-13T02:18:20.036Z [INFO] Registered service has been detected: service=initial-context 2024-04-13T02:18:20.036Z [INFO] Registered service has been detected: service=initial-context-sidecar-proxy 2024-04-13T02:18:20.036Z [INFO] consul-server-connection-manager: stopping 2024-04-13T02:18:20.132Z [INFO] consul-server-connection-manager: ACL auth method logout succeeded 2024-04-13T02:18:20.132Z [DEBUG] consul-server-connection-manager: backoff: retry after=638.791208ms 
2024-04-13T02:18:20.132Z [DEBUG] consul-server-connection-manager: aborting: error="context canceled" 2024-04-13T02:18:21.029Z [INFO] Successfully applied traffic redirection rules 2024-04-13T02:18:21.029Z [INFO] Connect initialization completed 2024-04-13T02:18:21.029Z [INFO] consul-server-connection-manager: stopping ```Complete logs from router (consul-connect-inject-init)
I have verified that 10.244.1.234 is the consul-server pod IP.
```txt
2024-04-13T02:18:12.926Z [INFO] consul-server-connection-manager: trying to connect to a Consul server
2024-04-13T02:18:12.934Z [DEBUG] consul-server-connection-manager: Resolved DNS name: name=consul-server.consul.svc ip-addrs=["{10.244.1.234 }"]
2024-04-13T02:18:12.934Z [INFO] consul-server-connection-manager: discovered Consul servers: addresses=[10.244.1.234:8502]
2024-04-13T02:18:12.934Z [INFO] consul-server-connection-manager: current prioritized list of known Consul servers: addresses=[10.244.1.234:8502]
2024-04-13T02:18:12.934Z [DEBUG] consul-server-connection-manager: switching to Consul server: address=10.244.1.234:8502
2024-04-13T02:18:13.329Z [INFO] consul-server-connection-manager: ACL auth method login succeeded: accessorID=a7565bec-cfd9-c3a7-5b8d-4b5b72210b38
2024-04-13T02:18:13.331Z [DEBUG] consul-server-connection-manager: feature: supported=true name=DATAPLANE_FEATURES_WATCH_SERVERS
2024-04-13T02:18:13.331Z [DEBUG] consul-server-connection-manager: feature: supported=true name=DATAPLANE_FEATURES_EDGE_CERTIFICATE_MANAGEMENT
2024-04-13T02:18:13.331Z [DEBUG] consul-server-connection-manager: feature: supported=true name=DATAPLANE_FEATURES_ENVOY_BOOTSTRAP_CONFIGURATION
2024-04-13T02:18:13.332Z [DEBUG] consul-server-connection-manager: feature: supported=false name=DATAPLANE_FEATURES_FIPS
2024-04-13T02:18:13.332Z [INFO] consul-server-connection-manager: connected to Consul server: address=10.244.1.234:8502
2024-04-13T02:18:13.340Z [INFO] Registered service has been detected: service=router
2024-04-13T02:18:13.341Z [INFO] Registered service has been detected: service=router-sidecar-proxy
2024-04-13T02:18:13.341Z [INFO] consul-server-connection-manager: stopping
2024-04-13T02:18:13.347Z [INFO] consul-server-connection-manager: ACL auth method logout succeeded
2024-04-13T02:18:13.347Z [DEBUG] consul-server-connection-manager: backoff: retry after=383.794392ms
2024-04-13T02:18:13.347Z [DEBUG] consul-server-connection-manager: aborting: error="context canceled"
2024-04-13T02:18:14.726Z [INFO] Successfully applied traffic redirection rules
2024-04-13T02:18:14.925Z [INFO] Connect initialization completed
2024-04-13T02:18:14.925Z [INFO] consul-server-connection-manager: stopping
```
Current understanding and Expected behavior
My understanding is that Consul should automatically create and manage the TLS certs and distribute them to the Envoy pods. This should work without any intervention on my part beyond configuration.
Environment details
Values.yaml used is included above.
consul-k8s: v1.4.0
consul:
Kubernetes is DigitalOcean (DOKS) v1.29.1-do.0.
Cilium, CoreDNS, CSI, Hubble, and Konnectivity are installed and managed by DigitalOcean.
I have deployed Traefik as the ingress controller.
All infrastructure has been deployed using Terraform.
Additional Context
Router -> Initial Context ServiceIntention
```yaml
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceIntentions
metadata:
  name: initial-context
  namespace: my-namespace
spec:
  destination:
    name: initial-context
  sources:
    - name: router
      permissions:
        - action: allow
          http:
            methods:
              - GET
              - POST
            pathExact: /graph
```
@craigbehnke Your application is trying to connect directly to the pod IP instead of the cluster IP for the K8s service, `10.245.228.123`. The traffic from the downstream proxy is being routed through the `original-destination` cluster, which passes the connection directly through as a plain TCP connection without encryption. The upstream subsequently throws an error because it is expecting to receive TLS traffic, but instead is receiving an unencrypted connection from the downstream.
There are two potential solutions to this. One is to set `transparentProxy.dialedDirectly=true` in the `ServiceDefaults` config for the upstream service. This will allow downstream pods to access individual service instances by connecting to the pod IPs instead of the Service address.
@blake You are a lifesaver! Thank you very much!
As an aside, is it better to target the service IP instead of the pod IP? If so, what would I change to get that to work? (I thought that the `{svc-name}.service.consul` address would resolve to that service-level resource.)
@craigbehnke You can either look up the service using the Kubernetes Service DNS name, or the Consul virtual IP address using the `<name>.virtual.consul` hostname. Either lookup will return an IP that allows Envoy to correctly match the incoming connection to the correct upstream service and route it appropriately.
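For anyone landing here later, the `transparentProxy.dialedDirectly` option mentioned above could be expressed as a `ServiceDefaults` resource roughly like the following. This is a minimal sketch, not the configuration that was actually applied in this thread: the resource name and namespace are copied from the ServiceIntentions above, and `protocol: http` is an assumption based on that intention using HTTP permissions.

```yaml
# Sketch only: ServiceDefaults for the upstream (initial-context) service.
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceDefaults
metadata:
  name: initial-context        # assumed to match the upstream service name
  namespace: my-namespace      # taken from the ServiceIntentions above
spec:
  protocol: http               # assumption: needed because the intention uses http permissions
  transparentProxy:
    dialedDirectly: true       # allow downstreams to reach instances via pod IPs through the mesh
```

The other route is to leave `dialedDirectly` unset and have the router call something like `http://initial-context.virtual.consul/graph` (or the Kubernetes Service DNS name) instead of `initial-context.service.consul`. The `/graph` path comes from the original request URL; the exact hostname and port depend on how the Service is defined.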