hashicorp / consul

Consul is a distributed, highly available, and data center aware solution to connect and configure applications across dynamic, distributed infrastructure.
https://www.consul.io

connect: transparent proxy connection refused #12927

Open Poweranimal opened 2 years ago

Poweranimal commented 2 years ago

Overview of the Issue

Connection attempts from a Connect service with transparent proxy to another Connect proxy via the kube DNS name are refused. The connection attempts work if an upstream is explicitly defined and bound to localhost on the source.
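
The working explicit-upstream variant looks roughly like the sketch below; the `connect-service-upstreams` annotation and the local port 9090 are assumptions based on the consul-k8s docs, not values taken from this cluster.

```yaml
# Sketch: declare nginx as an explicit upstream bound to localhost:9090
# and dial that instead of the kube DNS name (the port 9090 is arbitrary).
metadata:
  annotations:
    consul.hashicorp.com/connect-inject: "true"
    consul.hashicorp.com/connect-service-upstreams: "nginx:9090"
spec:
  containers:
    - name: alpine
      image: docker.io/alpine:3.15.3
      command: ["/bin/ash", "-ec"]
      args:
        - 'while : ; do wget -qO- http://127.0.0.1:9090 || true ; sleep 10s ; done'
```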

Setup: a StatefulSet with 2 replicas and an nginx image, and a Deployment with 1 replica and an alpine image. The alpine pod tries to reach the nginx pods via the kube DNS name of the nginx Service (`nginx.default`).

Reproduction

  1. Set up Consul with Connect and transparent proxy enabled by default (see the example Helm values after this list).
  2. Deploy the k8s manifest provided below.
  3. Check the logs of the alpine Deployment. All `wget -qO- http://nginx.default:8000` requests fail with `Connection refused`.
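
For step 1, a minimal sketch of the Helm values assumed here (connect injection with transparent proxy enabled by default); the exact values are illustrative, not copied from the cluster:

```yaml
# values.yaml sketch for the consul Helm chart (illustrative only)
global:
  name: consul
connectInject:
  enabled: true
  transparentProxy:
    defaultEnabled: true   # make every injected pod use transparent proxy
controller:
  enabled: true            # reconciles CRDs such as ServiceIntentions
```
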
k8s manifest:

```yaml
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceIntentions
metadata:
  name: nginx
spec:
  destination:
    name: nginx
  sources:
    - action: allow
      name: alpine
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: ClusterIP
  ports:
    - name: http
      port: 8000
      targetPort: 8000
  selector:
    app.kubernetes.io/instance: nginx
    app.kubernetes.io/name: nginx
    component: nginx
---
apiVersion: v1
kind: Service
metadata:
  name: alpine
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    app.kubernetes.io/instance: alpine
    app.kubernetes.io/name: alpine
    component: alpine
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: nginx
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: alpine
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx
data:
  default.conf.template: |
    server {
      listen 8000;
      location / {
        return 200 'Hello world!';
        add_header Content-Type text/plain;
      }
    }
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: nginx
      app.kubernetes.io/name: nginx
      component: nginx
  serviceName: nginx
  replicas: 2
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      name: nginx
      annotations:
        consul.hashicorp.com/connect-inject: "true"
      labels:
        app.kubernetes.io/instance: nginx
        app.kubernetes.io/name: nginx
        component: nginx
    spec:
      serviceAccountName: nginx
      containers:
        - name: nginx
          image: docker.io/nginx:1.21.6
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8000
              name: http
              protocol: TCP
          volumeMounts:
            - mountPath: /etc/nginx/templates
              name: config
              readOnly: true
      volumes:
        - name: config
          configMap:
            name: nginx
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: alpine
spec:
  replicas: 1
  strategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app.kubernetes.io/instance: alpine
      app.kubernetes.io/name: alpine
      component: alpine
  template:
    metadata:
      name: alpine
      annotations:
        consul.hashicorp.com/connect-inject: "true"
      labels:
        app.kubernetes.io/instance: alpine
        app.kubernetes.io/name: alpine
        component: alpine
    spec:
      serviceAccountName: alpine
      containers:
        - name: alpine
          image: docker.io/alpine:3.15.3
          command:
            - /bin/ash
            - -ec
          args:
            - 'while : ; do { wget -qO- http://nginx.default:8000 && echo ""; } || true ; sleep 10s ; done'
```

Consul info for both Client and Server

Client info

```
output from client 'consul info' command here
```
Server info

```
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 0
    services = 0
build:
    prerelease =
    revision = e319d7ed
    version = 1.11.3
consul:
    acl = enabled
    bootstrap = false
    known_datacenters = 1
    leader = true
    leader_addr = 10.244.0.17:8300
    server = true
raft:
    applied_index = 14245
    commit_index = 14245
    fsm_pending = 0
    last_contact = 0
    last_log_index = 14245
    last_log_term = 2
    last_snapshot_index = 0
    last_snapshot_term = 0
    latest_configuration = [{Suffrage:Voter ID:22b187ac-0d61-30d8-fccf-411bead6917c Address:10.244.0.17:8300} {Suffrage:Voter ID:0ed9a191-693f-6e6f-8b94-d7155ce2cb4f Address:10.244.1.25:8300}]
    latest_configuration_index = 0
    num_peers = 1
    protocol_version = 3
    protocol_version_max = 3
    protocol_version_min = 0
    snapshot_version_max = 1
    snapshot_version_min = 0
    state = Leader
    term = 2
runtime:
    arch = amd64
    cpu_count = 18
    goroutines = 301
    max_procs = 18
    os = linux
    version = go1.17.5
serf_lan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 2
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 5
    members = 4
    query_queue = 0
    query_time = 1
serf_wan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 1
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 2
    members = 2
    query_queue = 0
    query_time = 1
```
Poweranimal commented 2 years ago

Apparently, transparent proxy mode only works if the k8s Service belonging to the Pod has a port mapping. Changing

```yaml
apiVersion: v1
kind: Service
metadata:
  name: alpine
spec:
  type: ClusterIP
  clusterIP: None
  selector:
    app.kubernetes.io/instance: alpine
    app.kubernetes.io/name: alpine
    component: alpine
```

to

```yaml
apiVersion: v1
kind: Service
metadata:
  name: alpine
spec:
  type: ClusterIP
  selector:
    app.kubernetes.io/instance: alpine
    app.kubernetes.io/name: alpine
    component: alpine
  ports:
    - port: 8000
      targetPort: 8000
      name: http
```

works.

However, I think one should not be forced to add a port mapping to the Service if the Pod doesn't expose a port.

FelipeEmerim commented 2 years ago

Did you enable dialedDirectly for that service? It seems you have to when using headless services.

https://www.consul.io/docs/connect/transparent-proxy#headless-services
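
As a sketch, enabling it via the ServiceDefaults CRD would look roughly like this (the target service name `nginx` and the values are illustrative, not taken from this thread):

```yaml
# Hypothetical ServiceDefaults marking the dialed service as "dialed directly",
# so transparent-proxy clients can reach its pod IPs (e.g. behind a headless Service).
apiVersion: consul.hashicorp.com/v1alpha1
kind: ServiceDefaults
metadata:
  name: nginx
spec:
  transparentProxy:
    dialedDirectly: true
```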

Edit: Since your problem is with outgoing requests, I don't think my suggestion applies.

Amier3 commented 2 years ago

Hey @Poweranimal

Are you still experiencing this issue? If not, did you figure out what was going on?

huikang commented 2 years ago

I think having a Service with a cluster IP and port is required to participate in the service mesh with transparent proxy. If you look at the log of the envoy_sidecar container of the alpine pod, you will see the following line once the cluster IP and port are added to the alpine Service:

```
[2022-06-15 04:19:20.490][1][info][upstream] [source/server/lds_api.cc:77] lds: add/update listener 'outbound_listener:127.0.0.1:15001'
```

This is the port at which the sidecar accepts outgoing traffic.

coconut30 commented 2 years ago

Same issue. Why can't headless services participate in the service mesh?