emissary-ingress / emissary

open source Kubernetes-native API gateway for microservices built on the Envoy Proxy
https://www.getambassador.io
Apache License 2.0

Ambassador Proxying Broken under Calico in Default Configuration (w/ Workaround) #3237

Closed dmayle closed 1 year ago

dmayle commented 3 years ago

Describe the bug

In the default configuration (following the quick start instructions), proxying is broken in a cluster configured with Calico and at least 2 nodes.

In the default configuration, the Ambassador service uses externalTrafficPolicy: Cluster which causes the load balancer to spread traffic across the nodes before kube-proxy routes it back to the node running ambassador.

When this traffic is received, half of it comes directly to the ambassador pod from the load balancer and is tagged as external traffic, and half of it comes via kube-proxy and is tagged as internal traffic, which causes problems for applications behind the proxy.
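
To see which situation a cluster is in, it can help to look at the Service's current policy and at where the ambassador pods are scheduled. This is a minimal sketch, assuming the Service is named ambassador in the ambassador namespace (as in the quick start install) and that kubectl points at the affected cluster:

```shell
# Assumption: namespace and Service are both named "ambassador",
# matching the quick start install.
NAMESPACE=ambassador
SERVICE=ambassador

# Only query the cluster when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  # Print the current policy: "Cluster" (the default) or "Local".
  kubectl -n "$NAMESPACE" get service "$SERVICE" \
    -o jsonpath='{.spec.externalTrafficPolicy}{"\n"}'
  # Show which node each pod in the namespace is running on.
  kubectl -n "$NAMESPACE" get pods -o wide
  # List the cluster nodes for comparison.
  kubectl get nodes
fi
```

If there are more nodes than nodes hosting an ambassador pod and the policy is Cluster, some requests will take the extra kube-proxy hop described above.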

Changing the externalTrafficPolicy of the ambassador service to Local makes everything work as expected.
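
The workaround can be applied with a one-line patch. This is a sketch, not an official fix; it assumes the Service is named ambassador in the ambassador namespace, as in the quick start install:

```shell
# Assumption: namespace and Service are both named "ambassador",
# matching the quick start install.
NAMESPACE=ambassador
SERVICE=ambassador
# Merge patch that switches the Service from the default "Cluster" to "Local".
PATCH='{"spec":{"externalTrafficPolicy":"Local"}}'

# Only patch when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl -n "$NAMESPACE" patch service "$SERVICE" -p "$PATCH"
fi
```

With Local, kube-proxy only delivers load balancer traffic to pods on the node that received it and does not forward it to other nodes, so every request reaches ambassador by the same path with the client source IP preserved.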

To Reproduce

Steps to reproduce the behavior (with mitmproxy for observing traffic):

  1. Provision a 2-node cluster that uses Calico for its networking layer (I performed this on Scaleway)
  2. Follow the directions in the quick start docs to install ambassador edge stack:
    kubectl apply -f https://www.getambassador.io/yaml/aes-crds.yaml && \
    kubectl wait --for condition=established --timeout=90s crd -lproduct=aes && \
    kubectl apply -f https://www.getambassador.io/yaml/aes.yaml && \
    kubectl -n ambassador wait --for condition=available --timeout=90s deploy -lproduct=aes
  3. Get the public IP address:
    kubectl get -n ambassador service ambassador -o "go-template={{range .status.loadBalancer.ingress}}{{or .ip .hostname}}{{end}}"
  4. Create an external DNS record that points pgadmin.example.org at the IP address above; it is used to access the internal service
  5. Using the web interface, request a certificate via Let's Encrypt for the host
  6. Add the following mapping for that host:
    apiVersion: getambassador.io/v2
    kind: Mapping
    metadata:
      name: pgadmin
    spec:
      host: pgadmin.example.org
      prefix: /
      service: mitmweb-proxy
  7. Add pgadmin with a mitmweb proxy via this manifest:
    apiVersion: v1
    kind: Service
    metadata:
      name: pgadmin4
    spec:
      ports:
      - name: http
        port: 80
        protocol: TCP
        targetPort: 5050
      selector:
        app: pgadmin4
      type: ClusterIP
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      labels:
        name: pgadmin4
      name: pgadmin4
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: pgadmin4
      template:
        metadata:
          labels:
            app: pgadmin4
        spec:
          containers:
          - name: pgadmin
            env:
            - name: PGADMIN_DEFAULT_EMAIL
              value: user@example.org
            - name: PGADMIN_DEFAULT_PASSWORD
              value: password
            - name: PGADMIN_LISTEN_PORT
              value: "5050"
            image: dpage/pgadmin4:4.30
            livenessProbe:
              initialDelaySeconds: 5
              periodSeconds: 5
              tcpSocket:
                port: 5050
            ports:
            - containerPort: 5050
              protocol: TCP
            volumeMounts:
            - mountPath: /var/lib/pgadmin
              name: pgadmin
              readOnly: false
            - mountPath: /run/httpd
              name: run
              readOnly: false
          securityContext:
            fsGroup: 5050
          volumes:
          - name: pgadmin
            emptyDir: {}
          - name: run
            emptyDir:
              medium: Memory
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: mitmweb
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: mitmweb
      template:
        metadata:
          labels:
            app: mitmweb
        spec:
          containers:
          - name: mitmweb
            image: mitmproxy/mitmproxy:6.0.2
            command: [mitmweb]
            args:
            - -m
            - reverse:http://pgadmin4
            - --set
            - keep_host_header=true
            - --no-web-open-browser
            - --web-host
            - 0.0.0.0
            - --web-port
            - "8081"
            - --listen-host
            - 0.0.0.0
            - --listen-port
            - "8080"
            ports:
            - containerPort: 8080
              protocol: TCP
            - containerPort: 8081
              protocol: TCP
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: mitmweb-proxy
    spec:
      ports:
      - name: http
        port: 80
        protocol: TCP
        targetPort: 8080
      selector:
        app: mitmweb
      type: ClusterIP
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: mitmweb-web
    spec:
      ports:
      - name: http
        port: 80
        protocol: TCP
        targetPort: 8081
      selector:
        app: mitmweb
      type: ClusterIP
  8. Set up port forwarding to connect to mitmweb:
    kubectl port-forward service/mitmweb-web 8080:80
  9. Wait 30 seconds to a minute for the services to be ready
  10. Open a web browser to http://localhost:8080
  11. Open a web browser in a new window to the DNS name from above (e.g. pgadmin.example.org)
  12. Try to log in with username user@example.org and password password.

Expected behavior

I expect to be able to log in to the pgadmin service, but instead one of two errors happens:

As soon as you update the externalTrafficPolicy to Local, login works.

cindymullins-dw commented 2 years ago

Can you determine whether this is specifically related to Calico, and whether it persists on 2.x?

cindymullins-dw commented 1 year ago

Feel free to reopen this issue if it persists on Ambassador 2.x or 3.x versions.