linkerd / linkerd2

Ultralight, security-first service mesh for Kubernetes. Main repo for Linkerd 2.x.
https://linkerd.io
Apache License 2.0

IPv6 semantics differ from Kubernetes without Linkerd #12733

Open howardjohn opened 3 months ago

howardjohn commented 3 months ago

What is the issue?

If a workload is created in a dual-stack cluster, with a dual-stack Service, but the pod does NOT listen on IPv6, all traffic will fail. Without Linkerd this generally works, since the client will happy-eyeballs between the two IP families (or maybe not every client; curl does).

With Linkerd, it appears that regardless of the IP family of the incoming request, the IPv6 pod IP will always be used.

How can it be reproduced?

apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo
spec:
  selector:
    matchLabels:
      app: echo
  template:
    metadata:
      labels:
        app: echo
    spec:
      securityContext:
        sysctls:
        - name: net.ipv4.ip_unprivileged_port_start
          value: "0"
      containers:
      - name: echo
        image: gcr.io/istio-testing/app:latest
        imagePullPolicy: IfNotPresent
        args:
        - --port=80
        # bind port 80 only to INSTANCE_IP (status.podIP, the pod's primary
        # IPv4 address here), so nothing listens on the pod's IPv6 address
        - --bind-ip=80
        env:
        - name: INSTANCE_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
---
apiVersion: v1
kind: Service
metadata:
  name: echo
spec:
  ipFamilies:
  - IPv6
  - IPv4
  ipFamilyPolicy: RequireDualStack
  selector:
    app: echo
  ports:
  - name: http
    port: 80

The example app above only binds to its IPv4 address.
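
For completeness, a rough sketch of the surrounding steps (file name, namespace, and client pod name are illustrative, not the exact commands I ran):

kubectl annotate namespace default linkerd.io/inject=enabled
kubectl apply -f echo.yaml    # the Deployment and Service above
kubectl run client --image=curlimages/curl --restart=Never --command -- sleep 1d
kubectl exec client -- curl -sv http://echo:80/    # fails once the pods are meshed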

Logs, error output, etc

The error is a 502 Bad Gateway. Sorry, I tore down the environment so I don't have the full log. linkerd diagnostics endpoints only shows a single IPv6 address.
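
For anyone re-reproducing this, the client-side proxy logs and the discovery output should show the relevant details (pod name matches the illustrative sketch above):

kubectl logs client -c linkerd-proxy | grep -iE 'error|fail'
linkerd diagnostics endpoints echo.default.svc.cluster.local:80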

output of linkerd check -o short

n/a

Environment

Kubernetes v1.30.0 (kind), Linkerd edge-24.6.1

Possible solution

I am not really sure what the right behavior is. Part of what led me down this path was exploring how we should behave in this scenario in Istio. I thought it might be helpful to bring it up in case this was unexpected, warrants some documentation, etc.

Additional context

No response

Would you like to work on fixing this bug?

no

alpeb commented 3 months ago

Thanks for the detailed testing @howardjohn. As a first approach to IPv6 support in Linkerd we don't do happy eyeballs: when there are both IPv4 and IPv6 EndpointSlices for a target, discovery simply gives precedence to the IPv6 one. Unfortunately kubelet probes don't make it easy to configure separate probes for the two families (it appears KEP-4559 might be addressing this), so there will be EndpointSlices with ready addresses for both families even if only the IPv4 one is actually ready. For now, we're assuming that if a service is declared as dual-stack then there are processes listening on both families. We definitely need to add these nuances to the upcoming IPv6 docs.
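
To make that concrete (service name from the example above; exact output will vary): a dual-stack Service gets one EndpointSlice per address family, and both inherit readiness from the same kubelet probe, so both report ready addresses even though only the IPv4 listener exists.

# typically shows two slices for the echo Service, ADDRESSTYPE IPv4 and IPv6,
# each listing the pod's address as ready
kubectl get endpointslices -l kubernetes.io/service-name=echo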

howardjohn commented 3 months ago

Oh one other complexity I forgot to mention... I didn't actually verify this, but I believe it will work this way:

Clients doing happy eyeballs are broken because Linkerd will accept any connection, and then close the connection or send an error if it cannot process it. But the happy eyeballs algorithm just checks whether the connection was established.

So relying on "use the IP family of the client" is not a great path, unfortunately.
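
One way to see this with the illustrative client pod from above: the TCP handshake succeeds for either family because the client's outbound proxy accepts the connection locally, and the failure only surfaces later (a 502 in my case), which is too late for happy eyeballs to fall back.

kubectl exec client -- curl -4 -sv http://echo:80/    # TCP connect succeeds, request still fails
kubectl exec client -- curl -6 -sv http://echo:80/    # same behavior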

alpeb commented 3 months ago

It seems you're right about the happy eyeballs behavior, according to my testing. And yeah, the client's IP family shouldn't be factored in except under very specific scenarios; we could surface that as an option in the future depending on user demand. In this first iteration of IPv6 support we're focusing on the basics, with no tweaks exposed. But this is great feedback :+1:

stale[bot] commented 1 week ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.