linkerd / linkerd2

Ultralight, security-first service mesh for Kubernetes. Main repo for Linkerd 2.x.
https://linkerd.io
Apache License 2.0

Add Support For Hybrid Windows/Linux Clusters #3298

Open cartyc opened 4 years ago

cartyc commented 4 years ago

Feature Request

Linkerd should support hybrid Kubernetes environments (Windows/Linux).

What problem are you trying to solve?

It would be great to be able to use Linkerd in hybrid cluster environments and have Windows deployments as part of the mesh.

How should the problem be solved?

It would be great if the Linkerd proxy worked in a Windows environment.

Any alternatives you've considered?

I have not tried anything else at the moment.

How would users interact with this feature?

I would imagine it would remain the same as the current experience. At least I do not expect a major change; maybe add a Windows flag to the installer to specify the environment.

wmorgan commented 4 years ago

@grampelberg anyone we can tag to give us the status of this effort?

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

adamcarter81 commented 4 years ago

I just wondered how this is progressing - I'm also looking to deploy linkerd on a mixed mode cluster. What is the status of getting this to work on Windows nodes?

grampelberg commented 4 years ago

@adamcarter81 Windows still doesn't support it. Hopefully 1H 2020. There's great work going on, just kinda blocked on the actual networking support.

k3daevin commented 3 years ago

What is the status of the hybrid cluster implementation? May I offer help?

beingamarnath commented 3 years ago

Any updates on the progress here? Or is it even planned? Now that Envoy support for windows is in GA, wondering what's the plan for Linkerd on this avenue.

RichiCoder1 commented 3 years ago

> Any updates on the progress here? Or is it even planned? Now that Envoy support for windows is in GA, wondering what's the plan for Linkerd on this avenue.

LinkerD doesn't use Envoy, so that's kind of moot

beingamarnath commented 3 years ago

> LinkerD doesn't use Envoy, so that's kind of moot

Yes, I know LinkerD uses linkerd proxy. I only brought up Envoy as a point of comparison for progress. 😝

Type1J commented 2 years ago

What needs to be done to add Windows support to the LinkerD proxy?

wmorgan commented 2 years ago

AIUI the issue is not running Linkerd2-proxy on Windows (which is not difficult) but the fact that Windows networking does not support the TCP redirecting that Linkerd's init-container requires. I heard a rumor that Microsoft was adding such support in 2021, but I haven't heard anything about it landing. We would need that support in order to make progress.

Note this is not Linkerd-specific. Any service mesh that uses iptables-style TCP redirecting has this same limitation on Windows today.
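For readers unfamiliar with the term, iptables-style TCP redirecting can be sketched roughly as below. This is a minimal illustration, not Linkerd's actual rule set: the chain names (`SKETCH_INBOUND`, `SKETCH_OUTBOUND`) are hypothetical, and the ports and UID are assumed defaults. The function only builds the command lines and never runs them.

```python
# Illustrative sketch only. Chain names are hypothetical; the ports and UID
# are assumptions about a typical sidecar proxy setup.
PROXY_INBOUND_PORT = 4143   # assumed port where the proxy accepts inbound traffic
PROXY_OUTBOUND_PORT = 4140  # assumed port where the proxy accepts outbound traffic
PROXY_UID = 2102            # assumed UID the proxy process runs as


def redirect_rules(inbound_port=PROXY_INBOUND_PORT,
                   outbound_port=PROXY_OUTBOUND_PORT,
                   proxy_uid=PROXY_UID):
    """Return iptables argument lists for the nat table; nothing is executed."""
    nat = ["iptables", "-t", "nat"]
    return [
        # Inbound: every TCP connection arriving at the pod goes to the proxy.
        nat + ["-N", "SKETCH_INBOUND"],
        nat + ["-A", "SKETCH_INBOUND", "-p", "tcp",
               "-j", "REDIRECT", "--to-ports", str(inbound_port)],
        nat + ["-A", "PREROUTING", "-j", "SKETCH_INBOUND"],
        # Outbound: connections opened by the application go to the proxy.
        nat + ["-N", "SKETCH_OUTBOUND"],
        # Skip traffic from the proxy itself, otherwise it would loop forever.
        nat + ["-A", "SKETCH_OUTBOUND", "-m", "owner",
               "--uid-owner", str(proxy_uid), "-j", "RETURN"],
        # Skip loopback traffic.
        nat + ["-A", "SKETCH_OUTBOUND", "-o", "lo", "-j", "RETURN"],
        nat + ["-A", "SKETCH_OUTBOUND", "-p", "tcp",
               "-j", "REDIRECT", "--to-ports", str(outbound_port)],
        nat + ["-A", "OUTPUT", "-j", "SKETCH_OUTBOUND"],
    ]


if __name__ == "__main__":
    for rule in redirect_rules():
        print(" ".join(rule))
```

The application never learns that its connections were redirected, which is why the mesh needs the kernel (or, on Windows, an equivalent networking layer) to perform the rewrite.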

Type1J commented 2 years ago

Does anybody know of a link to where the progress of Windows TCP redirecting may be?

TBBle commented 2 years ago

I'm not really sure I'm understanding this correctly, but is the https://github.com/Microsoft/ebpf-for-windows XDP support the right sort of direction to be looking for the needed features? I guess it doesn't currently have the necessary hooks though.

Type1J commented 2 years ago

> is the https://github.com/Microsoft/ebpf-for-windows XDP support the right sort of direction to be looking for the needed features?

Maybe? I'm not really sure how iptables is implemented in Linux, but Linux does use eBPF in the kernel.

For those who don't know, eBPF is a virtual machine (like the JVM, but more like WASM) targeted by a language like C or Rust that allows network traffic to be "filtered" or controlled in some way. It's a VM to allow less-than-kernel-trusted code to run in isolation performing network filtering tasks.

> I guess it doesn't currently have the necessary hooks though.

Does anybody know what would be needed here, or if this is even the right road to travel?

TBBle commented 2 years ago

While iptables isn't implemented using eBPF in Linux, you can implement iptables-equivalent functionality using eBPF in Linux, e.g. https://github.com/mbertrone/bpf-iptables. Cilium provides an eBPF-based replacement for kube-proxy; kube-proxy itself is implemented on Linux using iptables (or, in its legacy usermode, by binding the relevant socket and forwarding those packets).

The thing I'm not sure of is whether the eBPF in Windows is sufficient, i.e even if sufficient support exists in the Windows networking stack, is it exposed via eBPF on Windows yet? I assume the functionality needed is what's done by https://github.com/linkerd/linkerd2-proxy-init/blob/master/iptables/iptables.go.


I also noticed that kube-proxy supports a Windows Kernel feature "VFP" (it also has usermode support using netsh portproxy), but I'm assuming linkerd2-proxy actually needs more than this offers, since that's been in kube-proxy since 2017, so predates the comments about Windows lacking necessary features.

Or maybe VFP (and/or WFP) would be sufficient if they could operate on the relevant network flows, but the flows linkerd needs to redirect are not visible to these platforms as they are relatively internal compared to what kube-proxy manages.

The VFP/WFP API for containers is visible at https://github.com/microsoft/hcsshim/blob/master/hcn/hcnpolicy.go for reference.

Since kube-proxy and CNI both run on the host on Windows, I assume that same setup would be needed for linkerd, i.e. it's really linkerd-cni, not linkerd-proxy-init, that needs to be made to work on Windows, since we can't really spawn a new host-side process for every Pod.

vrapolinario commented 1 year ago

Hey folks, MSFT person here. I just tried the tutorial on a mixed AKS cluster (Windows and Linux) with a sample IIS app that is currently served by an nginx ingress, and the step-by-step fails when launching proxy-init. Here's the status:

linkerd-data-plane
------------------
√ data plane namespace exists
× data plane proxies are ready
    pod "iis-app-routing-58ff54f66b-lgdwm" status is Pending
    see https://linkerd.io/2.12/checks/#l5d-data-plane-ready for hints

Looking at the pods on the app namespace:

vinicius [ ~ ]$ kubectl get pods -n iissampleapp
NAME                                        READY   STATUS                  RESTARTS   AGE
iis-app-routing-58ff54f66b-lgdwm            0/2     Init:ImagePullBackOff   0          25m
iis-app-routing-5f85cb4b47-b9rn6            1/1     Running                 0          48m
keyvault-iis-app-routing-557c7bf745-87976   2/2     Running                 0          25m

Then, looking at the faulty pod:

Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  25m                  default-scheduler  Successfully assigned iissampleapp/iis-app-routing-58ff54f66b-lgdwm to akswspool000001
  Normal   Pulling    23m (x4 over 25m)    kubelet            Pulling image "cr.l5d.io/linkerd/proxy-init:v2.0.0"
  Warning  Failed     23m (x4 over 25m)    kubelet            Error: ErrImagePull
  Normal   BackOff    26s (x108 over 25m)  kubelet            Back-off pulling image "cr.l5d.io/linkerd/proxy-init:v2.0.0"

Any ideas on when this will be fixed for Windows? We have customers using AKS with Windows pods and ingress for HTTPS traffic who are looking at service mesh options, and LinkerD would be a nice fit.
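Until Windows nodes are supported, one way to avoid the failure above in a mixed cluster is to keep Windows workloads out of the mesh. The sketch below is an illustration under stated assumptions, not something proposed in this thread: the helper names are hypothetical, it assumes Windows workloads are pinned with a `kubernetes.io/os: windows` nodeSelector, and it relies on the `linkerd.io/inject` annotation to opt a pod template in or out of proxy injection.

```python
import copy

OS_LABEL = "kubernetes.io/os"            # well-known Kubernetes node label
INJECT_ANNOTATION = "linkerd.io/inject"  # annotation read by Linkerd's proxy injector


def targets_windows(pod_template: dict) -> bool:
    """Return True if the template's nodeSelector pins it to Windows nodes."""
    node_selector = (pod_template.get("spec") or {}).get("nodeSelector") or {}
    return node_selector.get(OS_LABEL) == "windows"


def with_injection_policy(pod_template: dict) -> dict:
    """Return a copy with injection disabled for Windows pods, enabled otherwise."""
    result = copy.deepcopy(pod_template)  # leave the caller's dict untouched
    metadata = result.get("metadata") or {}
    annotations = metadata.get("annotations") or {}
    annotations[INJECT_ANNOTATION] = (
        "disabled" if targets_windows(result) else "enabled"
    )
    metadata["annotations"] = annotations
    result["metadata"] = metadata
    return result


if __name__ == "__main__":
    iis = {"spec": {"nodeSelector": {OS_LABEL: "windows"}}}
    print(with_injection_policy(iis)["metadata"]["annotations"])
```

A template without a Windows nodeSelector is treated as a Linux workload here, which is only safe if every Windows workload in the cluster actually sets that selector.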

TBBle commented 1 year ago

As far as I knew a year ago, implementing linkerd (or any service mesh) on Windows depends on being able to inject things (container outgoing packet redirection like the nat table in Linux iptables) into the network stack that may not be exposed on Windows at this time.

Type1J commented 1 year ago

There are quite a few people wanting service meshes (LinkerD, Istio, or anything else), but the Windows container runtimes (and maybe Windows itself) haven't exposed a way to do the networking manipulation needed for a service mesh to work. Istio's ambient mode looks more promising since the proxy is node-wide, not per-Pod. In theory, one could pair a Windows node with a Linux node that proxies the traffic, but that modus operandi isn't yet supported. I'm hoping LinkerD, in an effort to compete with ambient, does something similar soon (reducing load on the nodes due to service mesh activity) and fixes this issue for Windows in the process.

TBBle commented 1 year ago

I'm not sure that ambient mode in particular would help. My understanding is that Windows does (or did) not expose the fundamental network operations needed to redirect traffic to the proxy. If the limitation were simply that we can't have the proxy in the same pod as the service, or that we can't do the network setup from inside a container, then a linkerd-cni implementation on Windows could solve this, as CNI already runs on the host (or in a Host Process pod, which is equivalent) on Windows.

See also https://linkerd.io/2022/12/28/service-mesh-2022-recap-ebpf-gateway-api/

AndreaPQ commented 1 year ago

See the "Redirection Policy Comparison" (iptables on Linux vs. HNS policy on Windows) at 07:40 in the following talk:

Service Mesh using Envoy on Windows - S. Nanopoulos, P. Balasubramanian, K. Subramanian, N Jackson https://www.youtube.com/watch?v=ggvaAbjx4jo

There is also hcnproxyctrl (Host Container Networking Proxy Controller), a high-level library and executable that allows users to program layer-4 proxy policies on Windows through the Host Networking Service (HNS). It is intended to be used as part of a service mesh to redirect all traffic in a given network compartment through a sidecar proxy. https://github.com/microsoft/hcnproxyctrl

wmorgan commented 1 year ago

Thanks @AndreaPQ! That's a great find.

AndreaPQ commented 11 months ago

> Thanks @AndreaPQ! That's a great find.

Is there a planned roadmap?

wmorgan commented 10 months ago

It is likely we are going to start looking at the effort involved in this sometime next year. Feel free to ping me on the Linkerd Slack for more details.

Kmdkca commented 5 months ago

@wmorgan any update on this?

wmorgan commented 3 months ago

We have done some initial groundwork for this feature. Support for Windows nodes continues to be on the roadmap for this year.