squat / kilo

Kilo is a multi-cloud network overlay built on WireGuard and designed for Kubernetes (k8s + wg = kg)
https://kilo.squat.ai
Apache License 2.0

Services not exposed via Kilo VPN #28

Closed skurfuerst closed 4 years ago

skurfuerst commented 4 years ago

Hey,

first, thanks for this awesome project! I am just getting started with it, and I was able to get it up and running quite easily.

I only want to use it within a Rancher 2 / Canal (Flannel + Calico) setup, to access cluster-local services via a VPN from an external site.

What I did

I ran the image with the following options:

        image: squat/kilo
        args:
        - --hostname=$(NODE_NAME)
        - --cni=false
        - --encapsulation=never
        - --compatibility=flannel
        - --local=false

(as I am running in-cluster, I don't need the kubeconfig option).

Then, I added a new peer and extracted its config via kgctl.

Problem Description

-> Observation: in my case the allowed IPs look like: 10.4.0.1/32, 10.19.0.5/32, 10.19.0.6/32, 10.19.0.7/32, 10.42.0.0/24, 10.42.1.0/24, 10.42.2.0/24, 10.43.0.0/24

However, the services are in the 10.43 IP range; so they are not included in the allowed IPs.

Do you have any hints on how to debug further why the Service IP range is not included in the allowed IPs?

Thank you and all the best ❤️ Sebastian

squat commented 4 years ago

Hi Sebastian, I’m glad you got up and running easily, especially with a relatively advanced configuration.

When generating the local configuration for your peer via kgctl, the service IP CIDR is not included in the allowed IPs of any peer by design; this is simply because Kilo doesn’t have any concept of Kubernetes service ranges. Service IP ranges are not assigned to any node (unlike the Pod CIDRs), so there is no objective way to specify a range as an allowed IP of one specific peer. For example, if you have two Kilo nodes in your mesh, which one should have the 10.43.0.0/16 CIDR in its allowed IPs?

For this reason, only the node IPs and Pod IPs are included. The way around this is to:

  1. Generate the configuration for your peer, e.g.:
    kgctl showconf peer $PEER > peer.ini
  2. Then, choose one node in the mesh to be the gateway for the Service CIDR and manually add 10.43.0.0/16 to the allowed IPs of that node's [Peer] section in the peer.ini file.
  3. Next, set the configuration with wg.
  4. Once your WireGuard interface is up on your local host, you should ensure that the route is configured:
    ip route add 10.43/16 dev wg0
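
The manual edit in step 2 can also be scripted. The following is only a sketch, not part of Kilo or kgctl: the helper name add_allowed_ip, the gateway address 10.4.0.1/32 (taken from the allowed IPs listed earlier in this thread), the output file name, and the interface name wg0 are all assumptions for illustration.

```shell
#!/bin/sh
# Sketch only: helper name, gateway address and file names are illustrative assumptions.

# add_allowed_ip FILE GATEWAY_IP CIDR
# Print FILE with CIDR appended to every AllowedIPs line that already contains
# GATEWAY_IP (a plain substring match); all other lines are printed unchanged.
add_allowed_ip() {
    awk -v gateway="$2" -v cidr="$3" '
        /^AllowedIPs/ && index($0, gateway) > 0 { $0 = $0 ", " cidr }
        { print }
    ' "$1"
}

# Possible use, following the four steps above (commented out because these
# commands need a live cluster and root privileges):
#   kgctl showconf peer "$PEER" > peer.ini
#   add_allowed_ip peer.ini 10.4.0.1/32 10.43.0.0/16 > peer-services.ini
#   wg setconf wg0 peer-services.ini
#   ip route add 10.43.0.0/16 dev wg0
```

The helper writes to standard output instead of editing peer.ini in place, so the generated file stays available for comparison before the result is applied.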

Please let me know if that answers your question!

Thanks for the support!

-Lucas

PS I use the VPN feature of Kilo every day to securely access cluster services. I find it is particularly useful when paired with the external DNS project (https://github.com/kubernetes-sigs/external-dns) as it allows me to securely access my services by DNS name. I’ve also been using it recently when developing new projects to give the cluster access to a container on my laptop. For example, while developing an application that requires object storage, I run Minio on my laptop and it can be accessed by both a test service in my cluster and containers on my host, allowing me to iterate faster without pushing images all the time.

skurfuerst commented 4 years ago

Hey, thanks for the really quick reply :):)

Awesome - I will try it out really soon.

One quick question (preparing a docs pull request): why do you actually mount the kubeconfig from the host instead of relying on in-cluster service accounts? As far as I have tried and checked, simply removing the kubeconfig command argument works fine :)

All the best, Sebastian

squat commented 4 years ago

Hey Sebastian,

The reason for mounting the host's kubeconfig is that it uses an externally resolvable DNS name in the server field, which is necessary if we want to build a cluster over a WAN, whereas service accounts use the in-cluster Kubernetes service as the server endpoint. When the nodes are not already connected by a private network, the IP to which the in-cluster service resolves is not routable. For instance, I have one controller in AWS and one node in DigitalOcean; there is no private network between the two, so the Kubernetes service IP (eg 10.43.0.) is not routable from DigitalOcean. Because of this, Kilo cannot contact the API and configure WireGuard. On the other hand, if your cluster already has a private network, e.g. a cluster using Flannel that runs entirely in one cloud, and you only need Kilo for VPN access, then the Kubernetes service IP is already routable and you don't need the host's kubeconfig.
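
To make the two endpoints concrete, here is a rough sketch that prints each one so they can be compared. The function names are hypothetical and not part of Kilo or kubectl, and the first helper assumes a kubeconfig whose first server entry is the relevant one. KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT are the environment variables Kubernetes injects into pods.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers for illustration.

# kubeconfig_server FILE
# Print the first "server:" URL found in a kubeconfig file, i.e. the endpoint
# used when the host's kubeconfig is mounted.
kubeconfig_server() {
    awk '$1 == "server:" { print $2; exit }' "$1"
}

# in_cluster_server
# Print the API endpoint a pod reaches through its service account, built from
# the environment variables Kubernetes sets inside every container.
in_cluster_server() {
    printf 'https://%s:%s\n' "$KUBERNETES_SERVICE_HOST" "$KUBERNETES_SERVICE_PORT"
}
```

If the second address is a Service IP that the node cannot route to without the mesh, the kubeconfig endpoint is the one to use; if both are reachable, the kubeconfig argument can be dropped as described above.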

Hope that answers your question, Lucas

skurfuerst commented 4 years ago

Hey Lucas,

aaah - awesome - that explains it. Thanks again for the quick response. I'll try the service forwards ASAP; and when this all works out, I'll push some docs updates for you to review :+1:

All the best, Sebastian

squat commented 4 years ago

Hi @skurfuerst have you had any luck with the VPN access to service IPs?

skurfuerst commented 4 years ago

@squat it all worked and has been running in production for 2 days :) :heart: :heart: and I even got Calico NetworkPolicies to work in this combination.

See the docs update proposal in #30 - and feel free to adjust it in any way you like.

Thanks again for your work :)

All the best, Sebastian

squat commented 4 years ago

@skurfuerst awesome! Thanks for the update

benosman commented 4 years ago

@skurfuerst

I am looking to use Kilo in the same way as you. I'm using a Rancher RKE / Canal cluster, which should be very similar to a Rancher 2 one.

Could you share the manifest you are using to install kilo?

I've tried to adapt the examples, including your suggested options, but I get an error like the following as soon as I add a peer:

"component":"kilo","err":"failed to read CNI config list file: error reading /etc/cni/net.d/10-kilo.conflist: open /etc/cni/net.d/10-kilo.conflist

Edit: I used your manifest as suggested in #30, but I still get the same error, so it looks like there is something different about my setup. I've created a new issue: https://github.com/squat/kilo/issues/36

Thanks,

Ben

skurfuerst commented 4 years ago

Hey @benosman,

are you sure that --cni=false is specified when starting the Kilo pod? Just be sure that this is really applied :)

All the best, Sebastian