siderolabs / talos

Talos Linux is a modern Linux distribution built for Kubernetes.
https://www.talos.dev
Mozilla Public License 2.0

KubePrism does not append members discovered via the kubernetes registry #8143

Closed vaskozl closed 6 months ago

vaskozl commented 6 months ago

Bug Report

I decided to try out KubePrism today and enabled the Kubernetes registry, as the docs state that cluster discovery is used for the endpoints.

The KubePrism controller iterates over members and only appends them if the ControlPlane struct is not nil.

However, when using the Kubernetes discovery registry, only machineType is set, so controlPlane is always nil.

The controlPlane struct with the port is only set when using the discovery service (and not the kubernetes discovery registry).

At present the metadata in the node annotations does not specify the port so that would need to be added.
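To make the behaviour described above concrete, here is a minimal Go sketch. It is not the actual Talos controller code: `Member`, `ControlPlane`, `Endpoint`, `strictEndpoints`, `fallbackEndpoints`, and `defaultAPIServerPort` are hypothetical names used only for illustration. `strictEndpoints` mirrors the reported filter, and `fallbackEndpoints` shows one possible workaround that assumes port 6443 for control plane members with no port published.

```go
package main

// Hypothetical, simplified stand-ins for the discovery member data.
// These are NOT the real Talos types.
type MachineType string

const (
	TypeControlPlane MachineType = "controlplane"
	TypeWorker       MachineType = "worker"
)

type ControlPlane struct {
	APIServerPort int
}

type Member struct {
	Addresses    []string
	MachineType  MachineType
	ControlPlane *ControlPlane // nil when the registry does not publish it
}

type Endpoint struct {
	Host string
	Port int
}

// Assumption: the API server listens on 6443 when no port is published.
const defaultAPIServerPort = 6443

// strictEndpoints mirrors the reported behaviour: members without a
// ControlPlane struct are skipped, even if they are control plane nodes.
func strictEndpoints(members []Member) []Endpoint {
	var out []Endpoint
	for _, m := range members {
		if m.ControlPlane == nil {
			continue
		}
		for _, addr := range m.Addresses {
			out = append(out, Endpoint{Host: addr, Port: m.ControlPlane.APIServerPort})
		}
	}
	return out
}

// fallbackEndpoints is one possible workaround (an assumption, not the
// upstream fix): use the published port when present, otherwise fall back
// to the default port for members whose machine type is control plane.
func fallbackEndpoints(members []Member) []Endpoint {
	var out []Endpoint
	for _, m := range members {
		var port int
		switch {
		case m.ControlPlane != nil:
			port = m.ControlPlane.APIServerPort
		case m.MachineType == TypeControlPlane:
			port = defaultAPIServerPort
		default:
			continue // workers are never API server endpoints
		}
		for _, addr := range m.Addresses {
			out = append(out, Endpoint{Host: addr, Port: port})
		}
	}
	return out
}
```

With a member list where a control plane node has only machineType set, `strictEndpoints` drops it while `fallbackEndpoints` keeps it on the assumed default port. Publishing the real port in the node annotations, as suggested above, would avoid relying on that assumption.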

Environment

      discovery:
        enabled: true
        registries:
          kubernetes:
            disabled: false
          service:
            disabled: true
buroa commented 6 months ago

Confirmed, I am also seeing this.

JJGadgets commented 6 months ago

Can confirm, I see this too. Here is some output that seems to show it (IPs censored; TS = Tailscale on Talos):

❯ talosctl get kubeprismconfig
NODE     NAMESPACE   TYPE              ID                        VERSION   HOST        PORT   ENDPOINTS
cp1-IP   k8s         KubePrismConfig   k8s-loadbalancer-config   3         localhost   7445   [{"host":"vip.fqdn","port":6443},{"host":"localhost","port":6443},{"host":"cp1-IP","port":6443},{"host":"cp1-TS-IPv4","port":6443}]
cp2-IP   k8s         KubePrismConfig   k8s-loadbalancer-config   3         localhost   7445   [{"host":"vip.fqdn","port":6443},{"host":"localhost","port":6443},{"host":"cp2-IP","port":6443},{"host":"cp2-TS-IPv4","port":6443}]
cp3-IP   k8s         KubePrismConfig   k8s-loadbalancer-config   4         localhost   7445   [{"host":"vip.fqdn","port":6443},{"host":"localhost","port":6443},{"host":"cp3-IP","port":6443},{"host":"VIP-IP","port":6443},{"host":"cp3-TS-IPv4","port":6443}]
❯ talosctl get kubeprismendpoint
NODE     NAMESPACE   TYPE                ID            VERSION   HOSTS                                           PORTS
cp1-IP   k8s         KubePrismEndpoint   k8s-cluster   3         vip.fqdn localhost cp1-IP cp1-TS-IPv4           6443 6443 6443 6443 6443
cp2-IP   k8s         KubePrismEndpoint   k8s-cluster   3         vip.fqdn localhost cp2-IP cp2-TS-IPv4           6443 6443 6443 6443 6443
cp3-IP   k8s         KubePrismEndpoint   k8s-cluster   4         vip.fqdn localhost cp3-IP VIP-IP cp3-TS-IPv4    6443 6443 6443 6443 6443 6443
smira commented 6 months ago

Kubernetes service discovery itself requires Kubernetes API access via KubePrism, so it's a loop, and we never got it implemented in the Kubernetes registry. It can be done, but I wouldn't recommend using it; just use the discovery service.

vaskozl commented 6 months ago

I was wondering how a loop would be handled (would it remember the last discovered endpoints if your external lb failed?), but assumed it would work fine with the VIP.

I still see a benefit when the external cluster endpoint is set to the VIP: you can still benefit from the latency-based load balancing of the go-loadbalancer, right?

That way even if you lose internet you still share the load between active backends for the static hostnetwork pods.

smira commented 6 months ago

KubePrism uses the control plane endpoint (e.g. the VIP) anyway, but the problem is that if it becomes unavailable, no updates will come from discovery, so there is no way to learn other routes.
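One way to picture the "remember the last discovered endpoints" idea raised earlier in the thread is a small cache that keeps the last non-empty result. This is only a sketch under stated assumptions: `EndpointCache`, `Update`, and `Endpoint` are hypothetical names, and nothing here reflects how KubePrism actually stores its endpoints.

```go
package main

// Endpoint is a hypothetical host/port pair for an API server backend.
type Endpoint struct {
	Host string
	Port int
}

// EndpointCache remembers the most recent non-empty set of discovered
// endpoints so that a discovery outage does not wipe the backend list.
type EndpointCache struct {
	last []Endpoint
}

// Update records a new discovery result and returns the endpoints to use.
// An empty result (for example, discovery is unreachable) leaves the
// previously remembered endpoints in place.
func (c *EndpointCache) Update(discovered []Endpoint) []Endpoint {
	if len(discovered) > 0 {
		// Copy so later changes to the caller's slice do not affect the cache.
		c.last = append([]Endpoint(nil), discovered...)
	}
	// Return a copy so callers cannot mutate the cached slice.
	return append([]Endpoint(nil), c.last...)
}
```

The trade-off of this design is that stale endpoints are kept until discovery returns a fresh non-empty list, so the load balancer's own health checks would still be needed to avoid dead backends.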

vaskozl commented 6 months ago

My reasoning is:

In this way KubePrism effectively replaces a highly available external loadbalancer, and keeps working even if you are airgapped/lose internet.

By not supporting the k8s registry, when you don't have access to the public discovery service you lose balancing. E.g. gateway/router failure.

In other words it would be valuable to have kubeprism keep balancing traffic without external dependencies.

smira commented 6 months ago

Makes sense, it should be relatively easy to implement.