vmware/cluster-api-provider-cloud-director

Cluster API Provider for VMware Cloud Director. The project is an open-source implementation of the Kubernetes Cluster API project and allows customers to provision resources directly from VMware Cloud Director. It enables Cloud Director powered clouds to be treated as yet another cloud in the multi-cloud journey for VMware Cloud Providers.
Apache License 2.0

`ProxyConfigSpec` also configures environment variables of the control plane components #299

Open bavarianbidi opened 2 years ago

bavarianbidi commented 2 years ago

Describe the bug

A ProxyConfigSpec set in the vcdcluster object does not only configure the proxy for containerd (as documented); it also configures the pod manifest files of the control-plane components and injects the proxy variables there.

Because the variables HTTP_PROXY/HTTPS_PROXY and NO_PROXY are exported to the shell in which kubeadm is called (xref: cloud_init.tmpl), kubeadm discovers the proxy variables and injects them into the control-plane manifests.
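As a rough illustration only (this is a sketch, not the actual cloud_init.tmpl content; the proxy address and kubeadm invocation are placeholders), the generated cloud-init behaves roughly like this:

```yaml
# illustrative sketch of the exported proxy variables in the bootstrap shell
runcmd:
  - |
    export HTTP_PROXY="http://proxy.example.com:3128"
    export HTTPS_PROXY="http://proxy.example.com:3128"
    export NO_PROXY=""                        # empty when noProxy is not set
    # kubeadm inherits the proxy variables from this shell
    kubeadm init --config /run/kubeadm/kubeadm.yaml
```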

If noProxy isn't configured (not needed when only fetching images from public registries through a proxy), the Kubernetes control-plane components fail because they use the HTTPS_PROXY and HTTP_PROXY variables for every request while NO_PROXY is empty.
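For illustration (values are placeholders, not taken from the project), the generated static pod manifest then ends up with an env block roughly like the following, which routes every request through the proxy because nothing is excluded:

```yaml
# illustrative excerpt of /etc/kubernetes/manifests/kube-apiserver.yaml
spec:
  containers:
  - name: kube-apiserver
    env:
    - name: HTTP_PROXY
      value: http://proxy.example.com:3128
    - name: HTTPS_PROXY
      value: http://proxy.example.com:3128
    # NO_PROXY is empty, so webhook calls and cluster-internal traffic
    # are also sent through the proxy
```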

Reproduction steps

  1. Create a cluster where only httpProxy and httpsProxy are configured in vcdcluster.spec.proxyConfig (see the sketch below).
  2. If an in-cluster webhook is applied, the API server tries to reach this webhook via the proxy server.
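A minimal reproduction could look like this (a sketch: cluster name, API group, and proxy address are assumptions, using the field names from kubectl explain vcdcluster.spec.proxyconfig):

```yaml
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: VCDCluster
metadata:
  name: mycluster
spec:
  proxyConfig:
    httpProxy: http://proxy.example.com:3128
    httpsProxy: http://proxy.example.com:3128
    # noProxy intentionally left unset
```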

Expected behavior

According to the documentation, only containerd is configured for the proxy, so the control-plane components should work even with an unset NO_PROXY variable.

Additional context

bavarianbidi commented 2 years ago

One addition: during the initial definition of the proxy variables I didn't know the API server load balancer IP. That means I have to create the cluster first with the proxy variables (and hope that the proxy server itself is able to reach the Kubernetes API server), and only once the cluster has been created can I add the API server IP to the noProxy list. This behavior can lead to a chicken-and-egg problem.
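For reference, a noProxy value that works around this would have to contain at least the in-cluster targets plus the load balancer IP that is only known after creation. A sketch with illustrative values, assuming noProxy is a single comma-separated string:

```yaml
spec:
  proxyConfig:
    httpProxy: http://proxy.example.com:3128
    httpsProxy: http://proxy.example.com:3128
    # 10.0.0.100 stands for the API server load balancer IP,
    # which is only known after the cluster has been created
    noProxy: "localhost,127.0.0.1,10.96.0.0/12,.svc,.cluster.local,10.0.0.100"
```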

arunmk commented 2 years ago

@bavarianbidi do you mean that kubeadm should not use the environment variables and only containerd should use it?

bavarianbidi commented 2 years ago

> @bavarianbidi do you mean that kubeadm should not use the environment variables and only containerd should use it?

@arunmk that's what I expect from reading the documentation (kubectl explain vcdcluster.spec.proxyconfig).

bavarianbidi commented 2 years ago

And it can also become a chicken-and-egg problem: if the k8s API load balancer is created with a dynamic IP that I don't know during cluster creation, the noProxy value doesn't contain that IP. This can lead to kube-scheduler and kube-controller-manager connecting to the API server's load balancer IP via the configured proxy server, and if the proxy server isn't able to route traffic to that IP, scheduler and controller manager aren't able to start.

bavarianbidi commented 2 years ago

Just FYI: we have now removed the built-in proxy configuration (link to internal PR) and refer to a proxy config for containerd from a secret in the CAPI management cluster:

```yaml
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
[...]
spec:
  kubeadmConfigSpec:
    files:
      - path: /etc/systemd/system/containerd.service.d/99-http-proxy.conf
        permissions: "0600"
        contentFrom:
          secret:
            name: mycluster-cluster-values-containerd-proxy
            key: containerdProxy
    preKubeadmCommands:
    - systemctl daemon-reload
    - systemctl restart containerd
```
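The containerdProxy key of that secret then carries an ordinary systemd drop-in for containerd. As a sketch with placeholder values (the proxy address and NO_PROXY list are assumptions):

```yaml
apiVersion: v1
kind: Secret
metadata:
  # must live in the same namespace as the KubeadmControlPlane
  name: mycluster-cluster-values-containerd-proxy
stringData:
  containerdProxy: |
    [Service]
    Environment="HTTP_PROXY=http://proxy.example.com:3128"
    Environment="HTTPS_PROXY=http://proxy.example.com:3128"
    Environment="NO_PROXY=localhost,127.0.0.1,.svc,.cluster.local"
```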

@arunmk would you accept a PR that drops this built-in proxy configuration from the embedded cloud-init.conf.tmpl file? If so, would you also agree on removing the proxy config from vcdcluster.spec?

arunmk commented 1 year ago

@bavarianbidi apologies for the delay. Can you please publish a PR?

bavarianbidi commented 1 year ago

> @bavarianbidi apologies for the delay. Can you please publish a PR?

Hi @arunmk,

I've created draft PR #356. ATM only the logic inside the built-in cloud-init phase has been removed, and therefore ProxyConfigSpec becomes obsolete.

As the CAPVCD API is now v1beta1 and ProxyConfigSpec is part of vcdcluster.spec, we should introduce a new API version first. Is this also something you want me to do?