Calico kube controller image is still being pulled from internet when specified in system_images from different registry #292

Open vojtechmares opened 3 years ago

vojtechmares commented 3 years ago


I am running an RKE cluster in an air-gapped environment.

Images are provided from an internal registry (GitLab).

Every image works fine except calico_controllers: RKE tries to pull it from the public internet instead of using the image provided in system_images.

My cluster definition:

resource "rke_cluster" "dev" {
  cluster_name = "dev"
  ssh_key_path = "/root/.ssh/id_rsa"

  nodes {
    address           = "__REDACTED__"
    hostname_override = "master-1"
    user              = "root"
    role              = ["controlplane", "etcd"]
  }
  nodes {
    address           = "__REDACTED__"
    hostname_override = "master-2"
    user              = "root"
    role              = ["controlplane", "etcd"]
  }
  nodes {
    address           = "__REDACTED__"
    hostname_override = "master-3"
    user              = "root"
    role              = ["controlplane", "etcd"]
  }
  nodes {
    address           = "__REDACTED__"
    hostname_override = "worker-1"
    user              = "root"
    role              = ["worker"]
  }
  nodes {
    address           = "__REDACTED__"
    hostname_override = "worker-2"
    user              = "root"
    role              = ["worker"]
  }
  nodes {
    address           = "__REDACTED__"
    hostname_override = "worker-3"
    user              = "root"
    role              = ["worker"]
  }

  ingress {
    provider = "none"
  }

  network {
    plugin = "canal"
  }

  services {
    kube_api {
      audit_log {
        enabled = true

        configuration {
          path = "/var/log/kube-audit/audit-log.json"
        }
      }
    }
  }

  system_images {
    etcd                        = "__PRIVATE_REGISTRY__/rancher/coreos-etcd:v3.4.14-rancher1"
    alpine                      = "__PRIVATE_REGISTRY__/rancher/rke-tools:v0.1.72"
    nginx_proxy                 = "__PRIVATE_REGISTRY__/rancher/rke-tools:v0.1.72"
    cert_downloader             = "__PRIVATE_REGISTRY__/rancher/rke-tools:v0.1.72"
    kubernetes_services_sidecar = "__PRIVATE_REGISTRY__/rancher/rke-tools:v0.1.72"
    kube_dns                    = "__PRIVATE_REGISTRY__/rancher/k8s-dns-kube-dns:1.15.10"
    dnsmasq                     = "__PRIVATE_REGISTRY__/rancher/k8s-dns-dnsmasq-nanny:1.15.10"
    kube_dns_sidecar            = "__PRIVATE_REGISTRY__/rancher/k8s-dns-sidecar:1.15.10"
    kube_dns_autoscaler         = "__PRIVATE_REGISTRY__/rancher/cluster-proportional-autoscaler:1.8.1"
    coredns                     = "__PRIVATE_REGISTRY__/rancher/coredns-coredns:1.8.0"
    coredns_autoscaler          = "__PRIVATE_REGISTRY__/rancher/cluster-proportional-autoscaler:1.8.1"
    nodelocal                   = "__PRIVATE_REGISTRY__/rancher/k8s-dns-node-cache:1.15.13"
    kubernetes                  = "__PRIVATE_REGISTRY__/rancher/hyperkube:v1.20.4-rancher1"
    flannel                     = "__PRIVATE_REGISTRY__/rancher/coreos-flannel:v0.13.0-rancher1"
    flannel_cni                 = "__PRIVATE_REGISTRY__/rancher/flannel-cni:v0.3.0-rancher6"
    calico_node                 = "__PRIVATE_REGISTRY__/rancher/calico-node:v3.17.2"
    calico_cni                  = "__PRIVATE_REGISTRY__/rancher/calico-cni:v3.17.2"
    calico_controllers          = "__PRIVATE_REGISTRY__/rancher/calico-kube-controllers:v3.17.2"
    calico_ctl                  = "__PRIVATE_REGISTRY__/rancher/calico-ctl:v3.17.2"
    calico_flex_vol             = "__PRIVATE_REGISTRY__/rancher/calico-pod2daemon-flexvol:v3.17.2"
    canal_node                  = "__PRIVATE_REGISTRY__/rancher/calico-node:v3.17.2"
    canal_cni                   = "__PRIVATE_REGISTRY__/rancher/calico-cni:v3.17.2"
    canal_flannel               = "__PRIVATE_REGISTRY__/rancher/coreos-flannel:v0.13.0-rancher1"
    canal_flex_vol              = "__PRIVATE_REGISTRY__/rancher/calico-pod2daemon-flexvol:v3.17.2"
    weave_node                  = "__PRIVATE_REGISTRY__/weaveworks/weave-kube:2.8.1"
    weave_cni                   = "__PRIVATE_REGISTRY__/weaveworks/weave-npc:2.8.1"
    pod_infra_container         = "__PRIVATE_REGISTRY__/rancher/pause:3.2"
    ingress                     = "__PRIVATE_REGISTRY__/rancher/nginx-ingress-controller:nginx-0.43.0-rancher1"
    ingress_backend             = "__PRIVATE_REGISTRY__/rancher/nginx-ingress-controller-defaultbackend:1.5-rancher1"
    metrics_server              = "__PRIVATE_REGISTRY__/rancher/metrics-server:v0.4.1"
    aci_cni_deploy_container    = "__PRIVATE_REGISTRY__/noiro/cnideploy:"
    aci_host_container          = "__PRIVATE_REGISTRY__/noiro/aci-containers-host:"
    aci_opflex_container        = "__PRIVATE_REGISTRY__/noiro/opflex:"
    aci_mcast_container         = "__PRIVATE_REGISTRY__/noiro/opflex:"
    aci_controller_container    = "__PRIVATE_REGISTRY__/aci-containers-controller:"
  }
}



$ kubectl get po -n kube-system
NAME                                       READY   STATUS             RESTARTS   AGE
calico-kube-controllers-6c8ddcb6cd-nk6wf   0/1     ImagePullBackOff   0          76m
canal-8ht5t                                2/2     Running            0          76m
canal-cf24b                                2/2     Running            0          76m
canal-qhzrg                                2/2     Running            0          76m
canal-v22dk                                2/2     Running            0          76m
canal-vx8sj                                2/2     Running            0          76m
canal-xg648                                2/2     Running            0          76m
coredns-5ffc49c57b-5knv5                   1/1     Running            0          76m
coredns-5ffc49c57b-kvcll                   1/1     Running            0          76m
coredns-autoscaler-6c84d979b7-s5hpl        1/1     Running            0          76m
metrics-server-55b956b5cb-dhn2s            1/1     Running            0          76m
rke-coredns-addon-deploy-job-pzscl         0/1     Completed          0          76m
rke-metrics-addon-deploy-job-4f84q         0/1     Completed          0          76m
rke-network-plugin-deploy-job-cv7kz        0/1     Completed          0          76m

$ kubectl describe po -n kube-system calico-kube-controllers-6c8ddcb6cd-nk6wf
Name:                 calico-kube-controllers-6c8ddcb6cd-nk6wf
Namespace:            kube-system
Priority:             2000000000
Priority Class Name:  system-cluster-critical
Node:                 worker-3/
Start Time:           Thu, 08 Apr 2021 12:15:48 +0200
Labels:               k8s-app=calico-kube-controllers
Annotations:          cni.projectcalico.org/podIP:
Status:               Pending
Controlled By:  ReplicaSet/calico-kube-controllers-6c8ddcb6cd
    Container ID:
    Image:          rancher/calico-kube-controllers:v3.17.2
    Image ID:
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       ImagePullBackOff
    Ready:          False
    Restart Count:  0
    Readiness:      exec [/usr/bin/check-status -r] delay=0s timeout=1s period=10s #success=1 #failure=3
      DATASTORE_TYPE:       kubernetes
      /var/run/secrets/kubernetes.io/serviceaccount from calico-kube-controllers-token-jlqnr (ro)
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
    Type:        Secret (a volume populated by a Secret)
    SecretName:  calico-kube-controllers-token-jlqnr
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     :NoSchedule op=Exists
                 CriticalAddonsOnly op=Exists
  Type     Reason   Age                  From     Message
  ----     ------   ----                 ----     -------
  Warning  Failed   16m (x261 over 76m)  kubelet  Error: ImagePullBackOff
  Normal   BackOff  92s (x325 over 76m)  kubelet  Back-off pulling image "rancher/calico-kube-controllers:v3.17.2"

I don't see a typo in the image name or in the calico_controllers key.

The image is present in the Docker registry.

Why is RKE still pulling the image from the public internet?

vojtechmares commented 3 years ago

Current workaround via Makefile:

    kubectl set -n kube-system image deployment/calico-kube-controllers calico-kube-controllers=__PRIVATE_REGISTRY__/rancher/calico-kube-controllers:v3.17.2

rawmind0 commented 3 years ago

Hello @vojtechmares, which Terraform provider version are you using? Have you tried defining a default private_registry on your RKE cluster?
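
For reference, a default private registry on the rke_cluster resource might look like the sketch below. It assumes the provider's private_registries block with its url, user, password, and is_default arguments. The __REGISTRY_USER__ and __REGISTRY_PASSWORD__ values are hypothetical placeholders that do not appear in this issue, and it is not confirmed here that this setting fixes the calico_controllers pull.

```hcl
resource "rke_cluster" "dev" {
  cluster_name = "dev"
  ssh_key_path = "/root/.ssh/id_rsa"

  # Assumption: marking the internal registry as the default so that
  # images without an explicit registry prefix are pulled from it.
  private_registries {
    url        = "__PRIVATE_REGISTRY__"
    user       = "__REGISTRY_USER__"     # hypothetical placeholder
    password   = "__REGISTRY_PASSWORD__" # hypothetical placeholder
    is_default = true
  }

  # The nodes, ingress, network, services, and system_images blocks
  # from the original cluster definition stay unchanged here.
}
```

With is_default set, the bare image name seen in the pod description (rancher/calico-kube-controllers:v3.17.2) would be expected to resolve against the internal registry, but that behavior should be verified against the provider version in use.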