kubernetes / kops

Kubernetes Operations (kOps) - Production Grade k8s Installation, Upgrades and Management
https://kops.sigs.k8s.io/
Apache License 2.0
15.66k stars 4.61k forks

Protokube - error applying channel #16046

Open ScOut3R opened 8 months ago

ScOut3R commented 8 months ago

/kind bug

I am not sure if this is a bug or just an issue with my system. Any help or suggestion would be greatly appreciated.

1. What kops version are you running? The command kops version will display this information.

1.27.1

2. What Kubernetes version are you running? kubectl version will print the version if a cluster is running or provide the Kubernetes version specified as a kops flag.

1.27.7

3. What cloud provider are you using?

AWS

4. What commands did you run? What is the simplest way to reproduce this issue?

The issue might have started earlier, but I noticed it while upgrading the cluster to 1.27.7 from 1.26.9 and kops from 1.26.6 to 1.27.1.

5. What happened after the commands executed?

Protokube tries to load the bootstrap-channel.yaml from S3, but logs kube_boot.go:89] error applying channel "s3://k8skopsstatestoreb8ef09a2-bucket83908e77-mmtvj3p0tin3/dev.kops.playhq.cloud/addons/bootstrap-channel.yaml": error running channels: exit status 1
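
At this verbosity protokube only reports the exit status of the channels binary, so the underlying failure is hidden. A minimal way to surface it (assuming the default /opt/kops/bin install path, which the later logs confirm) is to run the same apply by hand on a control-plane node:

# Run the same command protokube loops on, with verbose output.
# Substitute your own state-store bucket and cluster name.
sudo /opt/kops/bin/channels apply channel \
  "s3://<state-store-bucket>/<cluster-name>/addons/bootstrap-channel.yaml" \
  --v=4 --yes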

6. What did you expect to happen?

I expected no error.

7. Please provide your cluster manifest. Execute kops get --name my.example.com -o yaml to display your cluster manifest. You may want to remove your cluster name and other sensitive information.

apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
spec:
  api:
    loadBalancer:
      type: Internal
      class: Network
      idleTimeoutSeconds: 60
  authorization:
    rbac: {}
  awsLoadBalancerController:
    enabled: true
  certManager:
    enabled: true
    managed: false
  channel: stable
  cloudConfig:
    manageStorageClasses: false
    awsEBSCSIDriver:
      enabled: true
      managed: false
  cloudProvider: aws
  clusterAutoscaler:
    enabled: true
    expander: least-waste
    balanceSimilarNodeGroups: true
    awsUseStaticInstanceList: false
    scaleDownUtilizationThreshold: "0.7"
    skipNodesWithLocalStorage: false
    skipNodesWithSystemPods: false
    newPodScaleUpDelay: 10s
    scaleDownDelayAfterAdd: 10m0s
    cpuRequest: "100m"
    memoryRequest: "300Mi"
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - instanceGroup: master-ap-southeast-2a
      name: a
      encryptedVolume: true
      volumeType: gp3
    - instanceGroup: master-ap-southeast-2b
      name: b
      encryptedVolume: true
      volumeType: gp3
    - instanceGroup: master-ap-southeast-2c
      name: c
      encryptedVolume: true
      volumeType: gp3
    memoryRequest: 2Gi
    name: main
    manager:
      env:
        - name: ETCD_LISTEN_METRICS_URLS
          value: http://0.0.0.0:8081
        - name: ETCD_METRICS
          value: basic
  - cpuRequest: 100m
    etcdMembers:
    - instanceGroup: master-ap-southeast-2a
      name: a
      encryptedVolume: true
      volumeType: gp3
    - instanceGroup: master-ap-southeast-2b
      name: b
      encryptedVolume: true
      volumeType: gp3
    - instanceGroup: master-ap-southeast-2c
      name: c
      encryptedVolume: true
      volumeType: gp3
    memoryRequest: 1Gi
    name: events
    manager:
      env:
        - name: ETCD_LISTEN_METRICS_URLS
          value: http://0.0.0.0:8082
        - name: ETCD_METRICS
          value: basic
  iam:
    allowContainerRegistry: true
    legacy: false
    useServiceAccountExternalPermissions: true
  kubelet:
    anonymousAuth: false
    authenticationTokenWebhook: true
    authorizationMode: Webhook
    enforceNodeAllocatable: "pods"
    featureGates:
      InTreePluginAWSUnregister: "true"
    imageGCHighThresholdPercent: 75
    imageGCLowThresholdPercent: 70
    kubeReserved:
        cpu: "200m"
        memory: "1Gi"
        ephemeral-storage: "1Gi"
    kubeReservedCgroup: "/kube-reserved"
    kubeletCgroups: "/kube-reserved"
    podPidsLimit: 1024
    runtimeCgroups: "/kube-reserved"
    systemReserved:
        cpu: "200m"
        memory: "1Gi"
        ephemeral-storage: "1Gi"
    systemReservedCgroup: "/system-reserved"
  kubeAPIServer:
    memoryRequest: 8Gi
    memoryLimit: 8Gi
    eventTTL: 6h0m0s
    featureGates:
      InTreePluginAWSUnregister: "true"
  kubeControllerManager:
    featureGates:
      InTreePluginAWSUnregister: "true"
  kubernetesVersion: 1.27.7
  kubeDNS:
    provider: CoreDNS
    nodeLocalDNS:
      enabled: true
      memoryRequest: 30Mi
      cpuRequest: 25m
  kubeProxy:
    proxyMode: ipvs
  kubeScheduler:
    featureGates:
      InTreePluginAWSUnregister: "true"
  metricsServer:
    enabled: true
    insecure: false
  networking:
    calico:
      encapsulationMode: vxlan
      mtu: 8951
      typhaReplicas: 3
  nodeProblemDetector:
    enabled: true
    memoryRequest: 32Mi
    cpuRequest: 10m
  nodeTerminationHandler:
    enabled: true
    enableSQSTerminationDraining: true
    managedASGTag: "aws-node-termination-handler/managed"
  # https://tools.ietf.org/html/rfc6598#section-7
  nonMasqueradeCIDR: 100.64.0.0/10
  podIdentityWebhook:
    enabled: true
  serviceAccountIssuerDiscovery:
    enableAWSOIDCProvider: true
  snapshotController:
    enabled: true

And the bootstrap-channel.yaml file:

kind: Addons
metadata:
  creationTimestamp: null
  name: bootstrap
spec:
  addons:
  - id: k8s-1.16
    manifest: kops-controller.addons.k8s.io/k8s-1.16.yaml
    manifestHash: ff67e831a385c35e288a7794a3a370264742d9f7c32a16c139acb6db52db1997
    name: kops-controller.addons.k8s.io
    needsRollingUpdate: control-plane
    selector:
      k8s-addon: kops-controller.addons.k8s.io
    version: 9.99.0
  - id: k8s-1.12
    manifest: coredns.addons.k8s.io/k8s-1.12.yaml
    manifestHash: 9e04bbb03f9e5e62e8a246a96f73e8aaa31bf1d3284346c335774df5f7c3495d
    name: coredns.addons.k8s.io
    selector:
      k8s-addon: coredns.addons.k8s.io
    version: 9.99.0
  - id: k8s-1.9
    manifest: kubelet-api.rbac.addons.k8s.io/k8s-1.9.yaml
    manifestHash: 01c120e887bd98d82ef57983ad58a0b22bc85efb48108092a24c4b82e4c9ea81
    name: kubelet-api.rbac.addons.k8s.io
    selector:
      k8s-addon: kubelet-api.rbac.addons.k8s.io
    version: 9.99.0
  - manifest: limit-range.addons.k8s.io/v1.5.0.yaml
    manifestHash: 2d55c3bc5e354e84a3730a65b42f39aba630a59dc8d32b30859fcce3d3178bc2
    name: limit-range.addons.k8s.io
    selector:
      k8s-addon: limit-range.addons.k8s.io
    version: 9.99.0
  - id: k8s-1.12
    manifest: dns-controller.addons.k8s.io/k8s-1.12.yaml
    manifestHash: c2abd7a216533f237e4c425a8890abc6822fa773ec7c52b3fc434b3e394619c6
    name: dns-controller.addons.k8s.io
    selector:
      k8s-addon: dns-controller.addons.k8s.io
    version: 9.99.0
  - id: k8s-1.12
    manifest: nodelocaldns.addons.k8s.io/k8s-1.12.yaml
    manifestHash: 7af8058a8032e1f46dffeecd0f1539d628bf525cadb17a9e375c0c27176c441c
    name: nodelocaldns.addons.k8s.io
    needsRollingUpdate: all
    selector:
      k8s-addon: nodelocaldns.addons.k8s.io
    version: 9.99.0
  - id: k8s-1.15
    manifest: cluster-autoscaler.addons.k8s.io/k8s-1.15.yaml
    manifestHash: a447e25a6a3793b7d0205abcbe55a5410949f8152de7f2656376b0decc04998f
    name: cluster-autoscaler.addons.k8s.io
    selector:
      k8s-addon: cluster-autoscaler.addons.k8s.io
    version: 9.99.0
  - id: k8s-1.11
    manifest: metrics-server.addons.k8s.io/k8s-1.11.yaml
    manifestHash: ad25beb6dd7f245793d022efb0d142ff72744cd28c1ff0c17e064db4379c9382
    name: metrics-server.addons.k8s.io
    needsPKI: true
    selector:
      k8s-app: metrics-server
    version: 9.99.0
  - id: k8s-1.11
    manifest: node-termination-handler.aws/k8s-1.11.yaml
    manifestHash: 463f0bd528db7ddccd0b2c13ac4ae054af0017abb80728f77e2b50a47253c743
    name: node-termination-handler.aws
    prune:
      kinds:
      - kind: ConfigMap
        labelSelector: addon.kops.k8s.io/name=node-termination-handler.aws,app.kubernetes.io/managed-by=kops
      - kind: Service
        labelSelector: addon.kops.k8s.io/name=node-termination-handler.aws,app.kubernetes.io/managed-by=kops
      - kind: ServiceAccount
        labelSelector: addon.kops.k8s.io/name=node-termination-handler.aws,app.kubernetes.io/managed-by=kops
        namespaces:
        - kube-system
      - group: admissionregistration.k8s.io
        kind: MutatingWebhookConfiguration
        labelSelector: addon.kops.k8s.io/name=node-termination-handler.aws,app.kubernetes.io/managed-by=kops
      - group: admissionregistration.k8s.io
        kind: ValidatingWebhookConfiguration
        labelSelector: addon.kops.k8s.io/name=node-termination-handler.aws,app.kubernetes.io/managed-by=kops
      - group: apps
        kind: DaemonSet
        labelSelector: addon.kops.k8s.io/name=node-termination-handler.aws,app.kubernetes.io/managed-by=kops
      - group: apps
        kind: Deployment
        labelSelector: addon.kops.k8s.io/name=node-termination-handler.aws,app.kubernetes.io/managed-by=kops
        namespaces:
        - kube-system
      - group: apps
        kind: StatefulSet
        labelSelector: addon.kops.k8s.io/name=node-termination-handler.aws,app.kubernetes.io/managed-by=kops
      - group: policy
        kind: PodDisruptionBudget
        labelSelector: addon.kops.k8s.io/name=node-termination-handler.aws,app.kubernetes.io/managed-by=kops
        namespaces:
        - kube-system
      - group: rbac.authorization.k8s.io
        kind: ClusterRole
        labelSelector: addon.kops.k8s.io/name=node-termination-handler.aws,app.kubernetes.io/managed-by=kops
      - group: rbac.authorization.k8s.io
        kind: ClusterRoleBinding
        labelSelector: addon.kops.k8s.io/name=node-termination-handler.aws,app.kubernetes.io/managed-by=kops
      - group: rbac.authorization.k8s.io
        kind: Role
        labelSelector: addon.kops.k8s.io/name=node-termination-handler.aws,app.kubernetes.io/managed-by=kops
      - group: rbac.authorization.k8s.io
        kind: RoleBinding
        labelSelector: addon.kops.k8s.io/name=node-termination-handler.aws,app.kubernetes.io/managed-by=kops
    selector:
      k8s-addon: node-termination-handler.aws
    version: 9.99.0
  - id: k8s-1.17
    manifest: node-problem-detector.addons.k8s.io/k8s-1.17.yaml
    manifestHash: baae1afe4f61e36b96e7d77bd76c9c33c7c0de281c21fea1d36949ca731dd533
    name: node-problem-detector.addons.k8s.io
    selector:
      k8s-addon: node-problem-detector.addons.k8s.io
    version: 9.99.0
  - id: k8s-1.19
    manifest: aws-load-balancer-controller.addons.k8s.io/k8s-1.19.yaml
    manifestHash: d81c123c3052593cf7e408b201be502f880eef6239bebf2fe5b99020e2d40a30
    name: aws-load-balancer-controller.addons.k8s.io
    needsPKI: true
    selector:
      k8s-addon: aws-load-balancer-controller.addons.k8s.io
    version: 9.99.0
  - id: k8s-1.16
    manifest: eks-pod-identity-webhook.addons.k8s.io/k8s-1.16.yaml
    manifestHash: fa5d3a18e82ae6b2c9adc33ca25bcd03525e14605643524ca20ffdb40c979e12
    name: eks-pod-identity-webhook.addons.k8s.io
    needsPKI: true
    selector:
      k8s-addon: eks-pod-identity-webhook.addons.k8s.io
    version: 9.99.0
  - id: k8s-1.25
    manifest: networking.projectcalico.org/k8s-1.25.yaml
    manifestHash: 1dc0924e64fe48eb95e584ab789f797e8a004fc87ccdd2aed5a2cdb64e72cde4
    name: networking.projectcalico.org
    prune:
      kinds:
      - kind: ConfigMap
        labelSelector: addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops
        namespaces:
        - kube-system
      - kind: Service
        labelSelector: addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops
        namespaces:
        - kube-system
      - kind: ServiceAccount
        labelSelector: addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops
        namespaces:
        - kube-system
      - group: admissionregistration.k8s.io
        kind: MutatingWebhookConfiguration
        labelSelector: addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops
      - group: admissionregistration.k8s.io
        kind: ValidatingWebhookConfiguration
        labelSelector: addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops
      - group: apps
        kind: DaemonSet
        labelSelector: addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops
        namespaces:
        - kube-system
      - group: apps
        kind: Deployment
        labelSelector: addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops
        namespaces:
        - kube-system
      - group: apps
        kind: StatefulSet
        labelSelector: addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops
      - group: policy
        kind: PodDisruptionBudget
        labelSelector: addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops
        namespaces:
        - kube-system
      - group: rbac.authorization.k8s.io
        kind: ClusterRole
        labelSelector: addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops
      - group: rbac.authorization.k8s.io
        kind: ClusterRoleBinding
        labelSelector: addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops
      - group: rbac.authorization.k8s.io
        kind: Role
        labelSelector: addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops
      - group: rbac.authorization.k8s.io
        kind: RoleBinding
        labelSelector: addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops
    selector:
      role.kubernetes.io/networking: "1"
    version: 9.99.0
  - id: k8s-1.18
    manifest: aws-cloud-controller.addons.k8s.io/k8s-1.18.yaml
    manifestHash: c2110c35e17e34803a65b5a86fcc766f3a1cb0e648e098a6e6fc98e56c5c7829
    name: aws-cloud-controller.addons.k8s.io
    selector:
      k8s-addon: aws-cloud-controller.addons.k8s.io
    version: 9.99.0
  - id: k8s-1.20
    manifest: snapshot-controller.addons.k8s.io/k8s-1.20.yaml
    manifestHash: ffd0583173236566c0315a3dcad5bb5ee23f43a0eedd83b7049be7dea4099110
    name: snapshot-controller.addons.k8s.io
    needsPKI: true
    selector:
      k8s-addon: snapshot-controller.addons.k8s.io
    version: 9.99.0
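
If I understand the mechanism correctly, channels compares the id and manifestHash entries above against per-addon addons.k8s.io/<name> annotations that it keeps on the kube-system namespace; dumping those annotations shows what the cluster currently believes is applied, e.g. for Calico:

# Show the addon state recorded on the kube-system namespace (Calico entry).
kubectl get namespace kube-system -o yaml | grep projectcalico
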
hakman commented 8 months ago

Hi @ScOut3R. Are there more detailed logs in protokube? That would help to understand what the actual apply error is. Thanks!

salavessa commented 8 months ago

Could it be related to #121437? I've been facing issues with kubernetes 1.27.7 because the kubectl binary seems to be unavailable from at least one CDN location.

hakman commented 8 months ago

Could it be related to #121437? I've been facing issues with kubernetes 1.27.7 because the kubectl binary seems to be unavailable from at least one CDN location.

Very unlikely.

ScOut3R commented 8 months ago

Hi @ScOut3R. Are there more detailed logs in protokube? That would help to understand what the actual apply error is. Thanks!

Hi @hakman, here's the snippet of what's looping on all three master nodes:

protokube[2171]: I1023 11:07:52.797209    2171 channels.go:31] checking channel: "s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml"
protokube[2171]: I1023 11:07:52.797254    2171 channels.go:45] Running command: /opt/kops/bin/channels apply channel s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml --v=4 --yes
protokube[2171]: I1023 11:07:57.285276    2171 channels.go:48] error running /opt/kops/bin/channels apply channel s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml --v=4 --yes:
protokube[2171]: I1023 11:07:57.285308    2171 channels.go:49] I1023 11:07:52.840974    2407 addons.go:38] Loading addons channel from "s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml"
protokube[2171]: I1023 11:07:52.899725    2407 s3context.go:211] found bucket in region "ap-southeast-2"
protokube[2171]: I1023 11:07:52.899750    2407 s3fs.go:320] Reading file "s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml"
protokube[2171]: I1023 11:07:53.034134    2407 channel_version.go:141] manifest Match for "node-termination-handler.aws": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.11 ManifestHash=463f0bd528db7ddccd0b2c13ac4ae054af0017abb80728f77e2b50a47253c743 SystemGeneration=1
protokube[2171]: I1023 11:07:53.034166    2407 channel_version.go:141] manifest Match for "cluster-autoscaler.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.15 ManifestHash=a447e25a6a3793b7d0205abcbe55a5410949f8152de7f2656376b0decc04998f SystemGeneration=1
protokube[2171]: I1023 11:07:53.034174    2407 channel_version.go:141] manifest Match for "coredns.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.12 ManifestHash=9e04bbb03f9e5e62e8a246a96f73e8aaa31bf1d3284346c335774df5f7c3495d SystemGeneration=1
protokube[2171]: I1023 11:07:53.034180    2407 channel_version.go:122] cluster has different ids for "networking.projectcalico.org" ("k8s-1.25" vs "k8s-1.22"); will replace
protokube[2171]: I1023 11:07:53.034187    2407 channel_version.go:141] manifest Match for "kops-controller.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.16 ManifestHash=ff67e831a385c35e288a7794a3a370264742d9f7c32a16c139acb6db52db1997 SystemGeneration=1
protokube[2171]: I1023 11:07:53.034192    2407 channel_version.go:141] manifest Match for "limit-range.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml ManifestHash=2d55c3bc5e354e84a3730a65b42f39aba630a59dc8d32b30859fcce3d3178bc2 SystemGeneration=1
protokube[2171]: I1023 11:07:53.041319    2407 channel_version.go:141] manifest Match for "aws-load-balancer-controller.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.19 ManifestHash=d81c123c3052593cf7e408b201be502f880eef6239bebf2fe5b99020e2d40a30 SystemGeneration=1
protokube[2171]: I1023 11:07:53.041349    2407 channel_version.go:141] manifest Match for "kubelet-api.rbac.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.9 ManifestHash=01c120e887bd98d82ef57983ad58a0b22bc85efb48108092a24c4b82e4c9ea81 SystemGeneration=1
protokube[2171]: I1023 11:07:53.041356    2407 channel_version.go:141] manifest Match for "nodelocaldns.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.12 ManifestHash=7af8058a8032e1f46dffeecd0f1539d628bf525cadb17a9e375c0c27176c441c SystemGeneration=1
protokube[2171]: I1023 11:07:53.041362    2407 channel_version.go:141] manifest Match for "node-problem-detector.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.17 ManifestHash=baae1afe4f61e36b96e7d77bd76c9c33c7c0de281c21fea1d36949ca731dd533 SystemGeneration=1
protokube[2171]: I1023 11:07:53.047488    2407 channel_version.go:141] manifest Match for "eks-pod-identity-webhook.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.16 ManifestHash=fa5d3a18e82ae6b2c9adc33ca25bcd03525e14605643524ca20ffdb40c979e12 SystemGeneration=1
protokube[2171]: I1023 11:07:53.047804    2407 channel_version.go:141] manifest Match for "dns-controller.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.12 ManifestHash=c2abd7a216533f237e4c425a8890abc6822fa773ec7c52b3fc434b3e394619c6 SystemGeneration=1
protokube[2171]: I1023 11:07:53.054369    2407 channel_version.go:141] manifest Match for "snapshot-controller.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.20 ManifestHash=ffd0583173236566c0315a3dcad5bb5ee23f43a0eedd83b7049be7dea4099110 SystemGeneration=1
protokube[2171]: I1023 11:07:53.054395    2407 channel_version.go:141] manifest Match for "aws-cloud-controller.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.18 ManifestHash=c2110c35e17e34803a65b5a86fcc766f3a1cb0e648e098a6e6fc98e56c5c7829 SystemGeneration=1
protokube[2171]: I1023 11:07:53.060264    2407 channel_version.go:141] manifest Match for "metrics-server.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.11 ManifestHash=ad25beb6dd7f245793d022efb0d142ff72744cd28c1ff0c17e064db4379c9382 SystemGeneration=1
protokube[2171]: NAME                                CURRENT                                                                        UPDATE                                                                        PKI
protokube[2171]: networking.projectcalico.org        484cbb72e961133d5866546f8ce5adb64a39e6a9dfe5e368f275bf8babf82f61        1dc0924e64fe48eb95e584ab789f797e8a004fc87ccdd2aed5a2cdb64e72cde4        no
protokube[2171]: I1023 11:07:53.060340    2407 channel_version.go:122] cluster has different ids for "networking.projectcalico.org" ("k8s-1.25" vs "k8s-1.22"); will replace
protokube[2171]: I1023 11:07:53.060355    2407 addon.go:192] Applying update from "s3://[bucket name]/[cluster name]/addons/networking.projectcalico.org/k8s-1.25.yaml"
protokube[2171]: I1023 11:07:53.060378    2407 s3fs.go:320] Reading file "s3://[bucket name]/[cluster name]/addons/networking.projectcalico.org/k8s-1.25.yaml"
protokube[2171]: I1023 11:07:53.118934    2407 cached_discovery.go:88] skipped caching discovery info, no resources found
protokube[2171]: I1023 11:07:53.478373    2407 health.go:61] status conditions not found for Service:kube-system/calico-typha
protokube[2171]: W1023 11:07:53.507989    2407 results.go:63] error from apply on apps/v1, Kind=DaemonSet kube-system/calico-node: error from apply: error patching object: DaemonSet.apps "calico-node" is invalid: spec.template.spec.initContainers[3].image: Required value
protokube[2171]: W1023 11:07:53.632497    2407 results.go:56] consistency error (healthy counts): &applyset.ApplyResults{total:30, applySuccessCount:29, applyFailCount:1, healthyCount:29, unhealthyCount:0}
protokube[2171]: I1023 11:07:53.632532    2407 prune.go:40] Prune spec: &{[{ ConfigMap [kube-system] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } { Service [kube-system] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } { ServiceAccount [kube-system] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {admissionregistration.k8s.io MutatingWebhookConfiguration [] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {admissionregistration.k8s.io ValidatingWebhookConfiguration [] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {apps DaemonSet [kube-system] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {apps Deployment [kube-system] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {apps StatefulSet [] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {policy PodDisruptionBudget [kube-system] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {rbac.authorization.k8s.io ClusterRole [] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {rbac.authorization.k8s.io ClusterRoleBinding [] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {rbac.authorization.k8s.io Role [] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {rbac.authorization.k8s.io RoleBinding [] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops }]}
protokube[2171]: I1023 11:07:53.656358    2407 prune.go:79] pruning objects of kind: ConfigMap
protokube[2171]: I1023 11:07:53.676388    2407 prune.go:79] pruning objects of kind: Service
protokube[2171]: I1023 11:07:53.726022    2407 prune.go:79] pruning objects of kind: ServiceAccount
protokube[2171]: I1023 11:07:53.776030    2407 prune.go:79] pruning objects of kind: MutatingWebhookConfiguration.admissionregistration.k8s.io
protokube[2171]: I1023 11:07:53.825859    2407 prune.go:79] pruning objects of kind: ValidatingWebhookConfiguration.admissionregistration.k8s.io
protokube[2171]: I1023 11:07:53.875951    2407 prune.go:79] pruning objects of kind: DaemonSet.apps
protokube[2171]: I1023 11:07:53.928895    2407 prune.go:79] pruning objects of kind: Deployment.apps
protokube[2171]: I1023 11:07:53.977481    2407 prune.go:79] pruning objects of kind: StatefulSet.apps
protokube[2171]: I1023 11:07:54.046801    2407 prune.go:79] pruning objects of kind: PodDisruptionBudget.policy
protokube[2171]: I1023 11:07:54.075601    2407 prune.go:79] pruning objects of kind: ClusterRole.rbac.authorization.k8s.io
protokube[2171]: I1023 11:07:54.130710    2407 prune.go:79] pruning objects of kind: ClusterRoleBinding.rbac.authorization.k8s.io
protokube[2171]: I1023 11:07:54.178866    2407 prune.go:79] pruning objects of kind: Role.rbac.authorization.k8s.io
protokube[2171]: I1023 11:07:54.236918    2407 prune.go:79] pruning objects of kind: RoleBinding.rbac.authorization.k8s.io
protokube[2171]: I1023 11:07:56.977569    2407 health.go:61] status conditions not found for Service:kube-system/calico-typha
protokube[2171]: W1023 11:07:57.095323    2407 results.go:63] error from apply on apps/v1, Kind=DaemonSet kube-system/calico-node: error from apply: error patching object: DaemonSet.apps "calico-node" is invalid: spec.template.spec.initContainers[3].image: Required value
protokube[2171]: W1023 11:07:57.281744    2407 results.go:56] consistency error (healthy counts): &applyset.ApplyResults{total:30, applySuccessCount:29, applyFailCount:1, healthyCount:29, unhealthyCount:0}
protokube[2171]: updating "networking.projectcalico.org": error updating addon from "s3://[bucket name]/[cluster name]/addons/networking.projectcalico.org/k8s-1.25.yaml": error applying update: not all objects were applied; error applying update after prune: not all objects were applied
protokube[2171]: I1023 11:07:57.285328    2171 channels.go:34] apply channel output was: I1023 11:07:52.840974    2407 addons.go:38] Loading addons channel from "s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml"
protokube[2171]: I1023 11:07:52.899725    2407 s3context.go:211] found bucket in region "ap-southeast-2"
protokube[2171]: I1023 11:07:52.899750    2407 s3fs.go:320] Reading file "s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml"
protokube[2171]: I1023 11:07:53.034134    2407 channel_version.go:141] manifest Match for "node-termination-handler.aws": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.11 ManifestHash=463f0bd528db7ddccd0b2c13ac4ae054af0017abb80728f77e2b50a47253c743 SystemGeneration=1
protokube[2171]: I1023 11:07:53.034166    2407 channel_version.go:141] manifest Match for "cluster-autoscaler.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.15 ManifestHash=a447e25a6a3793b7d0205abcbe55a5410949f8152de7f2656376b0decc04998f SystemGeneration=1
protokube[2171]: I1023 11:07:53.034174    2407 channel_version.go:141] manifest Match for "coredns.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.12 ManifestHash=9e04bbb03f9e5e62e8a246a96f73e8aaa31bf1d3284346c335774df5f7c3495d SystemGeneration=1
protokube[2171]: I1023 11:07:53.034180    2407 channel_version.go:122] cluster has different ids for "networking.projectcalico.org" ("k8s-1.25" vs "k8s-1.22"); will replace
protokube[2171]: I1023 11:07:53.034187    2407 channel_version.go:141] manifest Match for "kops-controller.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.16 ManifestHash=ff67e831a385c35e288a7794a3a370264742d9f7c32a16c139acb6db52db1997 SystemGeneration=1
protokube[2171]: I1023 11:07:53.034192    2407 channel_version.go:141] manifest Match for "limit-range.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml ManifestHash=2d55c3bc5e354e84a3730a65b42f39aba630a59dc8d32b30859fcce3d3178bc2 SystemGeneration=1
protokube[2171]: I1023 11:07:53.041319    2407 channel_version.go:141] manifest Match for "aws-load-balancer-controller.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.19 ManifestHash=d81c123c3052593cf7e408b201be502f880eef6239bebf2fe5b99020e2d40a30 SystemGeneration=1
protokube[2171]: I1023 11:07:53.041349    2407 channel_version.go:141] manifest Match for "kubelet-api.rbac.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.9 ManifestHash=01c120e887bd98d82ef57983ad58a0b22bc85efb48108092a24c4b82e4c9ea81 SystemGeneration=1
protokube[2171]: I1023 11:07:53.041356    2407 channel_version.go:141] manifest Match for "nodelocaldns.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.12 ManifestHash=7af8058a8032e1f46dffeecd0f1539d628bf525cadb17a9e375c0c27176c441c SystemGeneration=1
protokube[2171]: I1023 11:07:53.041362    2407 channel_version.go:141] manifest Match for "node-problem-detector.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.17 ManifestHash=baae1afe4f61e36b96e7d77bd76c9c33c7c0de281c21fea1d36949ca731dd533 SystemGeneration=1
protokube[2171]: I1023 11:07:53.047488    2407 channel_version.go:141] manifest Match for "eks-pod-identity-webhook.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.16 ManifestHash=fa5d3a18e82ae6b2c9adc33ca25bcd03525e14605643524ca20ffdb40c979e12 SystemGeneration=1
protokube[2171]: I1023 11:07:53.047804    2407 channel_version.go:141] manifest Match for "dns-controller.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.12 ManifestHash=c2abd7a216533f237e4c425a8890abc6822fa773ec7c52b3fc434b3e394619c6 SystemGeneration=1
protokube[2171]: I1023 11:07:53.054369    2407 channel_version.go:141] manifest Match for "snapshot-controller.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.20 ManifestHash=ffd0583173236566c0315a3dcad5bb5ee23f43a0eedd83b7049be7dea4099110 SystemGeneration=1
protokube[2171]: I1023 11:07:53.054395    2407 channel_version.go:141] manifest Match for "aws-cloud-controller.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.18 ManifestHash=c2110c35e17e34803a65b5a86fcc766f3a1cb0e648e098a6e6fc98e56c5c7829 SystemGeneration=1
protokube[2171]: I1023 11:07:53.060264    2407 channel_version.go:141] manifest Match for "metrics-server.addons.k8s.io": Channel=s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml Id=k8s-1.11 ManifestHash=ad25beb6dd7f245793d022efb0d142ff72744cd28c1ff0c17e064db4379c9382 SystemGeneration=1
protokube[2171]: NAME                                CURRENT                                                                        UPDATE                                                                        PKI
protokube[2171]: networking.projectcalico.org        484cbb72e961133d5866546f8ce5adb64a39e6a9dfe5e368f275bf8babf82f61        1dc0924e64fe48eb95e584ab789f797e8a004fc87ccdd2aed5a2cdb64e72cde4        no
protokube[2171]: I1023 11:07:53.060340    2407 channel_version.go:122] cluster has different ids for "networking.projectcalico.org" ("k8s-1.25" vs "k8s-1.22"); will replace
protokube[2171]: I1023 11:07:53.060355    2407 addon.go:192] Applying update from "s3://[bucket name]/[cluster name]/addons/networking.projectcalico.org/k8s-1.25.yaml"
protokube[2171]: I1023 11:07:53.060378    2407 s3fs.go:320] Reading file "s3://[bucket name]/[cluster name]/addons/networking.projectcalico.org/k8s-1.25.yaml"
protokube[2171]: I1023 11:07:53.118934    2407 cached_discovery.go:88] skipped caching discovery info, no resources found
protokube[2171]: I1023 11:07:53.478373    2407 health.go:61] status conditions not found for Service:kube-system/calico-typha
protokube[2171]: W1023 11:07:53.507989    2407 results.go:63] error from apply on apps/v1, Kind=DaemonSet kube-system/calico-node: error from apply: error patching object: DaemonSet.apps "calico-node" is invalid: spec.template.spec.initContainers[3].image: Required value
protokube[2171]: W1023 11:07:53.632497    2407 results.go:56] consistency error (healthy counts): &applyset.ApplyResults{total:30, applySuccessCount:29, applyFailCount:1, healthyCount:29, unhealthyCount:0}
protokube[2171]: I1023 11:07:53.632532    2407 prune.go:40] Prune spec: &{[{ ConfigMap [kube-system] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } { Service [kube-system] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } { ServiceAccount [kube-system] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {admissionregistration.k8s.io MutatingWebhookConfiguration [] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {admissionregistration.k8s.io ValidatingWebhookConfiguration [] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {apps DaemonSet [kube-system] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {apps Deployment [kube-system] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {apps StatefulSet [] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {policy PodDisruptionBudget [kube-system] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {rbac.authorization.k8s.io ClusterRole [] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {rbac.authorization.k8s.io ClusterRoleBinding [] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {rbac.authorization.k8s.io Role [] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops } {rbac.authorization.k8s.io RoleBinding [] addon.kops.k8s.io/name=networking.projectcalico.org,app.kubernetes.io/managed-by=kops }]}
protokube[2171]: I1023 11:07:53.656358    2407 prune.go:79] pruning objects of kind: ConfigMap
protokube[2171]: I1023 11:07:53.676388    2407 prune.go:79] pruning objects of kind: Service
protokube[2171]: I1023 11:07:53.726022    2407 prune.go:79] pruning objects of kind: ServiceAccount
protokube[2171]: I1023 11:07:53.776030    2407 prune.go:79] pruning objects of kind: MutatingWebhookConfiguration.admissionregistration.k8s.io
protokube[2171]: I1023 11:07:53.825859    2407 prune.go:79] pruning objects of kind: ValidatingWebhookConfiguration.admissionregistration.k8s.io
protokube[2171]: I1023 11:07:53.875951    2407 prune.go:79] pruning objects of kind: DaemonSet.apps
protokube[2171]: I1023 11:07:53.928895    2407 prune.go:79] pruning objects of kind: Deployment.apps
protokube[2171]: I1023 11:07:53.977481    2407 prune.go:79] pruning objects of kind: StatefulSet.apps
protokube[2171]: I1023 11:07:54.046801    2407 prune.go:79] pruning objects of kind: PodDisruptionBudget.policy
protokube[2171]: I1023 11:07:54.075601    2407 prune.go:79] pruning objects of kind: ClusterRole.rbac.authorization.k8s.io
protokube[2171]: I1023 11:07:54.130710    2407 prune.go:79] pruning objects of kind: ClusterRoleBinding.rbac.authorization.k8s.io
protokube[2171]: I1023 11:07:54.178866    2407 prune.go:79] pruning objects of kind: Role.rbac.authorization.k8s.io
protokube[2171]: I1023 11:07:54.236918    2407 prune.go:79] pruning objects of kind: RoleBinding.rbac.authorization.k8s.io
protokube[2171]: I1023 11:07:56.977569    2407 health.go:61] status conditions not found for Service:kube-system/calico-typha
protokube[2171]: W1023 11:07:57.095323    2407 results.go:63] error from apply on apps/v1, Kind=DaemonSet kube-system/calico-node: error from apply: error patching object: DaemonSet.apps "calico-node" is invalid: spec.template.spec.initContainers[3].image: Required value
protokube[2171]: W1023 11:07:57.281744    2407 results.go:56] consistency error (healthy counts): &applyset.ApplyResults{total:30, applySuccessCount:29, applyFailCount:1, healthyCount:29, unhealthyCount:0}
protokube[2171]: updating "networking.projectcalico.org": error updating addon from "s3://[bucket name]/[cluster name]/addons/networking.projectcalico.org/k8s-1.25.yaml": error applying update: not all objects were applied; error applying update after prune: not all objects were applied
protokube[2171]: W1023 11:07:57.285340    2171 kube_boot.go:89] error applying channel "s3://[bucket name]/[cluster name]/addons/bootstrap-channel.yaml": error running channels: exit status 1
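
The apply that keeps failing is the calico-node DaemonSet patch (initContainers[3].image: Required value), so the channel loops forever. One possible way to break the loop, offered here only as an untested workaround and not an official kOps procedure, is to delete the DaemonSet object without deleting its pods and let the next protokube run recreate it from the new manifest:

# Remove only the DaemonSet object; --cascade=orphan leaves the running
# calico-node pods in place until the recreated DaemonSet adopts/rolls them.
kubectl -n kube-system delete daemonset calico-node --cascade=orphan
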
hakman commented 8 months ago

DaemonSet.apps "calico-node" is invalid: spec.template.spec.initContainers[3].image: Required value

@ScOut3R Could you check the Calico manifest to see which init container is generating this error?

ScOut3R commented 8 months ago

DaemonSet.apps "calico-node" is invalid: spec.template.spec.initContainers[3].image: Required value

@ScOut3R Could you check the Calico manifest to see which init container is generating this error?

This is weird, @hakman. Only the k8s-1.22.yaml manifest has 4 init containers for the calico-node DaemonSet. The latest, k8s-1.25.yaml, has only 3. Is it possible that protokube is still stuck on applying a previous calico configuration?
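
One way to check that theory is to list the init containers on the live object and compare them with the manifest; a quick sketch:

# Print the init container names currently on the live calico-node DaemonSet.
kubectl -n kube-system get daemonset calico-node \
  -o jsonpath='{range .spec.template.spec.initContainers[*]}{.name}{"\n"}{end}'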

Regardless, in k8s-1.22.yaml the 4th init container (at index [3]) seems to have an image property defined:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  creationTimestamp: null
  labels:
    addon.kops.k8s.io/name: networking.projectcalico.org
    app.kubernetes.io/managed-by: kops
    k8s-app: calico-node
    role.kubernetes.io/networking: "1"
  name: calico-node
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: calico-node
  template:
    metadata:
      creationTimestamp: null
      labels:
        k8s-app: calico-node
        kops.k8s.io/managed-by: kops
    spec:
      containers:
      - env:
        - name: DATASTORE_TYPE
          value: kubernetes
        - name: FELIX_TYPHAK8SSERVICENAME
          valueFrom:
            configMapKeyRef:
              key: typha_service_name
              name: calico-config
        - name: WAIT_FOR_DATASTORE
          value: "true"
        - name: NODENAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: CALICO_NETWORKING_BACKEND
          valueFrom:
            configMapKeyRef:
              key: calico_backend
              name: calico-config
        - name: CLUSTER_TYPE
          value: kops,bgp
        - name: IP
          value: autodetect
        - name: IP6
          value: none
        - name: IP_AUTODETECTION_METHOD
          value: first-found
        - name: IP6_AUTODETECTION_METHOD
          value: none
        - name: CALICO_IPV4POOL_IPIP
          value: Never
        - name: CALICO_IPV4POOL_VXLAN
          value: CrossSubnet
        - name: FELIX_IPINIPMTU
          valueFrom:
            configMapKeyRef:
              key: veth_mtu
              name: calico-config
        - name: FELIX_VXLANMTU
          valueFrom:
            configMapKeyRef:
              key: veth_mtu
              name: calico-config
        - name: FELIX_WIREGUARDMTU
          valueFrom:
            configMapKeyRef:
              key: veth_mtu
              name: calico-config
        - name: CALICO_IPV4POOL_CIDR
          value: 100.96.0.0/11
        - name: CALICO_DISABLE_FILE_LOGGING
          value: "true"
        - name: FELIX_DEFAULTENDPOINTTOHOSTACTION
          value: ACCEPT
        - name: FELIX_IPV6SUPPORT
          value: "false"
        - name: FELIX_HEALTHENABLED
          value: "true"
        - name: FELIX_AWSSRCDSTCHECK
          value: Disable
        - name: FELIX_BPFENABLED
          value: "false"
        - name: FELIX_BPFEXTERNALSERVICEMODE
          value: Tunnel
        - name: FELIX_BPFKUBEPROXYIPTABLESCLEANUPENABLED
          value: "false"
        - name: FELIX_BPFLOGLEVEL
          value: "Off"
        - name: FELIX_CHAININSERTMODE
          value: insert
        - name: FELIX_IPTABLESBACKEND
          value: Auto
        - name: FELIX_LOGSEVERITYSCREEN
          value: info
        - name: FELIX_PROMETHEUSMETRICSENABLED
          value: "false"
        - name: FELIX_PROMETHEUSMETRICSPORT
          value: "9091"
        - name: FELIX_PROMETHEUSGOMETRICSENABLED
          value: "false"
        - name: FELIX_PROMETHEUSPROCESSMETRICSENABLED
          value: "false"
        - name: FELIX_WIREGUARDENABLED
          value: "false"
        envFrom:
        - configMapRef:
            name: kubernetes-services-endpoint
            optional: true
        image: docker.io/calico/node:v3.23.5@sha256:b7f4f7a0ce463de5d294fdf2bb13f61035ec6e3e5ee05dd61dcc8e79bc29d934
        lifecycle:
          preStop:
            exec:
              command:
              - /bin/calico-node
              - -shutdown
        livenessProbe:
          exec:
            command:
            - /bin/calico-node
            - -felix-live
          failureThreshold: 6
          initialDelaySeconds: 10
          periodSeconds: 10
          timeoutSeconds: 10
        name: calico-node
        readinessProbe:
          exec:
            command:
            - /bin/calico-node
            - -felix-ready
          periodSeconds: 10
          timeoutSeconds: 10
        resources:
          requests:
            cpu: 100m
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /host/etc/cni/net.d
          name: cni-net-dir
          readOnly: false
        - mountPath: /lib/modules
          name: lib-modules
          readOnly: true
        - mountPath: /run/xtables.lock
          name: xtables-lock
          readOnly: false
        - mountPath: /var/run/calico
          name: var-run-calico
          readOnly: false
        - mountPath: /var/lib/calico
          name: var-lib-calico
          readOnly: false
        - mountPath: /var/run/nodeagent
          name: policysync
        - mountPath: /sys/fs/bpf
          name: bpffs
        - mountPath: /var/log/calico/cni
          name: cni-log-dir
          readOnly: true
      hostNetwork: true
      initContainers:
      - command:
        - /opt/cni/bin/calico-ipam
        - -upgrade
        env:
        - name: KUBERNETES_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: CALICO_NETWORKING_BACKEND
          valueFrom:
            configMapKeyRef:
              key: calico_backend
              name: calico-config
        envFrom:
        - configMapRef:
            name: kubernetes-services-endpoint
            optional: true
        image: docker.io/calico/cni:v3.23.5@sha256:7ca5c455cff6c0d661e33918d95a1133afb450411dbfb7e4369a9ecf5e0212dc
        name: upgrade-ipam
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /var/lib/cni/networks
          name: host-local-net-dir
        - mountPath: /host/opt/cni/bin
          name: cni-bin-dir
      - command:
        - /opt/cni/bin/install
        env:
        - name: CNI_CONF_NAME
          value: 10-calico.conflist
        - name: CNI_NETWORK_CONFIG
          valueFrom:
            configMapKeyRef:
              key: cni_network_config
              name: calico-config
        - name: KUBERNETES_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: CNI_MTU
          valueFrom:
            configMapKeyRef:
              key: veth_mtu
              name: calico-config
        - name: SLEEP
          value: "false"
        envFrom:
        - configMapRef:
            name: kubernetes-services-endpoint
            optional: true
        image: docker.io/calico/cni:v3.23.5@sha256:7ca5c455cff6c0d661e33918d95a1133afb450411dbfb7e4369a9ecf5e0212dc
        name: install-cni
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /host/opt/cni/bin
          name: cni-bin-dir
        - mountPath: /host/etc/cni/net.d
          name: cni-net-dir
      - command:
        - calico-node
        - -init
        - -best-effort
        image: docker.io/calico/node:v3.23.5@sha256:b7f4f7a0ce463de5d294fdf2bb13f61035ec6e3e5ee05dd61dcc8e79bc29d934
        name: mount-bpffs
        securityContext:
          privileged: true
        volumeMounts:
        - mountPath: /sys/fs
          mountPropagation: Bidirectional
          name: sys-fs
        - mountPath: /var/run/calico
          mountPropagation: Bidirectional
          name: var-run-calico
        - mountPath: /nodeproc
          name: nodeproc
          readOnly: true
      - command:
        - sh
        - -c
        - echo Temporary fix to avoid server side apply issues
        image: busybox@sha256:c118f538365369207c12e5794c3cbfb7b042d950af590ae6c287ede74f29b7d4
        name: flexvol-driver
      nodeSelector:
        kubernetes.io/os: linux
      priorityClassName: system-node-critical
      serviceAccountName: calico-node
      terminationGracePeriodSeconds: 0
      tolerations:
      - effect: NoSchedule
        operator: Exists
      - key: CriticalAddonsOnly
        operator: Exists
      - effect: NoExecute
        operator: Exists
      volumes:
      - hostPath:
          path: /lib/modules
        name: lib-modules
      - hostPath:
          path: /var/run/calico
        name: var-run-calico
      - hostPath:
          path: /var/lib/calico
        name: var-lib-calico
      - hostPath:
          path: /run/xtables.lock
          type: FileOrCreate
        name: xtables-lock
      - hostPath:
          path: /sys/fs/
          type: DirectoryOrCreate
        name: sys-fs
      - hostPath:
          path: /sys/fs/bpf
          type: Directory
        name: bpffs
      - hostPath:
          path: /proc
        name: nodeproc
      - hostPath:
          path: /opt/cni/bin
        name: cni-bin-dir
      - hostPath:
          path: /etc/cni/net.d
        name: cni-net-dir
      - hostPath:
          path: /var/log/calico/cni
        name: cni-log-dir
      - hostPath:
          path: /var/lib/cni/networks
        name: host-local-net-dir
      - hostPath:
          path: /var/run/nodeagent
          type: DirectoryOrCreate
        name: policysync
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
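
A plausible reading of the error, though it is an assumption rather than something confirmed in this thread, is that server-side apply merges the three-init-container spec from k8s-1.25.yaml into a live object whose managed fields still claim the fourth (flexvol-driver) entry, leaving a fourth element with no image. The field-ownership history that the merge runs against can be inspected with:

# Inspect managed-fields ownership on the live DaemonSet (kubectl >= 1.21).
kubectl -n kube-system get daemonset calico-node \
  --show-managed-fields -o yaml | less
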
ScOut3R commented 7 months ago

@hakman I don't want to celebrate prematurely, but it looks like updating the cluster's configuration using the latest kops release solved this problem. I have another cluster with the same symptoms; I will do the update sometime next week to confirm.
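
For reference, the sequence I'd expect for picking up the newer release's addon manifests is roughly the following (exact flags depend on the environment; treat it as a sketch):

# Re-render the cluster config with the newer kops binary, apply it, then
# roll the instances so all control-plane nodes run the updated addons.
kops update cluster --name <cluster-name> --yes
kops rolling-update cluster --name <cluster-name> --yes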

k8s-triage-robot commented 4 months ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

avdhoot commented 4 months ago

We are facing a similar issue with kops version 1.28.1. Also, k8s-1.25.yaml has only 3 init containers.


Mar 11 21:17:11 i-dfdsfjdkfdffda protokube[2811]: W0311 21:17:11.513567   17945 results.go:56] consistency error (healthy counts): &applyset.ApplyResults{total:27, applySuccessCount:26, applyFailCount:1, healthyCount:26, unhealthyCount:0}
Mar 11 21:17:11 i-dfdsfjdkfdffda protokube[2811]: updating "networking.projectcalico.org": error updating addon from "s3://foo-kops/eu-prod.kops.foo.bar/addons/networking.projectcalico.org/k8s-1.25.yaml": error applying update: not all objects were applied; error applying update after prune: not all objects were applied
hakman commented 4 months ago

@avdhoot Could you share the rest of the error? There should be something regarding what exactly failed. Thanks!

avdhoot commented 3 months ago

@hakman Hope this helps.

Mar 12 15:47:21 instanceid protokube[2728]: W0312 15:47:18.605822 3256739 results.go:63] error from apply on policy/v1, Kind=PodDisruptionBudget kube-system/calico-kube-controllers: failed to create managed-fields patch: cannot merge ManagedFieldsEntry apiVersion "policy/v1beta1" with apiVersion "policy/v1"
Mar 12 15:47:21 instanceid protokube[2728]: W0312 15:47:21.123748 3256739 results.go:63] error from apply on apps/v1, Kind=DaemonSet kube-system/calico-node: error from apply: error patching object: DaemonSet.apps "calico-node" is invalid: spec.template.spec.initContainers[3].image: Required value
Mar 12 15:47:21 instanceid protokube[2728]: W0312 15:47:21.210560 3256739 results.go:56] consistency error (healthy counts): &applyset.ApplyResults{total:27, applySuccessCount:25, applyFailCount:2, healthyCount:25, unhealthyCount:0}
Mar 12 15:47:21 instanceid protokube[2728]: updating "networking.projectcalico.org": error updating addon from "s3://foo/addons/networking.projectcalico.org/k8s-1.25.yaml": error applying update: not all objects were applied; error applying update after prune: not all objects were applied
Mar 12 15:47:21 instanceid protokube[2728]: I0312 15:47:21.212757    2728 channels.go:34] apply channel output was: I0312 15:47:17.524366 3256739 addons.go:38] Loading addons channel from "s3://foo/addons/bootstrap-channel.yaml"
Mar 12 15:47:21 instanceid protokube[2728]: I0312 15:47:17.594448 3256739 s3context.go:211] found bucket in region "us-west-2"
Mar 12 15:47:21 instanceid protokube[2728]: I0312 15:47:17.594468 3256739 s3fs.go:371] Reading file "s3://foo/addons/bootstrap-channel.yaml"
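
The first warning in that snippet is a managed-fields apiVersion mismatch on the calico-kube-controllers PodDisruptionBudget (policy/v1beta1 vs policy/v1). A workaround that is sometimes suggested for this class of error, offered here as an assumption rather than a verified fix, is to delete the stale PDB so the next channels run recreates it under policy/v1:

# Delete the PDB whose managedFields still reference policy/v1beta1; the
# next channel apply should recreate it with the current apiVersion.
kubectl -n kube-system delete poddisruptionbudget calico-kube-controllers
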
hakman commented 3 months ago

/remove-lifecycle stale

k8s-triage-robot commented 4 weeks ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale