rancher / rke2

https://docs.rke2.io/
Apache License 2.0

[rke2-whereabouts] Cronjob failed #3603

Closed rmammadli closed 1 year ago

rmammadli commented 1 year ago

Environmental Info: RKE2 Version: v1.24.4+rke2r1

Node(s) CPU architecture, OS, and Version: Linux 5.4.0-132-generic 148-Ubuntu SMP Mon Oct 17 16:02:06 UTC 2022 x86_64 Linux

Cluster Configuration: 3 servers, 3 agents

Describe the bug: Deploying the whereabouts CNI plugin with multus results in a failed cronjob.

Steps To Reproduce: Deploy multus with whereabouts cni plugin.

HelmChartConfig

---
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-multus
  namespace: kube-system
spec:
  valuesContent: |-
    rke2-whereabouts:
      enabled: true

Expected behavior: Whereabouts deployment runs without error.

Actual behavior: The whereabouts cronjob fails with the following error: unable to start container process: exec: "/ip-reconciler": stat /ip-reconciler: no such file or directory

> kubectl get events -n kube-system --sort-by .metadata.creationTimestamp

Output:


4m19s       Normal    SuccessfulCreate       job/rke2-multus-rke2-whereabouts-27830225         Created pod: rke2-multus-rke2-whereabouts-27830225-wnjcq
4m19s       Normal    Scheduled              pod/rke2-multus-rke2-whereabouts-27830225-wnjcq   Successfully assigned kube-system/rke2-multus-rke2-whereabouts-27830225-wnjcq to workload-cluster-test-03-wrk01-v20221128-530f8657kltg
4m18s       Normal    Pulled                 pod/rke2-multus-rke2-whereabouts-27830225-wnjcq   Container image "ghcr.io/k8snetworkplumbingwg/whereabouts:latest-amd64" already present on machine
4m18s       Normal    Created                pod/rke2-multus-rke2-whereabouts-27830225-wnjcq   Created container rke2-whereabouts
4m17s       Warning   Failed                 pod/rke2-multus-rke2-whereabouts-27830225-wnjcq   Error: failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "/ip-reconciler": stat /ip-reconciler: no such file or directory: unknown
4m16s       Warning   BackOff                pod/rke2-multus-rke2-whereabouts-27830225-wnjcq   Back-off restarting failed container
4m16s       Normal    SuccessfulDelete       job/rke2-multus-rke2-whereabouts-27830225         Deleted pod: rke2-multus-rke2-whereabouts-27830225-wnjcq
4m15s       Warning   BackoffLimitExceeded   job/rke2-multus-rke2-whereabouts-27830225         Job has reached the specified backoff limit

thomasferrandiz commented 1 year ago

Hi, could you check the configuration used to deploy whereabouts? It looks like you are using image ghcr.io/k8snetworkplumbingwg/whereabouts:latest-amd64 instead of the default which is rancher/hardened-whereabouts:v0.5.3-build20221027.

The latest image was published 12 days ago and is not part of an official upstream release so it is not yet supported by rke2.
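One way to confirm which image the cronjob pods actually ran is to pull the quoted image references out of saved `kubectl get events` output. The following is a minimal Python sketch under stated assumptions: the events text is already available as a string, and the helper names `images_in_events` and `unexpected_images` are hypothetical, not part of rke2 or kubectl.

```python
import re

# Matches the quoted image reference in event messages such as
# 'Container image "..." already present' or 'Pulling image "..."'.
IMAGE_RE = re.compile(r'image "([^"]+)"')

def images_in_events(events_text, pod_substring="rke2-whereabouts"):
    """Collect image references from event lines that mention the given pods."""
    images = set()
    for line in events_text.splitlines():
        if pod_substring in line:
            images.update(IMAGE_RE.findall(line))
    return images

def unexpected_images(events_text, expected_image):
    """Return, sorted, every image that differs from the expected one."""
    return sorted(images_in_events(events_text) - {expected_image})

if __name__ == "__main__":
    # One event line copied from the report above.
    sample = (
        '4m18s  Normal  Pulled  pod/rke2-multus-rke2-whereabouts-27830225-wnjcq  '
        'Container image "ghcr.io/k8snetworkplumbingwg/whereabouts:latest-amd64" '
        'already present on machine'
    )
    default_image = "rancher/hardened-whereabouts:v0.5.3-build20221027"
    print(unexpected_images(sample, default_image))
    # prints ['ghcr.io/k8snetworkplumbingwg/whereabouts:latest-amd64']
```

An empty list would suggest that only the expected image was seen in the whereabouts events; it does not by itself prove the job succeeded.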

rmammadli commented 1 year ago

Hi @thomasferrandiz, thank you very much for your prompt reply! The cronjob also failed with the default image. I just deployed the CNI plugin using the default image from Rancher:

3m32s       Normal    AppliedManifest        addon/rke2-etcd-snapshot-extra-metadata           Applied manifest at "/var/lib/rancher/rke2/server/manifests/rancher/rke2-etcd-snapshot-extra-metadata.yaml"
3m32s       Normal    ApplyingManifest       addon/managed-chart-config                        Applying manifest at "/var/lib/rancher/rke2/server/manifests/rancher/managed-chart-config.yaml"
3m32s       Normal    AppliedManifest        addon/managed-chart-config                        Applied manifest at "/var/lib/rancher/rke2/server/manifests/rancher/managed-chart-config.yaml"
3m32s       Normal    ApplyingManifest       addon/rke2-etcd-snapshot-extra-metadata           Applying manifest at "/var/lib/rancher/rke2/server/manifests/rancher/rke2-etcd-snapshot-extra-metadata.yaml"
3m31s       Normal    SuccessfulCreate       job/helm-install-rke2-multus                      Created pod: helm-install-rke2-multus-nfzwx
3m30s       Normal    Scheduled              pod/helm-install-rke2-multus-nfzwx                Successfully assigned kube-system/helm-install-rke2-multus-nfzwx to workload-cluster-test-03-cp03-v20221128-355e1310lkbmn
3m20s       Normal    ApplyJob               helmchart/rke2-multus                             Applying HelmChart using Job kube-system/helm-install-rke2-multus
3m30s       Normal    Pulling                pod/helm-install-rke2-multus-nfzwx                Pulling image "rancher/klipper-helm:v0.7.3-build20220613"
3m26s       Normal    Started                pod/helm-install-rke2-multus-nfzwx                Started container helm
3m26s       Normal    Created                pod/helm-install-rke2-multus-nfzwx                Created container helm
3m26s       Normal    Pulled                 pod/helm-install-rke2-multus-nfzwx                Successfully pulled image "rancher/klipper-helm:v0.7.3-build20220613" in 4.10986931s
3m24s       Normal    Killing                pod/rke2-multus-rke2-whereabouts-gmtlj            Stopping container rke2-whereabouts
3m24s       Normal    SuccessfulDelete       daemonset/rke2-multus-rke2-whereabouts            Deleted pod: rke2-multus-rke2-whereabouts-gmtlj
3m20s       Normal    Completed              job/helm-install-rke2-multus                      Job completed
3m9s        Normal    Scheduled              pod/rke2-multus-rke2-whereabouts-27830260-bl7c5   Successfully assigned kube-system/rke2-multus-rke2-whereabouts-27830260-bl7c5 to workload-cluster-test-03-wrk01-v20221128-530f8657kltg
3m9s        Normal    AddedInterface         pod/rke2-multus-rke2-whereabouts-27830260-bl7c5   Add eth0 [10.42.3.72/32] from cilium
3m9s        Normal    SuccessfulCreate       job/rke2-multus-rke2-whereabouts-27830260         Created pod: rke2-multus-rke2-whereabouts-27830260-bl7c5
3m8s        Normal    Pulling                pod/rke2-multus-rke2-whereabouts-27830260-bl7c5   Pulling image "rancher/hardened-whereabouts:v0.5.3-build20220610"
3m2s        Normal    Started                pod/rke2-multus-rke2-whereabouts-27830260-bl7c5   Started container rke2-whereabouts
3m2s        Normal    Created                pod/rke2-multus-rke2-whereabouts-27830260-bl7c5   Created container rke2-whereabouts
3m4s        Normal    Pulled                 pod/rke2-multus-rke2-whereabouts-27830260-bl7c5   Successfully pulled image "rancher/hardened-whereabouts:v0.5.3-build20220610" in 3.958716825s
3m2s        Normal    Pulled                 pod/rke2-multus-rke2-whereabouts-27830260-bl7c5   Container image "rancher/hardened-whereabouts:v0.5.3-build20220610" already present on machine
3m          Warning   BackoffLimitExceeded   job/rke2-multus-rke2-whereabouts-27830260         Job has reached the specified backoff limit
3m          Normal    SuccessfulDelete       job/rke2-multus-rke2-whereabouts-27830260         Deleted pod: rke2-multus-rke2-whereabouts-27830260-bl7c5
2m59s       Warning   BackOff                pod/rke2-multus-rke2-whereabouts-27830260-bl7c5   Back-off restarting failed container
2m53s       Normal    SuccessfulCreate       daemonset/rke2-multus-rke2-whereabouts            Created pod: rke2-multus-rke2-whereabouts-wjmkl
2m53s       Normal    Scheduled              pod/rke2-multus-rke2-whereabouts-wjmkl            Successfully assigned kube-system/rke2-multus-rke2-whereabouts-wjmkl to workload-cluster-test-03-wrk02-v20221128-2f5ed147jnpp
2m53s       Normal    Pulling                pod/rke2-multus-rke2-whereabouts-wjmkl            Pulling image "rancher/hardened-whereabouts:v0.5.3-build20220610"
2m49s       Normal    Pulled                 pod/rke2-multus-rke2-whereabouts-wjmkl            Successfully pulled image "rancher/hardened-whereabouts:v0.5.3-build20220610" in 3.670404532s
2m49s       Normal    Started                pod/rke2-multus-rke2-whereabouts-wjmkl            Started container rke2-whereabouts
2m49s       Normal    Created                pod/rke2-multus-rke2-whereabouts-wjmkl            Created container rke2-whereabouts
2m48s       Normal    SuccessfulDelete       daemonset/rke2-multus-rke2-whereabouts            Deleted pod: rke2-multus-rke2-whereabouts-rjgzq
2m48s       Normal    Killing                pod/rke2-multus-rke2-whereabouts-rjgzq            Stopping container rke2-whereabouts
2m17s       Normal    Created                pod/rke2-multus-rke2-whereabouts-5qfqq            Created container rke2-whereabouts
2m17s       Normal    SuccessfulCreate       daemonset/rke2-multus-rke2-whereabouts            Created pod: rke2-multus-rke2-whereabouts-5qfqq
2m17s       Normal    Scheduled              pod/rke2-multus-rke2-whereabouts-5qfqq            Successfully assigned kube-system/rke2-multus-rke2-whereabouts-5qfqq to workload-cluster-test-03-wrk01-v20221128-530f8657kltg
2m17s       Normal    Pulled                 pod/rke2-multus-rke2-whereabouts-5qfqq            Container image "rancher/hardened-whereabouts:v0.5.3-build20220610" already present on machine
2m16s       Normal    SuccessfulDelete       daemonset/rke2-multus-rke2-whereabouts            Deleted pod: rke2-multus-rke2-whereabouts-vk75j
2m16s       Normal    Killing                pod/rke2-multus-rke2-whereabouts-vk75j            Stopping container rke2-whereabouts
2m16s       Normal    Started                pod/rke2-multus-rke2-whereabouts-5qfqq            Started container rke2-whereabouts
105s        Normal    Scheduled              pod/rke2-multus-rke2-whereabouts-nx8z9            Successfully assigned kube-system/rke2-multus-rke2-whereabouts-nx8z9 to workload-cluster-test-03-wrk03-v20221128-6344b004lg85
105s        Normal    Pulling                pod/rke2-multus-rke2-whereabouts-nx8z9            Pulling image "rancher/hardened-whereabouts:v0.5.3-build20220610"
105s        Normal    SuccessfulCreate       daemonset/rke2-multus-rke2-whereabouts            Created pod: rke2-multus-rke2-whereabouts-nx8z9
101s        Normal    Pulled                 pod/rke2-multus-rke2-whereabouts-nx8z9            Successfully pulled image "rancher/hardened-whereabouts:v0.5.3-build20220610" in 3.971709957s
101s        Normal    Created                pod/rke2-multus-rke2-whereabouts-nx8z9            Created container rke2-whereabouts
100s        Normal    Started                pod/rke2-multus-rke2-whereabouts-nx8z9            Started container rke2-whereabouts

rmammadli commented 1 year ago

Current status of the related cronjob (failed) and jobs:

NAME                                         SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
cronjob.batch/rke2-multus-rke2-whereabouts   */5 * * * *   False     0        2m14s           4d18h

NAME                                              COMPLETIONS   DURATION   AGE
job.batch/helm-install-rke2-cilium                1/1           14s        4d20h
job.batch/helm-install-rke2-coredns               1/1           7s         72d
job.batch/helm-install-rke2-metrics-server        1/1           6s         72d
job.batch/helm-install-rke2-multus                1/1           11s        17m
job.batch/rke2-multus-rke2-whereabouts-27830275   0/1           2m14s      2m14s

thomasferrandiz commented 1 year ago

I did a test deployment and the job is failing for me as well. I will need to check it further to find the issue.

rmammadli commented 1 year ago

thanks again @thomasferrandiz for your assistance 👍

rancher-max commented 1 year ago

Validated on release-1.25 branch commit 4f1c4763e37f55bfe4ed93a78edd6b754fd75d0a

Environment Details

Infrastructure

Cluster Configuration:

1 server

Config:

# /etc/rancher/rke2/config.yaml
write-kubeconfig-mode: 644
cni: multus,calico

Additional Files:

# /var/lib/rancher/rke2/server/manifests/where.yaml
---
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-multus
  namespace: kube-system
spec:
  valuesContent: |-
    rke2-whereabouts:
      enabled: true

Testing Steps

  1. Ensure files are in their appropriate locations, then install and start rke2
  2. Validate all pods are up and running and using the correct versions: kubectl get nodes,pods -A -o wide && helm ls -A
  3. Validate whereabouts cronjob is successful: kubectl describe cronjob -n kube-system rke2-multus-rke2-whereabouts
  4. Validate images are using bcibase: sudo /var/lib/rancher/rke2/bin/crictl -r unix:///run/k3s/containerd/containerd.sock images, then sudo /var/lib/rancher/rke2/bin/crictl -r unix:///run/k3s/containerd/containerd.sock inspecti <multus-image-id> and sudo /var/lib/rancher/rke2/bin/crictl -r unix:///run/k3s/containerd/containerd.sock inspecti <whereabouts-image-id>

Validation Results:

NAMESPACE        NAME                                                        READY   STATUS      RESTARTS   AGE   IP              NODE               NOMINATED NODE   READINESS GATES
calico-system    pod/calico-kube-controllers-77587d48b9-ntksn                1/1     Running     0          77m   10.42.35.131    ip-172-31-11-110   <none>           <none>
calico-system    pod/calico-node-mblsd                                       1/1     Running     0          77m   172.31.11.110   ip-172-31-11-110   <none>           <none>
calico-system    pod/calico-typha-65c6d77c4c-wlwdp                           1/1     Running     0          77m   172.31.11.110   ip-172-31-11-110   <none>           <none>
kube-system      pod/cloud-controller-manager-ip-172-31-11-110               1/1     Running     0          78m   172.31.11.110   ip-172-31-11-110   <none>           <none>
kube-system      pod/etcd-ip-172-31-11-110                                   1/1     Running     0          78m   172.31.11.110   ip-172-31-11-110   <none>           <none>
kube-system      pod/helm-install-rke2-calico-crd-7hxrm                      0/1     Completed   0          78m   172.31.11.110   ip-172-31-11-110   <none>           <none>
kube-system      pod/helm-install-rke2-calico-hxs92                          0/1     Completed   2          78m   172.31.11.110   ip-172-31-11-110   <none>           <none>
kube-system      pod/helm-install-rke2-coredns-grj49                         0/1     Completed   0          78m   172.31.11.110   ip-172-31-11-110   <none>           <none>
kube-system      pod/helm-install-rke2-ingress-nginx-9rnnf                   0/1     Completed   0          78m   10.42.35.130    ip-172-31-11-110   <none>           <none>
kube-system      pod/helm-install-rke2-metrics-server-rv2pr                  0/1     Completed   0          78m   10.42.35.132    ip-172-31-11-110   <none>           <none>
kube-system      pod/helm-install-rke2-multus-msw6m                          0/1     Completed   0          78m   172.31.11.110   ip-172-31-11-110   <none>           <none>
kube-system      pod/kube-apiserver-ip-172-31-11-110                         1/1     Running     0          78m   172.31.11.110   ip-172-31-11-110   <none>           <none>
kube-system      pod/kube-controller-manager-ip-172-31-11-110                1/1     Running     0          78m   172.31.11.110   ip-172-31-11-110   <none>           <none>
kube-system      pod/kube-proxy-ip-172-31-11-110                             1/1     Running     0          78m   172.31.11.110   ip-172-31-11-110   <none>           <none>
kube-system      pod/kube-scheduler-ip-172-31-11-110                         1/1     Running     0          78m   172.31.11.110   ip-172-31-11-110   <none>           <none>
kube-system      pod/rke2-coredns-rke2-coredns-776d5cfd89-8xdx7              1/1     Running     0          77m   10.42.35.133    ip-172-31-11-110   <none>           <none>
kube-system      pod/rke2-coredns-rke2-coredns-autoscaler-6f964d8b7b-qddkd   1/1     Running     0          77m   10.42.35.134    ip-172-31-11-110   <none>           <none>
kube-system      pod/rke2-ingress-nginx-controller-fnwbx                     1/1     Running     0          76m   10.42.35.137    ip-172-31-11-110   <none>           <none>
kube-system      pod/rke2-metrics-server-7d58bbc9c6-46gzb                    1/1     Running     0          76m   10.42.35.136    ip-172-31-11-110   <none>           <none>
kube-system      pod/rke2-multus-ds-zh84l                                    1/1     Running     0          77m   172.31.11.110   ip-172-31-11-110   <none>           <none>
kube-system      pod/rke2-multus-rke2-whereabouts-fsmgw                      1/1     Running     0          77m   172.31.11.110   ip-172-31-11-110   <none>           <none>
tigera-operator  pod/tigera-operator-d57f69d64-xjjpd                         1/1     Running     0          77m   172.31.11.110   ip-172-31-11-110   <none>           <none>

NAME                 NAMESPACE    REVISION  UPDATED                                  STATUS    CHART                                         APP VERSION
rke2-calico          kube-system  1         2022-12-07 23:44:54.658996333 +0000 UTC  deployed  rke2-calico-v3.24.501                         v3.24.5
rke2-calico-crd      kube-system  1         2022-12-07 23:44:35.85592984 +0000 UTC   deployed  rke2-calico-crd-v3.24.501
rke2-coredns         kube-system  1         2022-12-07 23:44:35.807382191 +0000 UTC  deployed  rke2-coredns-1.19.401                         1.9.3
rke2-ingress-nginx   kube-system  1         2022-12-07 23:45:35.014624009 +0000 UTC  deployed  rke2-ingress-nginx-4.1.007                    1.2.0
rke2-metrics-server  kube-system  1         2022-12-07 23:45:39.165355454 +0000 UTC  deployed  rke2-metrics-server-2.11.100-build2022101107  0.6.1
rke2-multus          kube-system  1         2022-12-07 23:44:35.921658877 +0000 UTC  deployed  rke2-multus-v3.9-build2022102805              3.9

- whereabouts cronjob is successful:

$ k describe cronjob -n kube-system rke2-multus-rke2-whereabouts
Name:                          rke2-multus-rke2-whereabouts
Namespace:                     kube-system
Labels:                        app=rke2-whereabouts
                               app.kubernetes.io/instance=rke2-multus
                               app.kubernetes.io/managed-by=Helm
                               app.kubernetes.io/name=rke2-whereabouts
                               app.kubernetes.io/version=0.5.3
                               helm.sh/chart=rke2-whereabouts-0.1.1
Annotations:                   meta.helm.sh/release-name: rke2-multus
                               meta.helm.sh/release-namespace: kube-system
Schedule:                      */5 * * * *
Concurrency Policy:            Forbid
Suspend:                       False
Successful Job History Limit:  0
Failed Job History Limit:      1
Starting Deadline Seconds:     <unset>
Selector:                      <unset>
Parallelism:                   <unset>
Completions:                   <unset>
Pod Template:
  Labels:           app=rke2-whereabouts
                    app.kubernetes.io/instance=rke2-multus
                    app.kubernetes.io/name=rke2-whereabouts
                    name=whereabouts
  Service Account:  rke2-multus-rke2-whereabouts
  Containers:
   rke2-whereabouts:
    Image:      rancher/hardened-whereabouts:v0.5.3-build20221027
    Port:       <none>
    Host Port:  <none>
    Command:
      /ip-reconciler
      -log-level=verbose
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:        100m
      memory:     50Mi
    Environment:  <none>
    Mounts:
      /host/etc/cni/net.d from cni-net-dir (rw)
  Volumes:
   cni-net-dir:
    Type:               HostPath (bare host directory volume)
    Path:               /etc/cni/net.d
    HostPathType:
  Priority Class Name:  system-node-critical
Last Schedule Time:     Thu, 08 Dec 2022 01:00:00 +0000
Active Jobs:            <none>
Events:
  Type    Reason            Age                  From                Message
  ----    ------            ----                 ----                -------
  Normal  SuccessfulCreate  59m                  cronjob-controller  Created job rke2-multus-rke2-whereabouts-27840965
  Normal  SawCompletedJob   59m                  cronjob-controller  Saw completed job: rke2-multus-rke2-whereabouts-27840965, status: Complete
  Normal  SuccessfulDelete  59m                  cronjob-controller  Deleted job rke2-multus-rke2-whereabouts-27840965
  Normal  SuccessfulCreate  54m                  cronjob-controller  Created job rke2-multus-rke2-whereabouts-27840970
  Normal  SawCompletedJob   54m                  cronjob-controller  Saw completed job: rke2-multus-rke2-whereabouts-27840970, status: Complete
  Normal  SuccessfulDelete  54m                  cronjob-controller  Deleted job rke2-multus-rke2-whereabouts-27840970
  Normal  SuccessfulCreate  49m                  cronjob-controller  Created job rke2-multus-rke2-whereabouts-27840975
  Normal  SawCompletedJob   49m                  cronjob-controller  Saw completed job: rke2-multus-rke2-whereabouts-27840975, status: Complete
  Normal  SuccessfulDelete  49m                  cronjob-controller  Deleted job rke2-multus-rke2-whereabouts-27840975
  Normal  SuccessfulCreate  44m                  cronjob-controller  Created job rke2-multus-rke2-whereabouts-27840980
  Normal  SawCompletedJob   44m                  cronjob-controller  Saw completed job: rke2-multus-rke2-whereabouts-27840980, status: Complete
  Normal  SuccessfulDelete  44m                  cronjob-controller  Deleted job rke2-multus-rke2-whereabouts-27840980
  Normal  MissingJob        44m                  cronjob-controller  Active job went missing: rke2-multus-rke2-whereabouts-27840980
  Normal  SuccessfulCreate  39m                  cronjob-controller  Created job rke2-multus-rke2-whereabouts-27840985
  Normal  SawCompletedJob   39m                  cronjob-controller  Saw completed job: rke2-multus-rke2-whereabouts-27840985, status: Complete
  Normal  SuccessfulDelete  39m                  cronjob-controller  Deleted job rke2-multus-rke2-whereabouts-27840985
  Normal  SuccessfulCreate  34m                  cronjob-controller  (combined from similar events): Created job rke2-multus-rke2-whereabouts-27840990
  Normal  SuccessfulDelete  34m                  cronjob-controller  (combined from similar events): Deleted job rke2-multus-rke2-whereabouts-27840990
  Normal  MissingJob        34m                  cronjob-controller  Active job went missing: rke2-multus-rke2-whereabouts-27840990
  Normal  SawCompletedJob   4m31s (x7 over 34m)  cronjob-controller  (combined from similar events): Saw completed job: rke2-multus-rke2-whereabouts-27841020, status: Complete

- Images are correctly using bcibase (previous images had no mention of sle-bci in their inspects):

$ sudo /var/lib/rancher/rke2/bin/crictl -r unix:///run/k3s/containerd/containerd.sock inspecti 6af0ddc997f11 | grep -i bci
"com.suse.eula": "sle-bci",
"com.suse.image-type": "sle-bci",
"com.suse.sle.base.eula": "sle-bci",
"com.suse.sle.base.image-type": "sle-bci",

$ sudo /var/lib/rancher/rke2/bin/crictl -r unix:///run/k3s/containerd/containerd.sock inspecti 8a06b7925250d | grep -i bci
"com.suse.eula": "sle-bci",
"com.suse.image-type": "sle-bci",
"com.suse.sle.base.eula": "sle-bci",
"com.suse.sle.base.image-type": "sle-bci",
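The `grep -i bci` check above could also be scripted against the JSON that `crictl inspecti` prints. This is a rough Python sketch, assuming the inspect output has been saved and parsed with `json.loads`; it makes no assumption about where the labels sit in the document and simply walks the whole structure. The function name `find_bci_values` and the nesting in the sample are hypothetical. Unlike grep, it only looks at string values, not at keys.

```python
import json

def find_bci_values(node, needle="bci"):
    """Recursively collect (key, value) pairs whose string value contains needle."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if isinstance(value, str):
                if needle in value.lower():
                    found.append((key, value))
            else:
                found.extend(find_bci_values(value, needle))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_bci_values(item, needle))
    return found

if __name__ == "__main__":
    # Illustrative document only: the wrapper keys are made up, the label
    # names and values are the ones shown in the grep output above.
    sample = json.loads(
        '{"example": {"labels": {"com.suse.eula": "sle-bci", '
        '"com.suse.image-type": "sle-bci"}}}'
    )
    for key, value in find_bci_values(sample):
        print(key, value)
```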

rmammadli commented 1 year ago

Hi @rancher-max,

Thank you for your efforts!

Unfortunately there are still some errors while using multus with cilium.

thomasferrandiz commented 1 year ago

Hi @rmammadli could you tell us more about the errors you're still seeing?

Please note that the fix was validated by the QA team on the release candidate branch, but it's not released yet.

rmammadli commented 1 year ago

Hi @thomasferrandiz, sure. I have just tried with the same release of multus and whereabouts as indicated in the logs and tested by @rancher-max.

...
7m44s       Normal    Scheduled              pod/rke2-multus-rke2-whereabouts-27850490-v4bsg   Successfully assigned kube-system/rke2-multus-rke2-whereabouts-27850490-v4bsg to workload-cluster-test-03-wrk03-v20221128-0bd991222f57
7m44s       Normal    AddedInterface         pod/rke2-multus-rke2-whereabouts-27850490-v4bsg   Add eth0 [10.42.8.18/32] from cilium
7m44s       Normal    SuccessfulCreate       job/rke2-multus-rke2-whereabouts-27850490         Created pod: rke2-multus-rke2-whereabouts-27850490-v4bsg
7m41s       Normal    Started                pod/rke2-multus-rke2-whereabouts-27850490-v4bsg   Started container rke2-whereabouts
7m41s       Normal    Created                pod/rke2-multus-rke2-whereabouts-27850490-v4bsg   Created container rke2-whereabouts
7m41s       Normal    Pulled                 pod/rke2-multus-rke2-whereabouts-27850490-v4bsg   Container image "rancher/hardened-whereabouts:v0.5.3-build20221027" already present on machine
7m38s       Warning   BackOff                pod/rke2-multus-rke2-whereabouts-27850490-v4bsg   Back-off restarting failed container
7m39s       Normal    SuccessfulDelete       job/rke2-multus-rke2-whereabouts-27850490         Deleted pod: rke2-multus-rke2-whereabouts-27850490-v4bsg
7m39s       Warning   BackoffLimitExceeded   job/rke2-multus-rke2-whereabouts-27850490         Job has reached the specified backoff limit
2m44s       Normal    AddedInterface         pod/rke2-multus-rke2-whereabouts-27850495-ctpsc   Add eth0 [10.42.8.139/32] from cilium
2m44s       Normal    Scheduled              pod/rke2-multus-rke2-whereabouts-27850495-ctpsc   Successfully assigned kube-system/rke2-multus-rke2-whereabouts-27850495-ctpsc to workload-cluster-test-03-wrk03-v20221128-0bd991222f57
2m44s       Normal    SuccessfulCreate       job/rke2-multus-rke2-whereabouts-27850495         Created pod: rke2-multus-rke2-whereabouts-27850495-ctpsc
2m41s       Normal    Created                pod/rke2-multus-rke2-whereabouts-27850495-ctpsc   Created container rke2-whereabouts
2m41s       Normal    Started                pod/rke2-multus-rke2-whereabouts-27850495-ctpsc   Started container rke2-whereabouts
2m41s       Normal    Pulled                 pod/rke2-multus-rke2-whereabouts-27850495-ctpsc   Container image "rancher/hardened-whereabouts:v0.5.3-build20221027" already present on machine
2m38s       Warning   BackOff                pod/rke2-multus-rke2-whereabouts-27850495-ctpsc   Back-off restarting failed container
2m39s       Normal    SuccessfulDelete       job/rke2-multus-rke2-whereabouts-27850495         Deleted pod: rke2-multus-rke2-whereabouts-27850495-ctpsc
2m39s       Warning   BackoffLimitExceeded   job/rke2-multus-rke2-whereabouts-27850495         Job has reached the specified backoff limit

kubectl describe cronjob -n kube-system rke2-multus-rke2-whereabouts

Name:                          rke2-multus-rke2-whereabouts
Namespace:                     kube-system
Labels:                        app=rke2-whereabouts
                               app.kubernetes.io/instance=rke2-multus
                               app.kubernetes.io/managed-by=Helm
                               app.kubernetes.io/name=rke2-whereabouts
                               app.kubernetes.io/version=0.5.2
                               helm.sh/chart=rke2-whereabouts-0.1.1
Annotations:                   meta.helm.sh/release-name: rke2-multus
                               meta.helm.sh/release-namespace: kube-system
Schedule:                      */5 * * * *
Concurrency Policy:            Forbid
Suspend:                       False
Successful Job History Limit:  0
Failed Job History Limit:      1
Starting Deadline Seconds:     <unset>
Selector:                      <unset>
Parallelism:                   <unset>
Completions:                   <unset>
Pod Template:
  Labels:           app=rke2-whereabouts
                    app.kubernetes.io/instance=rke2-multus
                    app.kubernetes.io/name=rke2-whereabouts
                    name=whereabouts
  Service Account:  rke2-multus-rke2-whereabouts
  Containers:
   rke2-whereabouts:
    Image:      rancher/hardened-whereabouts:v0.5.3-build20221027
    Port:       <none>
    Host Port:  <none>
    Command:
      /ip-reconciler
      -log-level=verbose
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:        100m
      memory:     50Mi
    Environment:  <none>
    Mounts:
      /host/etc/cni/net.d from cni-net-dir (rw)
  Volumes:
   cni-net-dir:
    Type:               HostPath (bare host directory volume)
    Path:               /etc/cni/net.d
    HostPathType:       
  Priority Class Name:  system-node-critical
Last Schedule Time:     Wed, 14 Dec 2022 15:00:00 +0000
Active Jobs:            <none>
Events:
  Type    Reason            Age                    From                Message
  ----    ------            ----                   ----                -------
  Normal  SuccessfulCreate  3m7s (x57 over 4h43m)  cronjob-controller  (combined from similar events): Created job rke2-multus-rke2-whereabouts-27850500

I have also tested whereabouts with the combination of multus and calico, and it works as expected.

rancher-max commented 1 year ago

So this fails specifically with multus and cilium for you?

rmammadli commented 1 year ago

exactly @rancher-max, it fails for multus with cilium.

thomasferrandiz commented 1 year ago

I just tested with the latest RC (v1.25.5-rc4+rke2r1) with multus + cilium and it seems to work:

kubectl describe cronjob -n kube-system rke2-multus-rke2-whereabouts

Name:                          rke2-multus-rke2-whereabouts
Namespace:                     kube-system
Labels:                        app=rke2-whereabouts
                               app.kubernetes.io/instance=rke2-multus
                               app.kubernetes.io/managed-by=Helm
                               app.kubernetes.io/name=rke2-whereabouts
                               app.kubernetes.io/version=0.5.3
                               helm.sh/chart=rke2-whereabouts-0.1.1
Annotations:                   meta.helm.sh/release-name: rke2-multus
                               meta.helm.sh/release-namespace: kube-system
Schedule:                      */5 * * * *
Concurrency Policy:            Forbid
Suspend:                       False
Successful Job History Limit:  0
Failed Job History Limit:      1
Starting Deadline Seconds:     <unset>
Selector:                      <unset>
Parallelism:                   <unset>
Completions:                   <unset>
Pod Template:
  Labels:           app=rke2-whereabouts
                    app.kubernetes.io/instance=rke2-multus
                    app.kubernetes.io/name=rke2-whereabouts
                    name=whereabouts
  Service Account:  rke2-multus-rke2-whereabouts
  Containers:
   rke2-whereabouts:
    Image:      rancher/hardened-whereabouts:v0.5.3-build20221027
    Port:       <none>
    Host Port:  <none>
    Command:
      /ip-reconciler
      -log-level=verbose
    Limits:
      cpu:     100m
      memory:  50Mi
    Requests:
      cpu:        100m
      memory:     50Mi
    Environment:  <none>
    Mounts:
      /host/etc/cni/net.d from cni-net-dir (rw)
  Volumes:
   cni-net-dir:
    Type:               HostPath (bare host directory volume)
    Path:               /etc/cni/net.d
    HostPathType:
  Priority Class Name:  system-node-critical
Last Schedule Time:     Thu, 15 Dec 2022 11:45:00 +0000
Active Jobs:            <none>
Events:
  Type    Reason            Age    From                Message
  ----    ------            ----   ----                -------
  Normal  SuccessfulCreate  14m    cronjob-controller  Created job rke2-multus-rke2-whereabouts-27851735
  Normal  SawCompletedJob   14m    cronjob-controller  Saw completed job: rke2-multus-rke2-whereabouts-27851735, status: Complete
  Normal  SuccessfulDelete  14m    cronjob-controller  Deleted job rke2-multus-rke2-whereabouts-27851735
  Normal  SuccessfulCreate  9m47s  cronjob-controller  Created job rke2-multus-rke2-whereabouts-27851740
  Normal  SawCompletedJob   9m42s  cronjob-controller  Saw completed job: rke2-multus-rke2-whereabouts-27851740, status: Complete
  Normal  SuccessfulDelete  9m42s  cronjob-controller  Deleted job rke2-multus-rke2-whereabouts-27851740
  Normal  SuccessfulCreate  4m47s  cronjob-controller  Created job rke2-multus-rke2-whereabouts-27851745
  Normal  SawCompletedJob   4m42s  cronjob-controller  Saw completed job: rke2-multus-rke2-whereabouts-27851745, status: Complete

@rmammadli I see in your log:

Labels:                        app=rke2-whereabouts
                               app.kubernetes.io/instance=rke2-multus
                               app.kubernetes.io/managed-by=Helm
                               app.kubernetes.io/name=rke2-whereabouts
                               app.kubernetes.io/version=0.5.2
                               helm.sh/chart=rke2-whereabouts-0.1.1

In my log the label is app.kubernetes.io/version=0.5.3, so I think you are probably using the latest image with the old chart, which was buggy.
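The same comparison can be made mechanically on saved `kubectl describe cronjob` output. Below is a minimal Python sketch, assuming the text follows the layout shown in this thread; the function names are hypothetical, and a mismatch between the `app.kubernetes.io/version` label and the image tag is only a hint that chart and image are out of step, not a definitive diagnosis.

```python
import re

def label_version(describe_text):
    """Return the value of the app.kubernetes.io/version label, if present."""
    match = re.search(r"app\.kubernetes\.io/version=(\S+)", describe_text)
    return match.group(1) if match else None

def image_version(describe_text):
    """Return the numeric version in the image tag, e.g. 0.5.3 from ':v0.5.3-build...'."""
    match = re.search(r"Image:\s+\S+:v?(\d+(?:\.\d+)*)", describe_text)
    return match.group(1) if match else None

def versions_match(describe_text):
    """True only when both versions were found and are equal."""
    label, image = label_version(describe_text), image_version(describe_text)
    return label is not None and label == image

if __name__ == "__main__":
    # Two lines taken from the failing cluster's describe output above.
    failing = (
        "                               app.kubernetes.io/version=0.5.2\n"
        "    Image:      rancher/hardened-whereabouts:v0.5.3-build20221027\n"
    )
    print(label_version(failing), image_version(failing), versions_match(failing))
    # prints: 0.5.2 0.5.3 False
```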

Could you try with the version of rke2 that I used? You can install it with this command:

curl -sfL https://get.rke2.io | INSTALL_RKE2_VERSION=v1.25.5-rc4+rke2r1 sh -

rmammadli commented 1 year ago

You are right, @thomasferrandiz: I used the old chart with the newest images, which is buggy. Thank you all again for your efforts!

thomasferrandiz commented 1 year ago

@rmammadli you're welcome!