kubernetes / minikube

Run Kubernetes locally
https://minikube.sigs.k8s.io/
Apache License 2.0

Stuck on Container Checkpointing – Need Guidance #19686

Closed binaryBard97 closed 1 month ago

binaryBard97 commented 1 month ago

What Happened?

Actual behavior: After starting minikube with --extra-config=kubelet.feature-gates=ContainerCheckpoint=true, running sudo journalctl -u kubelet | grep "ContainerCheckpoint" returns nothing.

Expected behavior: ContainerCheckpoint is enabled in /var/lib/kubelet/config.yaml, and sudo journalctl -u kubelet | grep "ContainerCheckpoint" returns matching lines.

In addition, sudo journalctl -u kubelet shows an error:

Sep 22 15:33:27 minikube kubelet[1133]: E0922 15:33:27.815325 1133 run.go:72] "command failed" err="failed to load kubelet config file, path: /var/lib/kubelet/config.yaml, error: failed to load Kubelet config file /var/lib/kubelet/config.yaml, error failed to read kubelet config file \"/var/lib/kubelet/config.yaml\", error: open /var/lib/kubelet/config.yaml: no such file or directory"

However, inside the minikube cluster the config file is present in /var/lib/kubelet (checked from the root@minikube:/var/lib/kubelet# prompt), and I do not see my feature gate enabled in it.

Attach the log file

logs.txt kubelet-logs.txt

Operating System

macOS (Default)

Driver

Docker

binaryBard97 commented 1 month ago

/kind support

spowelljr commented 1 month ago

cri-o is failing to start:

time="2024-09-21T17:10:25Z" level=fatal msg="flag provided but not defined: -enable-criu-support"

I'm not sure where that flag is coming from. Could you delete the cluster (minikube delete) and then try starting it again? Could you also see if it works with a different CRI like containerd?

binaryBard97 commented 1 month ago

I've tried both approaches, and I've attached the log file for the second method. However, I still don't see my feature gate enabled in /var/lib/kubelet/config.yaml.

I also uninstalled kubectl and minikube, reinstalled them with brew, and updated Docker Desktop as well.

log.txt

spowelljr commented 1 month ago

The kubelet feature flags are stored in a drop-in config file located at /etc/systemd/system/kubelet.service.d/10-kubeadm.conf:

$ minikube start --extra-config=kubelet.feature-gates=ContainerCheckpoint=true
😄  minikube v1.34.0 on Darwin 14.7 (arm64)
✨  Automatically selected the docker driver. Other choices: qemu2, ssh, vfkit (experimental)
📌  Using Docker Desktop driver with root privileges
👍  Starting "minikube" primary control-plane node in "minikube" cluster
🚜  Pulling base image v0.0.45-1727108449-19696 ...
🔥  Creating docker container (CPUs=2, Memory=4000MB) ...
🐳  Preparing Kubernetes v1.31.1 on Docker 27.3.1 ...
    ▪ Generating certificates and keys ...
    ▪ Booting up control plane ...
    ▪ Configuring RBAC rules ...
🔗  Configuring bridge CNI (Container Networking Interface) ...
🔎  Verifying Kubernetes components...
    ▪ Using image gcr.io/k8s-minikube/storage-provisioner:v5
🌟  Enabled addons: storage-provisioner, default-storageclass
🏄  Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default

$ minikube ssh -- sudo cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
[Unit]
Wants=docker.socket

[Service]
ExecStart=
ExecStart=/var/lib/minikube/binaries/v1.31.1/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --config=/var/lib/kubelet/config.yaml --feature-gates=ContainerCheckpoint=true --hostname-override=minikube --kubeconfig=/etc/kubernetes/kubelet.conf --node-ip=192.168.49.2

[Install]

$ minikube ssh -- sudo systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; disabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: active (running) since Wed 2024-09-25 23:32:08 UTC; 6min ago
       Docs: http://kubernetes.io/docs/
   Main PID: 2319 (kubelet)
      Tasks: 18 (limit: 9399)
     Memory: 38.4M
        CPU: 11.119s
     CGroup: /system.slice/kubelet.service
             └─2319 /var/lib/minikube/binaries/v1.31.1/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --config=/var/lib/kubelet/config.yaml --feature-gates=ContainerCheckpoint=true --hostname-override=minikube --kubeconfig=/etc/kubernetes/kubelet.conf --node-ip=192.168.49.2

Sep 25 23:32:14 minikube kubelet[2319]: I0925 23:32:14.486693    2319 reconciler_common.go:245] "operationExecutor.VerifyControllerAttachedVolume started for volume \"lib-modules\" (UniqueName: \"kubernetes.io/host-path/c0af5f05-c91d-4b60-a516-c22c236e94b5-lib-modules\") pod \"kube-proxy-bnz8j\" (UID: \"c0af5f05-c91d-4b60-a516-c22c236e94b5\") " pod="kube-system/kube-proxy-bnz8j"
Sep 25 23:32:14 minikube kubelet[2319]: I0925 23:32:14.687717    2319 reconciler_common.go:245] "operationExecutor.VerifyControllerAttachedVolume started for volume \"config-volume\" (UniqueName: \"kubernetes.io/configmap/2796fc7b-1448-4fc3-b0c3-34ee662ea9bb-config-volume\") pod \"coredns-7c65d6cfc9-lck4f\" (UID: \"2796fc7b-1448-4fc3-b0c3-34ee662ea9bb\") " pod="kube-system/coredns-7c65d6cfc9-lck4f"
Sep 25 23:32:14 minikube kubelet[2319]: I0925 23:32:14.687755    2319 reconciler_common.go:245] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-jpchn\" (UniqueName: \"kubernetes.io/projected/2796fc7b-1448-4fc3-b0c3-34ee662ea9bb-kube-api-access-jpchn\") pod \"coredns-7c65d6cfc9-lck4f\" (UID: \"2796fc7b-1448-4fc3-b0c3-34ee662ea9bb\") " pod="kube-system/coredns-7c65d6cfc9-lck4f"
Sep 25 23:32:14 minikube kubelet[2319]: I0925 23:32:14.956589    2319 pod_startup_latency_tracker.go:104] "Observed pod startup duration" pod="kube-system/kube-proxy-bnz8j" podStartSLOduration=0.956576845 podStartE2EDuration="956.576845ms" podCreationTimestamp="2024-09-25 23:32:14 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2024-09-25 23:32:14.956360012 +0000 UTC m=+6.101630670" watchObservedRunningTime="2024-09-25 23:32:14.956576845 +0000 UTC m=+6.101847504"
Sep 25 23:32:14 minikube kubelet[2319]: I0925 23:32:14.961672    2319 pod_startup_latency_tracker.go:104] "Observed pod startup duration" pod="kube-system/storage-provisioner" podStartSLOduration=4.96166122 podStartE2EDuration="4.96166122s" podCreationTimestamp="2024-09-25 23:32:10 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2024-09-25 23:32:14.96161372 +0000 UTC m=+6.106884379" watchObservedRunningTime="2024-09-25 23:32:14.96166122 +0000 UTC m=+6.106931837"
Sep 25 23:32:15 minikube kubelet[2319]: I0925 23:32:15.985756    2319 pod_startup_latency_tracker.go:104] "Observed pod startup duration" pod="kube-system/coredns-7c65d6cfc9-lck4f" podStartSLOduration=1.9857293870000001 podStartE2EDuration="1.985729387s" podCreationTimestamp="2024-09-25 23:32:14 +0000 UTC" firstStartedPulling="0001-01-01 00:00:00 +0000 UTC" lastFinishedPulling="0001-01-01 00:00:00 +0000 UTC" observedRunningTime="2024-09-25 23:32:15.985553887 +0000 UTC m=+7.130824546" watchObservedRunningTime="2024-09-25 23:32:15.985729387 +0000 UTC m=+7.131000046"
Sep 25 23:32:19 minikube kubelet[2319]: I0925 23:32:19.077530    2319 kuberuntime_manager.go:1635] "Updating runtime config through cri with podcidr" CIDR="10.244.0.0/24"
Sep 25 23:32:19 minikube kubelet[2319]: I0925 23:32:19.078851    2319 kubelet_network.go:61] "Updating Pod CIDR" originalPodCIDR="" newPodCIDR="10.244.0.0/24"
Sep 25 23:32:23 minikube kubelet[2319]: I0925 23:32:23.033593    2319 prober_manager.go:312] "Failed to trigger a manual run" probe="Readiness"
Sep 25 23:32:45 minikube kubelet[2319]: I0925 23:32:45.241997    2319 scope.go:117] "RemoveContainer" containerID="b1b98474dc23b20a1124f5e1fa2fcbd1be68859a4096273f827092ed60b301f3"
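
The check above can also be scripted. The following is a minimal sketch, not part of minikube: has_feature_gate is a hypothetical helper name, and it assumes the gate is passed as a single --feature-gates=Name=true,... argument, as in the ExecStart line shown above. It only inspects text; it does not confirm that the container runtime supports checkpointing.

```shell
# has_feature_gate TEXT GATE
# Succeeds (exit 0) if TEXT contains a --feature-gates=... argument that
# sets GATE=true. TEXT can be a kubelet command line or the contents of
# the drop-in file. (Hypothetical helper, for illustration only.)
has_feature_gate() {
  cmdline=$1
  gate=$2
  # Rely on default word splitting to walk the whitespace-separated arguments.
  for arg in $cmdline; do
    case $arg in
      --feature-gates=*)
        gates=${arg#--feature-gates=}
        # Wrap in commas so only a complete "GATE=true" entry matches.
        case ",$gates," in
          *",$gate=true,"*) return 0 ;;
        esac
        ;;
    esac
  done
  return 1
}

# Possible usage against a running cluster (requires minikube):
#   conf=$(minikube ssh -- sudo cat /etc/systemd/system/kubelet.service.d/10-kubeadm.conf)
#   has_feature_gate "$conf" ContainerCheckpoint && echo "gate is set"
```

Reading the drop-in file rather than /var/lib/kubelet/config.yaml matches where the flag is shown to end up in the output above.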

binaryBard97 commented 1 month ago

Sorry and thanks! I know this isn't the right place, but I've been stuck for days and I'm new to Kubernetes.

binaryBard97 commented 1 month ago

Hello,

I'm trying to update the CRI-O runtime to version 1.25 or higher. I need this for setting up my environment in Minikube to checkpoint containers, following the instructions in this article. The article mentions that version 1.25 of CRI-O supports forensic container checkpointing.

Could you please guide me on how to update the CRI-O runtime to the required version?


minikube start --driver=docker --container-runtime=cri-o --extra-config=kubelet.feature-gates=ContainerCheckpoint=true
minikube ssh


GitID: 5c35d75

Thanks.

binaryBard97 commented 1 month ago

not a minikube issue