Open jerrinsg opened 1 year ago
Hitting the same issue on an Ubuntu VM as well:
$ python3 sieve.py -c examples/kapp-controller -w create -m learn --build-oracle
...
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get "http://localhost:10248/healthz": dial tcp [::1]:10248: connect: connection refused.
Unfortunately, an error has occurred:
timed out waiting for the condition
This error is likely caused by:
- The kubelet is not running
- The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
- 'systemctl status kubelet'
- 'journalctl -xeu kubelet'
Kubelet log:
Mar 10 20:19:55 kind-control-plane kubelet[283]: I0310 20:19:55.315958 283 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
Mar 10 20:19:55 kind-control-plane kubelet[283]: E0310 20:19:55.317086 283 certificate_manager.go:471] kubernetes.io/kube-apiserver-client-kubelet: Failed while requesting a signed certificate from the control plane: cannot create certificate signing request: Post "https://kind-control-plane:6443/apis/certificates.k8s.io/v1/certificatesigningrequests": dial tcp 172.18.0.3:6443: connect: connection refused
Mar 10 20:19:55 kind-control-plane kubelet[283]: W0310 20:19:55.320524 283 sysinfo.go:203] Nodes topology is not available, providing CPU topology
Mar 10 20:19:55 kind-control-plane kubelet[283]: Error: failed to run Kubelet: invalid configuration: cgroup ["kubelet"] has some missing paths: /sys/fs/cgroup/cpuacct/kubelet.slice, /sys/fs/cgroup/hugetlb/kubelet.slice, /sys/fs/cgroup/pids/kubelet.slice, /sys/fs/cgroup/cpuset/kubelet.slice, /sys/fs/cgroup/memory/kubelet.slice, /sys/fs/cgroup/cpu/kubelet.slice, /sys/fs/cgroup/systemd/kubelet.slice
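For anyone comparing logs across machines, here is a minimal sketch that pulls the controller names out of the kubelet "missing paths" line above. The function name `missing_cgroup_controllers` is made up for illustration and is not part of Sieve or kind; the sample string is the error line from this log.

```python
import re

# The kubelet error line copied from the log above.
KUBELET_ERROR = (
    'Error: failed to run Kubelet: invalid configuration: cgroup ["kubelet"] '
    'has some missing paths: /sys/fs/cgroup/cpuacct/kubelet.slice, '
    '/sys/fs/cgroup/hugetlb/kubelet.slice, /sys/fs/cgroup/pids/kubelet.slice, '
    '/sys/fs/cgroup/cpuset/kubelet.slice, /sys/fs/cgroup/memory/kubelet.slice, '
    '/sys/fs/cgroup/cpu/kubelet.slice, /sys/fs/cgroup/systemd/kubelet.slice'
)


def missing_cgroup_controllers(line):
    """Return the controller directory names listed in a 'missing paths' error.

    Hypothetical helper: it only parses text and makes no claim about the cause.
    """
    marker = "has some missing paths:"
    if marker not in line:
        return []
    tail = line.split(marker, 1)[1]
    # Each path looks like /sys/fs/cgroup/<controller>/kubelet.slice
    return re.findall(r"/sys/fs/cgroup/([^/,\s]+)/", tail)


if __name__ == "__main__":
    print(missing_cgroup_controllers(KUBELET_ERROR))
```

All of the listed directories are per-controller paths of the kind a cgroup v1 layout uses, which may help narrow down where to look.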
Host details:
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.2 LTS
Release: 22.04
Codename: jammy
$ uname -a
Linux jerrin-virtual-machine 5.19.0-35-generic #36~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Feb 17 15:17:25 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
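Since the kubelet complains about per-controller cgroup paths, it may be worth checking which cgroup hierarchy the host exposes. This is only a rough sketch and not a confirmed diagnosis: it relies on the fact that a unified (v2) hierarchy has a `cgroup.controllers` file at its root, while a v1 layout has per-controller directories such as `memory`. The function name `cgroup_version` is made up for illustration.

```python
from pathlib import Path


def cgroup_version(root="/sys/fs/cgroup"):
    """Guess the cgroup layout under `root`.

    Returns 2 for a unified (v2) hierarchy, 1 for a per-controller (v1 or
    hybrid) layout, and None if neither pattern is found.
    """
    base = Path(root)
    if (base / "cgroup.controllers").is_file():
        return 2
    if (base / "memory").is_dir():
        return 1
    return None


if __name__ == "__main__":
    print("cgroup version on this host:", cgroup_version())
```

Running the same check on the Mac host is not meaningful, since Docker there runs inside a Linux VM; it would need to run inside that VM or inside the kind node container.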
It's worth noting the workaround here too: rebuilding the Kind image.
On Mac, building the Kind image locally and running Sieve again fixed this issue:
$ python3 build.py -v v1.24.10 -m learn
..
Image "kindest/node:latest" build completed.
$ python3 build.py -v v1.24.10 -m test
..
Image "kindest/node:latest" build completed.
$ python3 sieve.py -c examples/kapp-controller -w create -m learn --build-oracle
...
Generated 8 intermediate-state test plan(s) in sieve_learn_results/kapp-controller/create/learn/intermediate-state
Total time: 410.3174147605896 seconds
When I run the command python3 sieve.py -c examples/kapp-controller -w create -m learn --build-oracle, I get the following error:
"ERROR: image: "ghcr.io/sieve-project/action/kapp-controller:learn" not present locally
Cannot load image ghcr.io/sieve-project/action/kapp-controller:learn locally, try to pull from remote
Error response from daemon: Head "https://ghcr.io/v2/sieve-project/action/kapp-controller/manifests/learn": denied
[FAIL] docker pull ghcr.io/sieve-project/action/kapp-controller:learn"
@kapilagrawal95 The kapp-controller image is not in our GitHub repo. You might need to build it and push it to your own repo first. You can configure the repo name here: https://github.com/sieve-project/sieve/blob/main/config.json#L2
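As a rough sketch of what to check before rerunning: the error above suggests the image reference has the shape `<repo>/<controller>:<mode>`. That shape is inferred from the error message only, not from Sieve's source, and both helper names below are made up for illustration. The only external command used is `docker image inspect`, which exits non-zero when the image is not present locally.

```python
import subprocess


def image_ref(registry, controller, mode):
    """Compose an image reference like the one in the error message.

    The <registry>/<controller>:<mode> shape is an assumption based on
    "ghcr.io/sieve-project/action/kapp-controller:learn".
    """
    return f"{registry.rstrip('/')}/{controller}:{mode}"


def image_present_locally(ref):
    """Return True if `docker image inspect` finds the image locally."""
    try:
        result = subprocess.run(
            ["docker", "image", "inspect", ref],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError:
        # The docker CLI is not installed or not on PATH.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    ref = image_ref("ghcr.io/sieve-project/action", "kapp-controller", "learn")
    print(ref, "present locally:", image_present_locally(ref))
```

If the image is missing locally and the configured repo does not host it, the pull will be denied as shown above, so building the image (and pushing it to a repo you control) comes first.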
I am hitting issues when trying to run Sieve with kapp-controller.
I am able to build the controller image successfully:
But running Sieve with kapp-controller in learn mode fails:
(full logs attached in kapp-learn.err.txt)
See kubelet-log.txt for the logs exported by kind (kind export logs).
I'm trying this on a Mac.