Yes, although it's complicated. Here is an example: https://github.com/rootless-containers/usernetes/tree/v20230518.0#multi-node-docker-compose
Hey! Just wanted to give a quick update because it's been a long time. I was finally able to get over some VM hurdles on GCP (with the kernel modules loading and the uid/gid setup), and now I have a terraform build that can (on one node) install usernetes and run kubectl to see that node. I'm looking at https://github.com/rootless-containers/usernetes/blob/v20230518.0/docker-compose.yml and, if I understand this, I think I need to generate the certificates (to be seen by all nodes) and then run different commands on different nodes. Hopefully I'll make some time this week!
okay trying to reproduce what I see in the docker-compose! Here is the batch script - basically this gets run under one job and then you get usernetes running under an allocation.
#!/bin/bash
# Final steps to setting up usernetes
# These steps will vary based on the hostname
# We also need the node names - not ideal, but for now they are predictable so it works
node_master=gffw-compute-a-001
node_crio=gffw-compute-a-002
node_containerd=gffw-compute-a-003
# What node is running this?
nodename=$(hostname)
# Install usernetes on all nodes (fuse3 and wget are already installed)
wget https://github.com/rootless-containers/usernetes/releases/download/v20230518.0/usernetes-x86_64.tbz
tar xjvf usernetes-x86_64.tbz
cd usernetes
# Run this on the main login node - since it's shared we only need
# to generate the certs once.
if [[ "$nodename" == *"001"* ]]; then
echo "I am ${nodename} going to run the master stuff"
/bin/bash ./common/cfssl.sh --dir=/home/$USER/.config/usernetes --master=${node_master} --node=${node_crio} --node=${node_containerd}
# 2379/tcp: etcd, 6443/tcp: kube-apiserver
/bin/bash ./install.sh --wait-init-certs --start=u7s-master-with-etcd.target --cidr=10.0.100.0/24 --publish=0.0.0.0:2379:2379/tcp --publish=0.0.0.0:6443:6443/tcp --cni=flannel --cri=crio
fi
# The first compute node runs crio
if [[ "$nodename" == *"${node_crio}"* ]]; then
echo "I am compute node ${nodename} going to run crio"
# 10250/tcp: kubelet, 8472/udp: flannel
/bin/bash ./install.sh --wait-init-certs --start=u7s-node.target --cidr=10.0.101.0/24 --publish=0.0.0.0:10250:10250/tcp --publish=0.0.0.0:8472:8472/udp --cni=flannel --cri=crio
fi
# The second compute node runs containerd
if [[ "$nodename" == *"${node_containerd}"* ]]; then
echo "I am compute node ${nodename} going to run containerd"
# 10250/tcp: kubelet, 8472/udp: flannel
/bin/bash ./install.sh --wait-init-certs --start=u7s-node.target --cidr=10.0.102.0/24 --publish=0.0.0.0:10250:10250/tcp --publish=0.0.0.0:8472:8472/udp --cni=flannel --cri=containerd
fi
The first (master) part runs OK to generate the certs, and the second part can see them (the .config in $HOME is a shared NFS filesystem). I'm debugging the second node (crio), and the first issue I ran into is this line in install.sh:
- U7S_ROOTLESSKIT_PORTS=${publish}
+ U7S_ROOTLESSKIT_PORTS="${publish}"
As is, it results in invalid formatting:
- U7S_ROOTLESSKIT_PORTS= 0.0.0.0:10250:10250/tcp 0.0.0.0:8472:8472/udp
+ U7S_ROOTLESSKIT_PORTS=" 0.0.0.0:10250:10250/tcp 0.0.0.0:8472:8472/udp"
When I add quotes around that (assuming it's OK) and try again, it fails again, and I trace it to this:
$ $HOME/usernetes/boot/kube-proxy.sh
[INFO] Entering RootlessKit namespaces: OK
E0717 20:40:10.637050 262 server.go:494] "Error running ProxyServer" err="stat $HOME/.config/usernetes/node/kube-proxy.kubeconfig: no such file or directory"
E0717 20:40:10.637097 262 run.go:74] "command failed" err="stat $HOME/.config/usernetes/node/kube-proxy.kubeconfig: no such file or directory"
Where is that kube-proxy kubeconfig supposed to be generated? Does any of this look familiar / headed in the right (or wrong) direction, and can you advise, @AkihiroSuda?
Ah this is helpful! So it seems the issue is that the setup is looking for the file to be under a "node" directory but it's generated under a name that is explicitly for the hostname:
$ find . -name kube-proxy*
./.config/usernetes/master/kube-proxy.pem
./.config/usernetes/master/kube-proxy-key.pem
./.config/usernetes/master/kube-proxy.csr
./.config/usernetes/master/kube-proxy.kubeconfig
./.config/usernetes/nodes.gffw-compute-a-002/kube-proxy.kubeconfig
./.config/usernetes/nodes.gffw-compute-a-003/kube-proxy.kubeconfig
The above script looks correct (it generates the files above), but the error must be in install.sh, which assumes a directory called "node" that I don't see.
$ ls .config/usernetes/
containers env nodes.gffw-compute-a-002
crio master nodes.gffw-compute-a-003
hey @AkihiroSuda I see the issue! The script always binds the appropriate certs directory (either the crio one or the containerd one) to "node" - could we please expose these paths as variables (each defaulting to "node", to keep the docker-compose setup working)? A sketch of what I mean follows the grep output below. I think I'm getting close to this working - I was able to get all the services started and at least list the main master node, and I'd like to test fresh with this fix.
Here are the places where I see it hard-coded (I removed the docker-compose match, since that one should stay!):
[sochat1_llnl_gov@gffw-compute-a-001 usernetes]$ grep -R usernetes/node
boot/.nfs000000001700008d00000003: --etcd-endpoints https://$(cat $XDG_CONFIG_HOME/usernetes/node/master):2379 \
boot/kube-proxy.sh: kubeconfig: "$XDG_CONFIG_HOME/usernetes/node/kube-proxy.kubeconfig"
boot/kubelet.sh: clientCAFile: "$XDG_CONFIG_HOME/usernetes/node/ca.pem"
boot/kubelet.sh:tlsCertFile: "$XDG_CONFIG_HOME/usernetes/node/node.pem"
boot/kubelet.sh:tlsPrivateKeyFile: "$XDG_CONFIG_HOME/usernetes/node/node-key.pem"
boot/kubelet.sh: --kubeconfig $XDG_CONFIG_HOME/usernetes/node/node.kubeconfig \
boot/flanneld.sh: --etcd-endpoints https://$(cat $XDG_CONFIG_HOME/usernetes/node/master):2379 \
install.sh: if [[ -f ${config_dir}/usernetes/node/done || -f ${config_dir}/usernetes/master/done ]]; then
install.sh: cp -r "${cfssldir}/nodes.$node" ${config_dir}/usernetes/node
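A sketch of what I mean (hypothetical variable name, not current usernetes code, defaulting to the existing behavior):
# Hypothetical: let the certs subdirectory be overridden per host,
# defaulting to "node" so the docker-compose setup keeps working.
: "${U7S_NODE_CERT_DIR:=node}"
kubeconfig="$XDG_CONFIG_HOME/usernetes/${U7S_NODE_CERT_DIR}/kube-proxy.kubeconfig"
clientCAFile="$XDG_CONFIG_HOME/usernetes/${U7S_NODE_CERT_DIR}/ca.pem"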
The cp -r line in install.sh looks promising, but looking closer, it sits in a branch intended for a single-node cluster (so the condition isn't hit in my setup, and I suspect this is what the docker-compose might use?):
if [[ -n "$wait_init_certs" ]]; then
max_trial=300
INFO "Waiting for certs to be created.":
for ((i = 0; i < max_trial; i++)); do
if [[ -f ${config_dir}/usernetes/node/done || -f ${config_dir}/usernetes/master/done ]]; then
echo "OK"
break
fi
echo -n .
sleep 5
done
elif [[ ! -d ${config_dir}/usernetes/master ]]; then
### If the keys are not generated yet, generate them for the single-node cluster
INFO "Generating single-node cluster TLS keys (${config_dir}/usernetes/{master,node})"
cfssldir=$(mktemp -d /tmp/cfssl.XXXXXXXXX)
master=127.0.0.1
node=$(hostname)
${base}/common/cfssl.sh --dir=${cfssldir} --master=$master --node=$node,127.0.0.1
rm -rf ${config_dir}/usernetes/{master,node}
cp -r "${cfssldir}/master" ${config_dir}/usernetes/master
cp -r "${cfssldir}/nodes.$node" ${config_dir}/usernetes/node
rm -rf "${cfssldir}"
fi
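In the meantime, the workaround seems to be to mirror what that single-node branch does and alias the per-hostname directory to the hard-coded path (a sketch, using the layout from the find output above):
# Workaround sketch: copy the per-hostname certs dir to the expected "node" path.
config_dir="${XDG_CONFIG_HOME:-$HOME/.config}"
node=$(hostname)
if [[ -d ${config_dir}/usernetes/nodes.${node} && ! -d ${config_dir}/usernetes/node ]]; then
  cp -r "${config_dir}/usernetes/nodes.${node}" "${config_dir}/usernetes/node"
fi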
Going to blow everything up and start from the beginning again tomorrow, just to try to reproduce at least the services all starting. I think the issue here is that I got it working for a single-node setup but actually have different machines.
Tested again - it seems that the issue (aside from the "node" directory under .config and the missing quotes around ${publish} in install.sh) is that unless I start crio/containerd on that same master node, the cluster doesn't hook up. The setup I created is here: https://github.com/converged-computing/flux-terraform-gcp/tree/usernetes/examples/usernetes and more specifically, after doing a clone and copying the .config directory, I am following the logic here: https://github.com/converged-computing/flux-terraform-gcp/blob/usernetes/examples/usernetes/scripts/batch.sh. Note that the only reason the README logs seem to be working is that when I first ran them, I had manually started all the different services on the master node (and then I saw it come up under kubectl get nodes). When I try to follow the logic / commands from the docker-compose, I see the various K8s objects created, but it hangs and times out on the last part. Starting the other two nodes (crio and containerd) akin to the docker-compose doesn't produce obvious errors, but kubectl get nodes still doesn't work. And sometimes I see warnings:
[WARNING] Kernel module x_tables not loaded
[WARNING] Kernel module xt_MASQUERADE not loaded
[WARNING] Kernel module xt_tcpudp not loaded
But not always! I'm out of ideas, I hope a maintainer here can advise. Thank you!
hey @AkihiroSuda we are really interested in this use case, and I can offer to help. Can we step back and talk about what would be needed to get this working on different nodes? Would you have time to look at the above (what we have that's closest to working) and talk about maybe a step 1 we can take to get this working? E.g., I'd say maybe we could start by tweaking the scripts so they aren't hard-coded for docker-compose. What do you think?
I see the various K8s objects created, but it hangs and times out on the last part.
Any error in the logs of containerd, kubelet, kube-apiserver?
I will bring this up and report back! Aside from the console, I'm guessing I can find logs by poking around the usernetes directory in the user home.
journalctl --user --no-pager -f
should be useful to get the logs
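e.g., for a single unit (substitute the relevant u7s-* unit name):
journalctl --user --no-pager -f -u u7s-kubelet-containerd.service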
okay, on node 001 (the master node) I am running the first part of scripts/batch.sh in my linked PR. This part:
echo "I am ${nodename} going to run the master stuff"
/bin/bash ./common/cfssl.sh --dir=/home/$USER/.config/usernetes --master=${node_master} --node=${node_crio} --node=${node_containerd}
# The script /home/sochat1_llnl_gov/usernetes/boot/kube-proxy.sh is asking for a non-existent
# "$XDG_CONFIG_HOME/usernetes/node/kube-proxy.kubeconfig", so we are going to arbitrarily make it.
# I did a diff of the two kube-proxy.kubeconfig files and they are the same
cp -R ~/.config/usernetes/nodes.$node_crio ~/.config/usernetes/node
# 2379/tcp: etcd, 6443/tcp: kube-apiserver
# This first install will timeout because configs are missing, but we need to generate the first ones!
/bin/bash ./install.sh --wait-init-certs --start=u7s-master-with-etcd.target --cidr=10.0.100.0/24 --publish=0.0.0.0:2379:2379/tcp --publish=0.0.0.0:6443:6443/tcp --cni=flannel --cri=
And here is the first timeout:
[INFO] Installing CoreDNS
+ sleep 3
+ kubectl get nodes -o wide
No resources found
+ kubectl apply -f /home/sochat1_llnl_gov/usernetes/manifests/coredns.yaml
serviceaccount/coredns created
clusterrole.rbac.authorization.k8s.io/system:coredns created
clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created
+ set +x
[INFO] Waiting for CoreDNS pods to be available
+ sleep 3
+ kubectl -n kube-system wait --for=condition=ready pod -l k8s-app=kube-dns
timed out waiting for the condition on pods/coredns-8557665db-mb5wt
timed out waiting for the condition on pods/coredns-8557665db-qzjq9
I don't see any logs with that command:
$ sudo journalctl --user --no-pager -f
No journal files were found.
And without sudo there aren't sufficient permissions.
Here is what systemctl shows:
$ systemctl --user --all --no-pager list-units 'u7s-*'
UNIT LOAD ACTIVE SUB DESCRIPTION
u7s-etcd.service loaded active running Usernetes etcd service
u7s-kube-apiserver.service loaded active running Usernetes kube-apiserver service
u7s-kube-controller-manager.service loaded active running Usernetes kube-controller-manager service
u7s-kube-scheduler.service loaded active running Usernetes kube-scheduler service
u7s-rootlesskit.service loaded active running Usernetes RootlessKit service
u7s-etcd.target loaded active active Usernetes target for etcd
u7s-master-with-etcd.target loaded active active Usernetes target for Kubernetes master components
u7s-master.target loaded active active Usernetes target for Kubernetes master components
● u7s-node.target not-found inactive dead u7s-node.target
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
9 loaded units listed.
To show all installed unit files use 'systemctl list-unit-files'.
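Everything in the master target shows as running; I can also inspect an individual service if needed, e.g.:
systemctl --user --no-pager status u7s-kube-apiserver.service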
The containerd node:
$ /bin/bash ./install.sh --wait-init-certs --start=u7s-node.target --cidr=10.0.102.0/24 --publish=0.0.0.0:10250:10250/tcp --publish=0.0.0.0:8472:8472/udp --cni=flannel --cri=containerd
[INFO] Rootless cgroup (v2) is supported
[WARNING] Kernel module x_tables not loaded
[WARNING] Kernel module xt_MASQUERADE not loaded
[WARNING] Kernel module xt_tcpudp not loaded
[INFO] Waiting for certs to be created.:
OK
[INFO] Base dir: /home/sochat1_llnl_gov/usernetes
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s.target
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-master-with-etcd.target
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-rootlesskit.service
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-etcd.target
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-etcd.service
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-master.target
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-kube-apiserver.service
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-kube-controller-manager.service
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-kube-scheduler.service
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-node.target
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-containerd-fuse-overlayfs-grpc.service
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-kubelet-containerd.service
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-kube-proxy.service
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-flanneld.service
[INFO] Starting u7s-node.target
+ systemctl --user -T enable u7s-node.target
Created symlink /home/sochat1_llnl_gov/.config/systemd/user/u7s.target.wants/u7s-node.target → /home/sochat1_llnl_gov/.config/systemd/user/u7s-node.target.
+ systemctl --user -T start u7s-node.target
Enqueued anchor job 19 u7s-node.target/start.
Enqueued auxiliary job 32 u7s-kubelet-containerd.service/start.
Enqueued auxiliary job 29 u7s-rootlesskit.service/start.
Enqueued auxiliary job 20 u7s-containerd-fuse-overlayfs-grpc.service/start.
Enqueued auxiliary job 30 u7s-flanneld.service/start.
Enqueued auxiliary job 31 u7s-kube-proxy.service/start.
real 0m1.793s
user 0m0.001s
sys 0m0.003s
+ systemctl --user --all --no-pager list-units 'u7s-*'
UNIT LOAD ACTIVE SUB DESCRIPTION
u7s-containerd-fuse-overlayfs-grpc.service loaded active running Usernetes containerd-fuse-overlayfs-grpc service
u7s-etcd.service loaded inactive dead Usernetes etcd service
u7s-flanneld.service loaded active running Usernetes flanneld service
u7s-kube-apiserver.service loaded inactive dead Usernetes kube-apiserver service
u7s-kube-controller-manager.service loaded inactive dead Usernetes kube-controller-manager service
u7s-kube-proxy.service loaded active running Usernetes kube-proxy service
u7s-kube-scheduler.service loaded inactive dead Usernetes kube-scheduler service
u7s-kubelet-containerd.service loaded active running Usernetes kubelet service (containerd)
u7s-rootlesskit.service loaded active running Usernetes RootlessKit service (containerd)
u7s-etcd.target loaded inactive dead Usernetes target for etcd
u7s-master-with-etcd.target loaded inactive dead Usernetes target for Kubernetes master components (including etcd)
u7s-master.target loaded inactive dead Usernetes target for Kubernetes master components
u7s-node.target loaded active active Usernetes target for Kubernetes node components (containerd)
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
13 loaded units listed.
To show all installed unit files use 'systemctl list-unit-files'.
+ set +x
[INFO] Installation complete.
[INFO] Hint: `sudo loginctl enable-linger` to start user services automatically on the system start up.
And the node for crio:
$ /bin/bash ./install.sh --wait-init-certs --start=u7s-node.target --cidr=10.0.101.0/24 --publish=0.0.0.0:10250:10250/tcp --publish=0.0.0.0:8472:8472/udp --cni=flannel --cri=crio
[INFO] Rootless cgroup (v2) is supported
[WARNING] Kernel module x_tables not loaded
[WARNING] Kernel module xt_MASQUERADE not loaded
[WARNING] Kernel module xt_tcpudp not loaded
[INFO] Waiting for certs to be created.:
OK
[INFO] Base dir: /home/sochat1_llnl_gov/usernetes
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s.target
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-master-with-etcd.target
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-rootlesskit.service
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-etcd.target
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-etcd.service
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-master.target
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-kube-apiserver.service
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-kube-controller-manager.service
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-kube-scheduler.service
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-node.target
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-kubelet-crio.service
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-kube-proxy.service
[INFO] Installing /home/sochat1_llnl_gov/.config/systemd/user/u7s-flanneld.service
[INFO] Starting u7s-node.target
+ systemctl --user -T enable u7s-node.target
+ systemctl --user -T start u7s-node.target
Enqueued anchor job 10 u7s-node.target/start.
Enqueued auxiliary job 22 u7s-flanneld.service/start.
Enqueued auxiliary job 21 u7s-rootlesskit.service/start.
Enqueued auxiliary job 11 u7s-kube-proxy.service/start.
Enqueued auxiliary job 13 u7s-kubelet-crio.service/start.
real 0m1.588s
user 0m0.003s
sys 0m0.002s
+ systemctl --user --all --no-pager list-units 'u7s-*'
UNIT LOAD ACTIVE SUB DESCRIPTION
u7s-etcd.service loaded inactive dead Usernetes etcd service
u7s-flanneld.service loaded active running Usernetes flanneld service
u7s-kube-apiserver.service loaded inactive dead Usernetes kube-apiserver service
u7s-kube-controller-manager.service loaded inactive dead Usernetes kube-controller-manager service
u7s-kube-proxy.service loaded active running Usernetes kube-proxy service
u7s-kube-scheduler.service loaded inactive dead Usernetes kube-scheduler service
u7s-kubelet-crio.service loaded active running Usernetes kubelet service (crio)
u7s-rootlesskit.service loaded active running Usernetes RootlessKit service (crio)
u7s-etcd.target loaded inactive dead Usernetes target for etcd
u7s-master-with-etcd.target loaded inactive dead Usernetes target for Kubernetes master components (including etcd)
u7s-master.target loaded inactive dead Usernetes target for Kubernetes master components
u7s-node.target loaded active active Usernetes target for Kubernetes node components (crio)
LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
12 loaded units listed.
To show all installed unit files use 'systemctl list-unit-files'.
+ set +x
[INFO] Installation complete.
[INFO] Hint: `sudo loginctl enable-linger` to start user services automatically on the system start up.
[sochat1_llnl_gov@gffw-compute-a-003 usernetes]$ sudo loginctl enable-linger
I can leave this up a little bit if you want to tell me where to look!
And without sudo there aren't sufficient permissions.
What is the error and what is your distro?
Insufficient permissions (which is why I added sudo):
$ journalctl --user --no-pager -f
Hint: You are currently not seeing messages from the system.
Users in the 'systemd-journal' group can see all messages. Pass -q to
turn off this notice.
No journal files were opened due to insufficient permissions.
This is Rocky Linux 8
NAME="Rocky Linux"
VERSION="8.8 (Green Obsidian)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Rocky Linux 8.8 (Green Obsidian)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:8:GA"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
SUPPORT_END="2029-05-31"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-8"
ROCKY_SUPPORT_PRODUCT_VERSION="8.8"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.8"
Users in the 'systemd-journal' group can see all messages.
Can you try adding yourself to this group?
I already did, it doesn't change anything.
$ echo $USER
sochat1_llnl_gov
[sochat1_llnl_gov@gffw-compute-a-001 usernetes]$ sudo usermod -a -G systemd-journal $USER
[sochat1_llnl_gov@gffw-compute-a-001 usernetes]$ journalctl --user --no-pager -f
Hint: You are currently not seeing messages from the system.
Users in the 'systemd-journal' group can see all messages. Pass -q to
turn off this notice.
No journal files were opened due to insufficient permissions.
[sochat1_llnl_gov@gffw-compute-a-001 usernetes]$
okay logged out and in! That message went away, but still no logs:
$ journalctl --user --no-pager -f
No journal files were found.
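One likely cause I read about (untested here, and it needs root): journald on Rocky defaults to volatile storage, so per-user journal files are never written. Enabling persistent storage is supposed to fix that:
sudo mkdir -p /var/log/journal
sudo systemd-tmpfiles --create --prefix /var/log/journal
sudo systemctl restart systemd-journald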
I think I at least found them, and a way to read a file:
$ journalctl --file /run/log/journal/$(cat /etc/machine-id)/system.journal
-- Logs begin at Fri 2023-08-04 02:24:17 UTC, end at Fri 2023-08-04 02:24:17 UTC. --
Aug 04 02:24:17 gffw-compute-a-001 systemd-journald[10040]: Journal started
Aug 04 02:24:17 gffw-compute-a-001 systemd-journald[10040]: Runtime journal (/run/log/journal/a699437c101cde4ba34d>
Aug 04 02:24:17 gffw-compute-a-001 audit[10040]: EVENT_LISTENER pid=10040 uid=0 auid=501043911 tty=pts0 ses=5 subj>
Aug 04 02:24:17 gffw-compute-a-001 audit[10040]: SYSCALL arch=c000003e syscall=49 success=yes exit=0 a0=8 a1=5572f>
Aug 04 02:24:17 gffw-compute-a-001 audit: PROCTITLE proctitle="/usr/lib/systemd/systemd-journald"
Aug 04 02:24:17 gffw-compute-a-001 audit: CONFIG_CHANGE op=set audit_enabled=1 old=1 auid=501043911 ses=5 subj=unc>
Aug 04 02:24:17 gffw-compute-a-001 audit[10040]: SYSCALL arch=c000003e syscall=46 success=yes exit=60 a0=8 a1=7ffe>
Aug 04 02:24:17 gffw-compute-a-001 audit: PROCTITLE proctitle="/usr/lib/systemd/systemd-journald"
Here is one of the weirdly named files, and then the other one (the attached contents aren't reproduced here).
The way I was debugging this before was with systemctl status, but I'm not sure what I'm looking for, so it was hard to know where to look.
Using Ubuntu (23.04 or 22.04) might be easier
I think I agree. Okay - so here is a plan. I'll rebuild the cluster (it has a base VM) using ubuntu. Then I'll bring it up and see if the problem reproduces (and if we can get logs!). My suggestion after that is to try making a PR that doesn't have the node directories hard-coded as "node", as I suspect the error might be coming from there. E.g., this line: https://github.com/converged-computing/flux-terraform-gcp/blob/26822feed8435f27d184c7a2cd4200614228824e/examples/usernetes/scripts/batch.sh#L28 I have to do because it's looking for the actual node name (but docker-compose hard-codes them all as "node").
What do you think? After I create the ubuntu cluster and try again I can report back and then we can figure out that next step.
Thank you again to you both! I almost have the debian setup done, although I'm not exactly a morning person (got up to chat with you!) so I'll probably go back to sleep for a bit and be in touch later today / this weekend, and of course this means we can chat more next week. Happy Friday and have an amazing weekend!
@AkihiroSuda I struggled all day trying to get a debian/ubuntu setup (yes, pathetic - GCP with terraform and the foundation network setup seems to not create eth0, and I need to ping some Google colleagues to ask about this) BUT I went back to Rocky and (I think?) found a way to view the logs! We can just look at /var/log/messages, and I think I'm seeing everything in there? Here you go! (The attached log isn't reproduced here.) I haven't looked closely yet - need to eat dinner, but I will during!
okay here are some blobs that stand out to me (and might be useful for debugging):
This could be more of a warning:
Aug 5 03:14:07 gffw-compute-a-001 kube-apiserver.sh[9039]: E0805 03:14:07.554361 100 instance.go:388] Could not construct pre-rendered responses for ServiceAccountIssuerDiscovery endpoints. Endpoints will not be enabled. Error: issuer URL must use https scheme, got: kubernetes.default.svc
This looks like an issue - should this be created in advance for the rootless use case?
Aug 5 03:14:10 gffw-compute-a-001 kube-controller-manager.sh[9092]: W0805 03:14:10.660277 118 probe.go:268] Flexvolume plugin directory at /usr/libexec/kubernetes/kubelet-plugins/volume/exec/ does not exist. Recreating.
Aug 5 03:14:10 gffw-compute-a-001 kube-controller-manager.sh[9092]: E0805 03:14:10.660347 118 plugins.go:609] "Error initializing dynamic plugin prober" err="error (re-)creating driver directory: mkdir /usr/libexec/kubernetes: permission denied"
I'll see if I can look up what that is for (and try creating it). If CoreDNS (or other plugins) need to write stuff there, that could be an issue. And then this is the last explicit error:
Aug 5 03:14:07 gffw-compute-a-001 kube-apiserver.sh[9039]: I0805 03:14:07.546078 100 instance.go:282] Using reconciler: lease
Aug 5 03:14:07 gffw-compute-a-001 kube-apiserver.sh[9039]: E0805 03:14:07.554361 100 instance.go:388] Could not construct pre-rendered responses for ServiceAccountIssuerDiscovery endpoints. Endpoints will not be enabled. Error: issuer URL must use https scheme, got: kubernetes.default.svc
Aug 5 03:14:07 gffw-compute-a-001 kube-apiserver.sh[9039]: I0805 03:14:07.678197 100 handler.go:232] Adding GroupVersion v1 to ResourceManager
If that is just for service endpoints (which we haven't cared about yet) it's probably not the issue.
okay debugging - kube-controller-manager.sh is here: https://github.com/rootless-containers/usernetes/blob/58df6ea63cc4a00425b80a088889015eedc96320/boot/kube-controller-manager.sh#L4 and it calls nsenter.sh, but that seems to be more of a wrapper, and it in turn calls kube-controller-manager. Searching around for those files in the Kubernetes source, I found the probe.go and plugins.go in question (links not reproduced here).
But it looks like it still initializes a dummy plugin prober, so maybe this isn't an actual error after all?
pm.prober = &dummyPluginProber{}
I see mention of CoreDNS at the bottom but nothing looks terribly off?
Aug 5 03:14:14 gffw-compute-a-001 kube-controller-manager.sh[9092]: I0805 03:14:14.414243 118 event.go:307] "Event occurred" object="kube-system/coredns" fieldPath="" kind="Deployment" apiVersion="apps/v1" type="Normal" reason="ScalingReplicaSet" message="Scaled up replica set coredns-8557665db to 2"
Aug 5 03:14:14 gffw-compute-a-001 kube-controller-manager.sh[9092]: I0805 03:14:14.565153 118 event.go:307] "Event occurred" object="kube-system/coredns-8557665db" fieldPath="" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: coredns-8557665db-ckfzd"
Aug 5 03:14:14 gffw-compute-a-001 kube-controller-manager.sh[9092]: I0805 03:14:14.576527 118 shared_informer.go:318] Caches are synced for garbage collector
Aug 5 03:14:14 gffw-compute-a-001 kube-controller-manager.sh[9092]: I0805 03:14:14.577719 118 event.go:307] "Event occurred" object="kube-system/coredns-8557665db" fieldPath="" kind="ReplicaSet" apiVersion="apps/v1" type="Normal" reason="SuccessfulCreate" message="Created pod: coredns-8557665db-9rzc4"
Aug 5 03:14:14 gffw-compute-a-001 kube-controller-manager.sh[9092]: I0805 03:14:14.608960 118 shared_informer.go:318] Caches are synced for garbage collector
Aug 5 03:14:14 gffw-compute-a-001 kube-controller-manager.sh[9092]: I0805 03:14:14.608985 118 garbagecollector.go:166] "All resource monitors have synced. Proceeding to collect garbage"
If we are seeing successful creates there, it could be that whatever wait logic is checking for them has a bug.
But what is the error? Which component is failing to start?
The error is the timeout shown in this comment: https://github.com/rootless-containers/usernetes/issues/281#issuecomment-1664866664
Which, given the log above showing that it started, makes me wonder if the script that decides there is a timeout is issuing the wrong command / testing with the wrong permissions. In other testing, when I bring up the other services on the same node, I see this master run to completion (with instructions to export my config path, etc.).
That part of the script is here: https://github.com/rootless-containers/usernetes/blob/58df6ea63cc4a00425b80a088889015eedc96320/install.sh#L457 I'll see if I can do some debugging around that - it's after midnight here, so I'll turn into a pumpkin soon for sure! 🎃
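That check boils down to the kubectl -n kube-system wait --for=condition=ready pod -l k8s-app=kube-dns shown in the log above, so tomorrow I can poke at the same pods directly, e.g. (kubeconfig path per my setup):
export KUBECONFIG=$HOME/.config/usernetes/master/admin-localhost.kubeconfig
kubectl get pods --all-namespaces
kubectl -n kube-system describe pod -l k8s-app=kube-dns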
okay added some more debugging - the pods do seem to be created (when I list all namespaces) but they are pending:
+ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-8557665db-t257d 0/1 Pending 0 8s
kube-system coredns-8557665db-wbj64 0/1 Pending 0 8s
okay so it's failing because there are no nodes to schedule pods.
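Here is an excerpt from describing one of the pending pods (via something like kubectl -n kube-system describe pod -l k8s-app=kube-dns - I didn't save the exact invocation):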
Conditions:
Type Status
PodScheduled False
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: coredns
Optional: false
kube-api-access-sqv94:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: CriticalAddonsOnly op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 112s default-scheduler no nodes available to schedule pods
+ kubectl -n kube-system wait --for=condition=ready pod -l k8s-app=kube-dns
I assume this is a bit of a race - the worker nodes need to be started and connected - so let's try to start them. Now I can see the KUBECONFIG path, and that the pods are pending to start:
$ export KUBECONFIG=/home/sochat1_llnl_gov/.config/usernetes/master/admin-localhost.kubeconfig
[sochat1_llnl_gov@gffw-compute-a-001 usernetes]$ bin/kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system coredns-8557665db-t257d 0/1 Pending 0 3m20s
kube-system coredns-8557665db-wbj64 0/1 Pending 0 3m20s
But also note that it's probably concerning that the master isn't registering as a node (should it? I think so, unless it's just serving as an empty sort of control plane):
$ bin/kubectl get nodes
No resources found
Let's try starting the other two nodes (from the docker-compose config) on the other machines. I tried this before, but maybe I can get more debug output this time with the system logs on those nodes. Here is the second node, which should start crio, e.g.:
echo "I am compute node ${nodename} going to run crio"
# 10250/tcp: kubelet, 8472/udp: flannel
/bin/bash ./install.sh --wait-init-certs --start=u7s-node.target --cidr=10.0.101.0/24 --publish=0.0.0.0:10250:10250/tcp --publish=0.0.0.0:8472:8472/udp --cni=flannel --cri=crio
sudo loginctl enable-linger
That doesn't change the state on the master node - still no nodes, and the pods are pending. The third node is containerd, and again there are no issues / obvious errors here:
# 10250/tcp: kubelet, 8472/udp: flannel
/bin/bash ./install.sh --wait-init-certs --start=u7s-node.target --cidr=10.0.102.0/24 --publish=0.0.0.0:10250:10250/tcp --publish=0.0.0.0:8472:8472/udp --cni=flannel --cri=containerd
sudo loginctl enable-linger
No change in status on the master - no nodes, and the pods are pending. But the logs might tell a different story - here is containerd (the attached log isn't reproduced here):
What I see there is that it can't find gffw-compute-a-001. Note that I can at least ping the node:
$ ping gffw-compute-a-001
PING gffw-compute-a-001.c.llnl-flux.internal (10.10.0.5) 56(84) bytes of data.
64 bytes from gffw-compute-a-001.c.llnl-flux.internal (10.10.0.5): icmp_seq=1 ttl=64 time=0.602 ms
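Ping resolving is one thing, but the workers also need to reach the ports published on the master (6443 for kube-apiserver, 2379 for etcd); a quick reachability check I can try (a sketch, using bash's /dev/tcp):
timeout 3 bash -c 'cat < /dev/null > /dev/tcp/gffw-compute-a-001/6443' && echo 6443 open
timeout 3 bash -c 'cat < /dev/null > /dev/tcp/gffw-compute-a-001/2379' && echo 2379 open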
But hmm - if that's a service, it's possibly that error about the service we saw above:
100 instance.go:388] Could not construct pre-rendered responses for ServiceAccountIssuerDiscovery endpoints. Endpoints will not be enabled. Error: issuer URL must use https scheme, got: kubernetes.default.svc
if that is the "discovery endpoint", then indeed it cannot be discovered! For completeness, let's also inspect the crio node (002) - I'll put this in another comment because this one is too long :)
So not really knowing how this all works - what I think is happening is that the pods are being created, but the different nodes don't seem to be discovering one another. I'm guessing the non-master nodes need to do some kind of handshake with the master, and perhaps there are other layers in there like auth / certificates that are wonky. What should we look at next? Could this be an issue of ports / firewalls not being open perhaps?
For reference, here are the firewall rules (not reproduced here) - I thought the last one, for traffic between the nodes, would be sufficient; it allows flux to connect on port 8050 (for the workers to ping the broker). I'll read more into the errors above / look them up tomorrow!
bin/kubectl get nodes
No resources found
This is because the kubelet is not registering with the apiserver; the next step is to check the kubelet logs.
I'm guessing the non-master nodes need to do some kind of handshake with the master
The Node object is created by the kubelet on each node: the kubelet connects to the apiserver and registers itself, creating the Node object to reflect the node where it is running and its capabilities.
Do you know where these are for usernetes? I did a find for anything named "log" or *.log, and I don't see anything in /var/log (which makes sense if it's in user space). I'll keep looking under the usernetes directories in the user home - I'm thinking it should be somewhere in there.
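Presumably the systemd-native way would be per unit, e.g. (assuming the u7s-* names from the earlier listings):
journalctl --user -u u7s-kubelet-crio.service --no-pager
but as shown above, journalctl --user turns up nothing on this box, so I'm hunting for files instead.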
I found what could be logs, but the directories are empty
$ echo $PWD
/home/sochat1_llnl_gov/.local/share/usernetes
[sochat1_llnl_gov@gffw-compute-a-001 usernetes]$ tree .
.
├── etcd
│ └── member
│ ├── snap
│ │ └── db
│ └── wal
│ ├── 0000000000000000-0000000000000000.wal
│ └── 0.tmp
├── _var_cache
├── _var_lib_cni
├── _var_lib_containers
├── _var_lib_kubelet
└── _var_log
In case these are helpful:
$ ./rootlessctl.sh list-ports
ID PROTO PARENTIP PARENTPORT CHILDIP CHILDPORT
1 tcp 0.0.0.0 2379 2379
2 tcp 0.0.0.0 6443 6443
$ bin/kubectl --kubeconfig ~/.config/usernetes/master/admin-localhost.kubeconfig version
WARNING: This version information is deprecated and will be replaced with the output from kubectl version --short. Use --output=yaml|json to get the full version.
Client Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.2", GitCommit:"7f6f68fdabc4df88cfea2dcf9a19b2b830f1e647", GitTreeState:"clean", BuildDate:"2023-05-17T14:20:07Z", GoVersion:"go1.20.4", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v5.0.1
Server Version: version.Info{Major:"1", Minor:"27", GitVersion:"v1.27.2", GitCommit:"7f6f68fdabc4df88cfea2dcf9a19b2b830f1e647", GitTreeState:"clean", BuildDate:"2023-05-17T14:13:28Z", GoVersion:"go1.20.4", Compiler:"gc", Platform:"linux/amd64"}
$ bin/kubectl --kubeconfig ~/.config/usernetes/master/admin-localhost.kubeconfig cluster-info
Kubernetes control plane is running at https://127.0.0.1:6443
CoreDNS is running at https://127.0.0.1:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
$ curl -k https://127.0.0.1:6443
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {},
"status": "Failure",
"message": "Unauthorized",
"reason": "Unauthorized",
"code": 401
}
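The 401 from bare curl is presumably expected, since no client certificate is presented; the same check should answer through the admin kubeconfig, e.g.:
bin/kubectl --kubeconfig ~/.config/usernetes/master/admin-localhost.kubeconfig get --raw /healthz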
$ bin/kubectl --kubeconfig ~/.config/usernetes/master/admin-localhost.kubeconfig get endpoints
NAME ENDPOINTS AGE
kubernetes 10.10.0.5:6443 18m
[sochat1_llnl_gov@gffw-compute-a-001 usernetes]$ bin/kubectl --kubeconfig ~/.config/usernetes/master/admin-localhost.kubeconfig get endpoints -o yaml
apiVersion: v1
items:
- apiVersion: v1
kind: Endpoints
metadata:
creationTimestamp: "2023-08-05T18:30:34Z"
labels:
endpointslice.kubernetes.io/skip-mirror: "true"
name: kubernetes
namespace: default
resourceVersion: "74"
uid: a6e04b81-f695-4811-b1ab-33360096d21f
subsets:
- addresses:
- ip: 10.10.0.5
ports:
- name: https
port: 6443
protocol: TCP
kind: List
metadata:
resourceVersion: ""
Just run kubelet manually with the configuration and paste the output
okay I've never done that, but it's probably in the code somewhere and I can try to find it!
okay, boot/kubelet.sh seems to start that, but it also seems to only be run in the context of the containerd/crio nodes (the other two). This is a grep that includes the local usernetes and .config directories:
boot/kubelet-crio.sh:exec $(dirname $0)/kubelet.sh --container-runtime-endpoint unix://$XDG_RUNTIME_DIR/usernetes/crio/crio.sock $@
boot/kubelet-containerd.sh:exec $(dirname $0)/kubelet.sh --container-runtime-endpoint unix://$XDG_RUNTIME_DIR/usernetes/containerd/containerd.sock $@
Binary file bin/kubelet matches
If I naively run it (not knowing the args that go into it) I see:
STARTING KUBELET
[INFO] Entering RootlessKit namespaces: OK
I0805 22:30:04.701451 183 server.go:415] "Kubelet version" kubeletVersion="v1.27.2"
I0805 22:30:04.701494 183 server.go:417] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
I0805 22:30:04.703112 183 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/home/sochat1_llnl_gov/.config/usernetes/node/ca.pem"
I0805 22:30:04.711861 183 server.go:662] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /"
I0805 22:30:04.712014 183 container_manager_linux.go:266] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
I0805 22:30:04.712077 183 container_manager_linux.go:271] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: KubeletOOMScoreAdj:-999 ContainerRuntime: CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/home/sochat1_llnl_gov/.local/share/usernetes/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.03} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] CPUManagerPolicy:none CPUManagerPolicyOptions:map[] TopologyManagerScope:container CPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] PodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms TopologyManagerPolicy:none ExperimentalTopologyManagerPolicyOptions:map[]}
I0805 22:30:04.712103 183 topology_manager.go:136] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
I0805 22:30:04.712118 183 container_manager_linux.go:302] "Creating device plugin manager"
I0805 22:30:04.712340 183 state_mem.go:36] "Initialized new in-memory state store"
I0805 22:30:04.912829 183 server.go:776] "Failed to ApplyOOMScoreAdj" err="write /proc/self/oom_score_adj: permission denied"
W0805 22:30:04.913229 183 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {
"Addr": "/run/containerd/containerd.sock",
"ServerName": "/run/containerd/containerd.sock",
"Attributes": null,
"BalancerAttributes": null,
"Type": 0,
"Metadata": null
}. Err: connection error: desc = "transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: no such file or directory"
E0805 22:30:04.913937 183 run.go:74] "command failed" err="failed to run Kubelet: validate service connection: validate CRI v1 runtime API for endpoint \"unix:///run/containerd/containerd.sock\": rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /run/containerd/containerd.sock: connect: no such file or directory\""
But I suspect whatever args I need are not there, and also it's not clear to me this is actually called anywhere for the master? Poking around install.sh, I think what is called for the master is the group of services under u7s-master.target, and that calls each of kube-apiserver, kube-controller-manager, and kube-scheduler (per the unit listings earlier).
I think the API server is started - we confirmed that endpoint above. I tried all three of those, and it says the ports are already in use (and the previous logs showed they were active too).
Unrelated, but I found the plugins directory that was checked for (and didn't exist). I'll add a note in my script to create it:
volumePluginDir: /home/sochat1_llnl_gov/.local/share/usernetes/kubelet-plugins-exec
I'm guessing containerd is expected to be running for the kubelet, if that error is correct? But for the docker-compose setup that variable is left empty, e.g.:
/bin/bash ./install.sh --wait-init-certs --start=u7s-master-with-etcd.target --cidr=10.0.100.0/24 --publish=0.0.0.0:2379:2379/tcp --publish=0.0.0.0:6443:6443/tcp --cni=flannel --cri=
From the output, it's not clear how we would run a kubelet without containerd, but I'm not experienced with this setup, so that's just speculation!
u7s-kubelet-containerd.service loaded inactive dead Usernetes kubelet service (containerd)
I tried running the command that would use/create that socket:
$ mkdir -p /run/user/501043911/usernetes/containerd/
$ cd usernetes
$ U7S_BASE_DIR=$PWD
$ source $U7S_BASE_DIR/common/common.inc.sh
$ ./boot/kubelet.sh --container-runtime-endpoint unix://$XDG_RUNTIME_DIR/usernetes/containerd/containerd.sock
STARTING KUBELET
--container-runtime-endpoint unix:///run/user/501043911/usernetes/containerd/containerd.sock
[INFO] Entering RootlessKit namespaces: OK
Flag --container-runtime-endpoint has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
I0805 22:44:40.975865 401 server.go:415] "Kubelet version" kubeletVersion="v1.27.2"
I0805 22:44:40.975909 401 server.go:417] "Golang settings" GOGC="" GOMAXPROCS="" GOTRACEBACK=""
I0805 22:44:40.977780 401 dynamic_cafile_content.go:157] "Starting controller" name="client-ca-bundle::/home/sochat1_llnl_gov/.config/usernetes/node/ca.pem"
I0805 22:44:40.985759 401 server.go:662] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /"
I0805 22:44:40.985916 401 container_manager_linux.go:266] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
I0805 22:44:40.985964 401 container_manager_linux.go:271] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: KubeletOOMScoreAdj:-999 ContainerRuntime: CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/home/sochat1_llnl_gov/.local/share/usernetes/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.03} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] CPUManagerPolicy:none CPUManagerPolicyOptions:map[] TopologyManagerScope:container CPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] PodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms TopologyManagerPolicy:none ExperimentalTopologyManagerPolicyOptions:map[]}
I0805 22:44:40.985983 401 topology_manager.go:136] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
I0805 22:44:40.986001 401 container_manager_linux.go:302] "Creating device plugin manager"
I0805 22:44:40.986248 401 state_mem.go:36] "Initialized new in-memory state store"
I0805 22:44:41.186765 401 server.go:776] "Failed to ApplyOOMScoreAdj" err="write /proc/self/oom_score_adj: permission denied"
W0805 22:44:41.187310 401 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {
"Addr": "/run/user/501043911/usernetes/containerd/containerd.sock",
"ServerName": "/run/user/501043911/usernetes/containerd/containerd.sock",
"Attributes": null,
"BalancerAttributes": null,
"Type": 0,
"Metadata": null
}. Err: connection error: desc = "transport: Error while dialing dial unix /run/user/501043911/usernetes/containerd/containerd.sock: connect: no such file or directory"
E0805 22:44:41.190001 401 run.go:74] "command failed" err="failed to run Kubelet: validate service connection: validate CRI v1 runtime API for endpoint \"unix:///run/user/501043911/usernetes/containerd/containerd.sock\": rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /run/user/501043911/usernetes/containerd/containerd.sock: connect: no such file or directory\""
Connection to compute.1353515447032707346 closed.
Seems like it's expected to already be there? This might also be important:
Flag --container-runtime-endpoint has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
Okay, I added that variable to the kubelet config generated by boot/kubelet.sh into $XDG_RUNTIME_DIR/usernetes/kubelet-config.yaml:
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
volumePluginDir: $XDG_DATA_HOME/usernetes/kubelet-plugins-exec
authentication:
anonymous:
enabled: false
x509:
clientCAFile: "$XDG_CONFIG_HOME/usernetes/node/ca.pem"
tlsCertFile: "$XDG_CONFIG_HOME/usernetes/node/node.pem"
tlsPrivateKeyFile: "$XDG_CONFIG_HOME/usernetes/node/node-key.pem"
clusterDomain: "cluster.local"
clusterDNS:
- "10.0.0.53"
failSwapOn: false
featureGates:
KubeletInUserNamespace: true
evictionHard:
nodefs.available: "3%"
+ containerRuntimeEndpoint: "unix://$XDG_RUNTIME_DIR/usernetes/containerd/containerd.sock"
localStorageCapacityIsolation: false
cgroupDriver: "cgroupfs"
cgroupsPerQOS: true
enforceNodeAllocatable: []
That at least changes the error message to point at where we wanted the containerd socket to be - somewhere under the user's control. The error now references that path:
W0805 22:51:04.038374 436 logging.go:59] [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {
"Addr": "/run/user/501043911/usernetes/containerd/containerd.sock",
"ServerName": "/run/user/501043911/usernetes/containerd/containerd.sock",
"Attributes": null,
"BalancerAttributes": null,
"Type": 0,
"Metadata": null
}. Err: connection error: desc = "transport: Error while dialing dial unix /run/user/501043911/usernetes/containerd/containerd.sock: connect: no such file or directory"
E0805 22:51:04.039080 436 run.go:74] "command failed" err="failed to run Kubelet: validate service connection: validate CRI v1 runtime API for endpoint \"unix:///run/user/501043911/usernetes/containerd/containerd.sock\": rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial unix /run/user/501043911/usernetes/containerd/containerd.sock: connect: no such file or directory\""
Let me see if I can figure out how to generate that and why it's not there.
It looks like the rootlesskit service is supposed to execute boot/containerd.sh:
.config/systemd/user/u7s-rootlesskit.service:ExecStart=/home/sochat1_llnl_gov/usernetes/boot/rootlesskit.sh /home/sochat1_llnl_gov/usernetes/boot/containerd.sh
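So presumably the socket only appears once that unit runs containerd.sh inside the RootlessKit namespaces; something like this should confirm it (a sketch):
systemctl --user restart u7s-rootlesskit.service
systemctl --user --no-pager status u7s-rootlesskit.service
ls -l $XDG_RUNTIME_DIR/usernetes/containerd/containerd.sock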
If I execute boot/containerd.sh directly on its own, it tries to use system paths (not user paths):
[sochat1_llnl_gov@gffw-compute-a-001 usernetes]$ ./boot/containerd.sh
INFO[2023-08-05T22:55:18.941105134Z] starting containerd revision=1677a17964311325ed1c31e2c0a3589ce6d5c30d version=v1.7.1
INFO[2023-08-05T22:55:18.956472076Z] loading plugin "io.containerd.content.v1.content"... type=io.containerd.content.v1
INFO[2023-08-05T22:55:18.957045752Z] loading plugin "io.containerd.snapshotter.v1.native"... type=io.containerd.snapshotter.v1
INFO[2023-08-05T22:55:18.957521832Z] loading plugin "io.containerd.snapshotter.v1.overlayfs"... type=io.containerd.snapshotter.v1
INFO[2023-08-05T22:55:18.958340232Z] loading plugin "io.containerd.snapshotter.v1.fuse-overlayfs"... type=io.containerd.snapshotter.v1
INFO[2023-08-05T22:55:18.958579596Z] loading plugin "io.containerd.metadata.v1.bolt"... type=io.containerd.metadata.v1
INFO[2023-08-05T22:55:18.958808670Z] metadata content store policy set policy=shared
INFO[2023-08-05T22:55:18.960625994Z] loading plugin "io.containerd.differ.v1.walking"... type=io.containerd.differ.v1
INFO[2023-08-05T22:55:18.960649370Z] loading plugin "io.containerd.event.v1.exchange"... type=io.containerd.event.v1
INFO[2023-08-05T22:55:18.960660841Z] loading plugin "io.containerd.gc.v1.scheduler"... type=io.containerd.gc.v1
INFO[2023-08-05T22:55:18.960680365Z] loading plugin "io.containerd.lease.v1.manager"... type=io.containerd.lease.v1
INFO[2023-08-05T22:55:18.960692583Z] loading plugin "io.containerd.nri.v1.nri"... type=io.containerd.nri.v1
INFO[2023-08-05T22:55:18.960703504Z] NRI interface is disabled by configuration.
INFO[2023-08-05T22:55:18.960713086Z] loading plugin "io.containerd.runtime.v2.task"... type=io.containerd.runtime.v2
INFO[2023-08-05T22:55:18.960905346Z] loading plugin "io.containerd.runtime.v2.shim"... type=io.containerd.runtime.v2
INFO[2023-08-05T22:55:18.960921039Z] loading plugin "io.containerd.sandbox.store.v1.local"... type=io.containerd.sandbox.store.v1
INFO[2023-08-05T22:55:18.960932550Z] loading plugin "io.containerd.sandbox.controller.v1.local"... type=io.containerd.sandbox.controller.v1
INFO[2023-08-05T22:55:18.960944352Z] loading plugin "io.containerd.streaming.v1.manager"... type=io.containerd.streaming.v1
INFO[2023-08-05T22:55:18.960957261Z] loading plugin "io.containerd.service.v1.introspection-service"... type=io.containerd.service.v1
INFO[2023-08-05T22:55:18.960969439Z] loading plugin "io.containerd.service.v1.containers-service"... type=io.containerd.service.v1
INFO[2023-08-05T22:55:18.960983195Z] loading plugin "io.containerd.service.v1.content-service"... type=io.containerd.service.v1
INFO[2023-08-05T22:55:18.960994223Z] loading plugin "io.containerd.service.v1.diff-service"... type=io.containerd.service.v1
INFO[2023-08-05T22:55:18.961006543Z] loading plugin "io.containerd.service.v1.images-service"... type=io.containerd.service.v1
INFO[2023-08-05T22:55:18.961018415Z] loading plugin "io.containerd.service.v1.namespaces-service"... type=io.containerd.service.v1
INFO[2023-08-05T22:55:18.961029232Z] loading plugin "io.containerd.service.v1.snapshots-service"... type=io.containerd.service.v1
INFO[2023-08-05T22:55:18.961039452Z] loading plugin "io.containerd.runtime.v1.linux"... type=io.containerd.runtime.v1
INFO[2023-08-05T22:55:18.961226739Z] loading plugin "io.containerd.monitor.v1.cgroups"... type=io.containerd.monitor.v1
INFO[2023-08-05T22:55:18.961495277Z] loading plugin "io.containerd.service.v1.tasks-service"... type=io.containerd.service.v1
INFO[2023-08-05T22:55:18.961521383Z] loading plugin "io.containerd.grpc.v1.introspection"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961536741Z] loading plugin "io.containerd.transfer.v1.local"... type=io.containerd.transfer.v1
INFO[2023-08-05T22:55:18.961556098Z] loading plugin "io.containerd.internal.v1.restart"... type=io.containerd.internal.v1
INFO[2023-08-05T22:55:18.961601250Z] loading plugin "io.containerd.grpc.v1.containers"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961612893Z] loading plugin "io.containerd.grpc.v1.content"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961624634Z] loading plugin "io.containerd.grpc.v1.diff"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961640201Z] loading plugin "io.containerd.grpc.v1.events"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961656284Z] loading plugin "io.containerd.grpc.v1.healthcheck"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961673180Z] loading plugin "io.containerd.grpc.v1.images"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961696965Z] loading plugin "io.containerd.grpc.v1.leases"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961711651Z] loading plugin "io.containerd.grpc.v1.namespaces"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961727461Z] loading plugin "io.containerd.internal.v1.opt"... type=io.containerd.internal.v1
WARN[2023-08-05T22:55:18.961760550Z] failed to load plugin io.containerd.internal.v1.opt error="mkdir /opt/containerd: permission denied"
INFO[2023-08-05T22:55:18.961775618Z] loading plugin "io.containerd.grpc.v1.sandbox-controllers"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961792023Z] loading plugin "io.containerd.grpc.v1.sandboxes"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961806697Z] loading plugin "io.containerd.grpc.v1.snapshots"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961821318Z] loading plugin "io.containerd.grpc.v1.streaming"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961837177Z] loading plugin "io.containerd.grpc.v1.tasks"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961854777Z] loading plugin "io.containerd.grpc.v1.transfer"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961869432Z] loading plugin "io.containerd.grpc.v1.version"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.961883475Z] loading plugin "io.containerd.grpc.v1.cri"... type=io.containerd.grpc.v1
INFO[2023-08-05T22:55:18.962044746Z] Start cri plugin with config {PluginConfig:{ContainerdConfig:{Snapshotter:fuse-overlayfs DefaultRuntimeName:crun DefaultRuntime:{Type: Path: Engine: PodAnnotations:[] ContainerAnnotations:[] Root: Options:map[] PrivilegedWithoutHostDevices:false PrivilegedWithoutHostDevicesAllDevicesAllowed:false BaseRuntimeSpec: NetworkPluginConfDir: NetworkPluginMaxConfNum:0 Snapshotter: SandboxMode:} UntrustedWorkloadRuntime:{Type: Path: Engine: PodAnnotations:[] ContainerAnnotations:[] Root: Options:map[] PrivilegedWithoutHostDevices:false PrivilegedWithoutHostDevicesAllDevicesAllowed:false BaseRuntimeSpec: NetworkPluginConfDir: NetworkPluginMaxConfNum:0 Snapshotter: SandboxMode:} Runtimes:map[crun:{Type:io.containerd.runc.v2 Path: Engine: PodAnnotations:[] ContainerAnnotations:[] Root: Options:map[BinaryName:crun] PrivilegedWithoutHostDevices:false PrivilegedWithoutHostDevicesAllDevicesAllowed:false BaseRuntimeSpec: NetworkPluginConfDir: NetworkPluginMaxConfNum:0 Snapshotter: SandboxMode:podsandbox}] NoPivot:false DisableSnapshotAnnotations:true DiscardUnpackedLayers:false IgnoreBlockIONotEnabledErrors:false IgnoreRdtNotEnabledErrors:false} CniConfig:{NetworkPluginBinDir:/opt/cni/bin NetworkPluginConfDir:/etc/cni/net.d NetworkPluginMaxConfNum:1 NetworkPluginSetupSerially:false NetworkPluginConfTemplate: IPPreference:} Registry:{ConfigPath: Mirrors:map[] Configs:map[] Auths:map[] Headers:map[]} ImageDecryption:{KeyModel:node} DisableTCPService:true StreamServerAddress:127.0.0.1 StreamServerPort:0 StreamIdleTimeout:4h0m0s EnableSelinux:false SelinuxCategoryRange:1024 SandboxImage:registry.k8s.io/pause:3.8 StatsCollectPeriod:10 SystemdCgroup:false EnableTLSStreaming:false X509KeyPairStreaming:{TLSCertFile: TLSKeyFile:} MaxContainerLogLineSize:16384 DisableCgroup:false DisableApparmor:true RestrictOOMScoreAdj:true MaxConcurrentDownloads:3 DisableProcMount:false UnsetSeccompProfile: TolerateMissingHugetlbController:true DisableHugetlbController:true DeviceOwnershipFromSecurityContext:false IgnoreImageDefinedVolumes:false NetNSMountsUnderStateDir:false EnableUnprivilegedPorts:false EnableUnprivilegedICMP:false EnableCDI:false CDISpecDirs:[/etc/cdi /var/run/cdi] ImagePullProgressTimeout:1m0s DrainExecSyncIOTimeout:0s} ContainerdRootDir:/home/sochat1_llnl_gov/.local/share/usernetes/containerd ContainerdEndpoint:/run/user/501043911/usernetes/containerd/containerd.sock RootDir:/home/sochat1_llnl_gov/.local/share/usernetes/containerd/io.containerd.grpc.v1.cri StateDir:/run/user/501043911/usernetes/containerd/io.containerd.grpc.v1.cri}
INFO[2023-08-05T22:55:18.962080032Z] Connect containerd service
INFO[2023-08-05T22:55:18.962104066Z] using legacy CRI server
INFO[2023-08-05T22:55:18.962112969Z] using experimental NRI integration - disable nri plugin to prevent this
INFO[2023-08-05T22:55:18.962138243Z] Get image filesystem path "/home/sochat1_llnl_gov/.local/share/usernetes/containerd/io.containerd.snapshotter.v1.fuse-overlayfs"
WARN[2023-08-05T22:55:18.962391475Z] failed to load plugin io.containerd.grpc.v1.cri error="failed to create CRI service: failed to create cni conf monitor for default: failed to create the parent of the cni conf dir=/etc/cni: mkdir /etc/cni: permission denied"
INFO[2023-08-05T22:55:18.962408699Z] loading plugin "io.containerd.tracing.processor.v1.otlp"... type=io.containerd.tracing.processor.v1
INFO[2023-08-05T22:55:18.962423577Z] skip loading plugin "io.containerd.tracing.processor.v1.otlp"... error="no OpenTelemetry endpoint: skip plugin" type=io.containerd.tracing.processor.v1
INFO[2023-08-05T22:55:18.962434748Z] loading plugin "io.containerd.internal.v1.tracing"... type=io.containerd.internal.v1
INFO[2023-08-05T22:55:18.962444872Z] skipping tracing processor initialization (no tracing plugin) error="no OpenTelemetry endpoint: skip plugin"
containerd: failed to get listener for main ttrpc endpoint: chown /run/user/501043911/usernetes/containerd/containerd.sock.ttrpc: operation not permitted
When I wrap it in rootlesskit.sh it doesn't work, because it says there is already a lock file in my user XDG home. But after running the above, at least there are files in that location for containerd:
$ tree /run/user/501043911/usernetes/containerd/
/run/user/501043911/usernetes/containerd/
├── io.containerd.runtime.v1.linux
└── io.containerd.runtime.v2.task
Maybe that gives you some hints?
okay, going for a bike ride / run, shutting this down for now! Thanks for helping on a Saturday!
Heyo! Wanted to ping for next week and see if anyone (@aojea or @AkihiroSuda) had thoughts about the above? What should we try next? If you don't have ideas, I could take a shot at a PR to try to generalize the scripts so they aren't hard-coded for docker-compose (in case there is some tiny error in there leading to the behavior here). Happy Sunday!
Hi! I have a simple question that I didn't see answered in the README or with a quick search - does usernetes support multiple hosts, or does it assume running on one host? I saw it is using slirp4netns, which seems to be the same (and thus might result in the same problem) as what I'm hitting with k3s: https://github.com/k3s-io/k3s/discussions/7615#discussioncomment-6016006. I also see that k3s uses usernetes? So maybe it's exactly the same problem! Thanks for your help!