fntlnz opened 3 years ago
/schedule
I think we have two roads for this.
@leodido suggested that we can make the process dumpable with prctl(PR_SET_DUMPABLE, 1, 0, 0, 0) (better than making processes dumpable system-wide with echo 1 > /proc/sys/fs/suid_dumpable). Not sure about the security implications of this just yet.
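A minimal sketch of the per-process variant, using ctypes on Linux (constants copied from <linux/prctl.h>; the helper names are made up and this is untested inside an actual workspace):

```python
import ctypes
import sys

# Constants from <linux/prctl.h>
PR_GET_DUMPABLE = 3
PR_SET_DUMPABLE = 4

def set_dumpable(enable: bool) -> None:
    """Set this process's dumpable attribute via prctl(2) (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.prctl(PR_SET_DUMPABLE, int(enable), 0, 0, 0) != 0:
        raise OSError(ctypes.get_errno(), "prctl(PR_SET_DUMPABLE) failed")

def is_dumpable() -> bool:
    """Read the dumpable attribute back via prctl(2)."""
    libc = ctypes.CDLL(None, use_errno=True)
    return libc.prctl(PR_GET_DUMPABLE, 0, 0, 0, 0) == 1

if __name__ == "__main__" and sys.platform == "linux":
    set_dumpable(True)
    print("dumpable:", is_dumpable())
```

Unlike the sysctl, this only affects the calling process, which is why it looks preferable to flipping suid_dumpable system-wide.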
OR
Maybe it's a crazy and wrong idea, but what if we use our seccomp notify mechanism to pass an unprivileged file handle and return that instead of the actual one in the proc filesystem?

Probably a stupid question, but would that require SECCOMP_NOTIFY_IOCTL_ADDFD?
> I think we have two roads for this.
> @leodido suggested that we can make the process dumpable with prctl(PR_SET_DUMPABLE, 1, 0, 0, 0) (better than making dumpable system-wide with echo 1 > /proc/sys/fs/suid_dumpable). Not sure about the security implications of this just yet.
> OR
> Maybe it's a crazy and wrong idea but what if we use our seccomp notify mechanism to pass an unprivileged file handle and return that instead of the actual one in the proc filesystem?
Yup! This code in the Kernel could be the reason why that write fails.
A first attempt would be to play with the sysctl knob (/proc/sys/fs/suid_dumpable, which at the moment is set to 2).
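For reference, the three values that knob accepts (as documented in the kernel's fs sysctl documentation) can be captured in a small lookup; a sketch with an invented helper name:

```python
# Meanings of the fs.suid_dumpable sysctl, per the kernel's
# Documentation/admin-guide/sysctl/fs.rst.
SUID_DUMPABLE_MEANINGS = {
    0: "default: processes that changed privileges are not dumpable",
    1: "debug: all processes dump core when possible",
    2: "suidsafe: core dumps of privileged processes readable by root only",
}

def describe_suid_dumpable(raw: str) -> str:
    """Map the raw contents of /proc/sys/fs/suid_dumpable to its meaning."""
    value = int(raw.strip())
    return SUID_DUMPABLE_MEANINGS.get(value, f"unknown value {value}")
```

E.g. describe_suid_dumpable(open("/proc/sys/fs/suid_dumpable").read()) on the affected node should report the suidsafe mode mentioned above.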
A quick update on the progress made by @fntlnz and me (also thanks to @csweichel 🤗).
^^^ These are the settings for /etc/subuid and /etc/subgid we used.
Then inspecting with strace:
[pid 16099] openat(AT_FDCWD, "/dev/kmsg", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid 16099] openat(AT_FDCWD, "/etc/localtime", O_RDONLY) = 7
[pid 16099] write(2, "\33[31mFATA\33[0m[2021-07-21T16:00:0"..., 92FATA[2021-07-21T16:00:01.195076930Z] no such file or directory
) = 92
Creating /dev/kmsg (in the host) made us progress further.
The next error is about permissions on /root/.rancher/.... Specifying a directory (or playing with symlinks) in /workspace removes such an error.
So, we got k3s to start in rootless mode, but it's not functioning properly yet.
For example:
gitpod /workspace/gitpod $ sudo k3s kubectl get pods --all-namespaces
Unable to connect to the server: x509: certificate signed by unknown authority
/assign @fntlnz
/assign @leodido
We (mainly @leodido) managed to get a bit further on this journey. The next issue is the recent change in runc; see https://github.com/gitpod-io/gitpod/issues/5124.
Complete list of steps:

1. Install k3s:
   curl -sfL https://get.k3s.io | sh -
2. Create a /dev/kmsg file in the workspace. This must be done as root from the node, but the file can be empty; a simple touch is enough.
3. Make sure newuidmap and mount.fuse3 are present:
   sudo apt-get update
   sudo apt-get install fuse3 uidmap
4. Make sure the ranges in /etc/sub*id fall within /proc/self/uid_map and /proc/self/gid_map respectively, e.g. by changing /etc/subuid and /etc/subgid to:
   gitpod:1000:1000
5. Enter a new cgroup namespace and become the gitpod user again:
   sudo unshare -C bash
   su gitpod
6. Deal with /workspace - the shiftfs mount interferes with the kubelet:
   mkdir /workspace/kubelet
   sudo mount --rbind /workspace/kubelet /home/gitpod/.rancher/k3s/agent/kubelet
7. Set XDG_RUNTIME_DIR to something sensible:
   export XDG_RUNTIME_DIR=/workspace/k3s/config
   mkdir -p -m 700 $XDG_RUNTIME_DIR
8. Start the server:
   k3s server --rootless --snapshotter=fuse-overlayfs --debug

To keep the k3s installation from erroring out, run it like so:
curl -sfL https://get.k3s.io | INSTALL_K3S_SKIP_ENABLE=true INSTALL_K3S_SKIP_START=true INSTALL_K3S_SYMLINK=skip sh -

To get flooded (😄) by debug logs use:
k3s server --rootless --snapshotter=fuse-overlayfs --debug -v 10
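The containment requirement from the /etc/sub*id step can be checked mechanically. A sketch, under the assumption that the sub*id range has to sit inside one of the ID ranges valid in the current namespace (the first column of /proc/self/uid_map); helper names are made up:

```python
def parse_subid(line: str):
    """Parse an /etc/subuid-style entry: 'user:start:count'."""
    user, start, count = line.strip().split(":")
    return user, int(start), int(count)

def parse_uid_map(text: str):
    """Parse /proc/self/uid_map lines: 'id-inside-ns id-outside-ns length'.
    Returns half-open [lo, hi) ranges of IDs valid inside the namespace."""
    ranges = []
    for line in text.strip().splitlines():
        inside, _outside, count = (int(field) for field in line.split())
        ranges.append((inside, inside + count))
    return ranges

def subid_range_mapped(subid_line: str, uid_map_text: str) -> bool:
    """True if the whole sub*id range falls within some uid_map range."""
    _, start, count = parse_subid(subid_line)
    return any(lo <= start and start + count <= hi
               for lo, hi in parse_uid_map(uid_map_text))
```

With the gitpod:1000:1000 entry from above, subid_range_mapped("gitpod:1000:1000", open("/proc/self/uid_map").read()) should come back True in a correctly configured workspace.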
We've tested the steps above on a branch with #5139 (the runc proc mount fix), and have encountered the following error:
{"args":["/app/nsinsider","move-mount","--target","/run/k3s/containerd/io.containerd.runtime.v2.task/k8s.io/8af8c2bf9a4d742b5110810bd549dd8731936d25df6d25baf715fec19fc651ca/rootfs/proc","--pipe-fd","3"],"instanceId":"48fb29e2-fdb3-434a-9133-86c8c9d3e250","level":"fatal","message":"permission denied","serviceContext":{"service":"nsinsider","version":""},"severity":"CRITICAL","time":"2021-08-10T18:46:00Z"}
It's really odd that the move_mount syscall would fail with permission denied considering it's run as root. According to the move_mount man page this happens when:

> To select a mount object, no permissions are required on the object referred to by the path, but execute (search) permission is required on all of the directories in pathname that lead to the object.

and, respectively:

> Search permission is denied for one of the directories in the path prefix of pathname.

Considering there's nothing inherently special about root here, it's conceivable that the target path does not grant execute permission to world, hence the EACCES ("permission denied"). This is hard to debug because the paths are very short lived.
Any progress/update on this issue? cc @leodido @csweichel
I am not sure who's working on this ATM. Maybe remove the "groundwork: in progress" label?
@csweichel Sorry to bother, I've been trying to follow the steps from https://github.com/gitpod-io/gitpod/issues/4889#issuecomment-895385399 and I got stuck at step 2: how do I "sudo touch /dev/kmsg" from the node level? Do I need to set up some magic before that?
No bother at all. You'd need to run this experiment on a Gitpod installation where you have access to the cluster Gitpod is running on. Once your workspace is running, you can:

1. exec into ws-daemon on the machine your workspace is running on (kubectl get pod -o yaml and kubectl describe node come in very handy here)
2. run sleep 1234 in the workspace, and then look at the process table from within ws-daemon
3. nsenter -t <sleepPID or workspace supervisor PID> -m touch /dev/kmsg

You cannot do this from within the workspace because there /dev is a bind mount to /dev of the workspace's Kubernetes pod, hence you don't have permission for this operation.
@csweichel with the new k3s clusters I now get
$ sudo k3s server --rootless --snapshotter=fuse-overlayfs --debug -v 10
INFO[0000] Acquiring lock file /var/lib/rancher/k3s/data/.lock
INFO[0000] Preparing data dir /var/lib/rancher/k3s/data/9d8f9670e1bff08a901bc7bc270202323f7c2c716a89a73d776c363ac1971018
DEBU[0001] Verified hash aux/ebtablesd is correct
DEBU[0001] Verified hash aux/iptables-detect.sh is correct
DEBU[0001] Verified hash aux/xtables-set-mode.sh is correct
DEBU[0001] Verified hash charon is correct
DEBU[0001] Verified hash slirp4netns is correct
DEBU[0001] Verified hash socat is correct
DEBU[0001] Verified hash swanctl is correct
DEBU[0001] Verified hash aux/ebtables-legacy is correct
DEBU[0001] Verified hash conntrack is correct
DEBU[0001] Verified hash containerd is correct
DEBU[0001] Verified hash ip is correct
DEBU[0001] Verified hash ipset is correct
DEBU[0001] Verified hash aux/ebtablesu is correct
DEBU[0001] Verified hash aux/xtables-legacy-multi is correct
DEBU[0001] Verified hash blkid is correct
DEBU[0001] Verified hash find is correct
DEBU[0001] Verified hash cni is correct
DEBU[0001] Verified hash containerd-shim is correct
DEBU[0001] Verified hash aux/wg-add.sh is correct
DEBU[0001] Verified hash aux/xtables-nft-multi is correct
DEBU[0001] Verified hash coreutils is correct
DEBU[0001] Verified hash ethtool is correct
DEBU[0001] Verified hash runc is correct
DEBU[0001] Verified hash aux/iptables-apply is correct
DEBU[0001] Verified hash fuse-overlayfs is correct
DEBU[0001] Verified hash losetup is correct
DEBU[0001] Verified hash pigz is correct
DEBU[0001] Verified hash containerd-shim-runc-v2 is correct
DEBU[0001] Verified hash nsenter is correct
DEBU[0001] Verified hash busybox is correct
DEBU[0001] Verified hash check-config is correct
DEBU[0001] Verified link tee is correct
DEBU[0001] Verified link aux/ip6tables-legacy-restore is correct
DEBU[0001] Verified link head is correct
DEBU[0001] Verified link sync is correct
DEBU[0001] Verified link aux/ip6tables is correct
DEBU[0001] Verified link expand is correct
DEBU[0001] Verified link fdformat is correct
DEBU[0001] Verified link strings is correct
DEBU[0001] Verified link udhcpc is correct
DEBU[0001] Verified link aux/iptables-legacy-save is correct
DEBU[0001] Verified link dirname is correct
DEBU[0001] Verified link setserial is correct
DEBU[0001] Verified link ctr is correct
DEBU[0001] Verified link deallocvt is correct
DEBU[0001] Verified link delgroup is correct
DEBU[0001] Verified link fsfreeze is correct
DEBU[0001] Verified link k3s-server is correct
DEBU[0001] Verified link aux/ebtables-nft-restore is correct
DEBU[0001] Verified link crond is correct
DEBU[0001] Verified link csplit is correct
DEBU[0001] Verified link microcom is correct
DEBU[0001] Verified link ptx is correct
DEBU[0001] Verified link wget is correct
DEBU[0001] Verified link timeout is correct
DEBU[0001] Verified link usleep is correct
DEBU[0001] Verified link fmt is correct
DEBU[0001] Verified link fsck is correct
DEBU[0001] Verified link lzcat is correct
DEBU[0001] Verified link addgroup is correct
DEBU[0001] Verified link aux/iptables-legacy-restore is correct
DEBU[0001] Verified link i2cdetect is correct
DEBU[0001] Verified link freeramdisk is correct
DEBU[0001] Verified link nslookup is correct
DEBU[0001] Verified link sysctl is correct
DEBU[0001] Verified link sha512sum is correct
DEBU[0001] Verified link yes is correct
DEBU[0001] Verified link cat is correct
DEBU[0001] Verified link mdev is correct
DEBU[0001] Verified link pivot_root is correct
DEBU[0001] Verified link mkdosfs is correct
DEBU[0001] Verified link time is correct
DEBU[0001] Verified link [[ is correct
DEBU[0001] Verified link aux/iptables-nft-save is correct
DEBU[0001] Verified link linuxrc is correct
DEBU[0001] Verified link i2cget is correct
DEBU[0001] Verified link i2ctransfer is correct
DEBU[0001] Verified link nameif is correct
DEBU[0001] Verified link sleep is correct
DEBU[0001] Verified link sum is correct
DEBU[0001] Verified link adduser is correct
DEBU[0001] Verified link arch is correct
DEBU[0001] Verified link fbset is correct
DEBU[0001] Verified link uudecode is correct
DEBU[0001] Verified link setpriv is correct
DEBU[0001] Verified link vconfig is correct
DEBU[0001] Verified link vi is correct
DEBU[0001] Verified link more is correct
DEBU[0001] Verified link mv is correct
DEBU[0001] Verified link watchdog is correct
DEBU[0001] Verified link pathchk is correct
DEBU[0001] Verified link printenv is correct
DEBU[0001] Verified link rdate is correct
DEBU[0001] Verified link run-init is correct
DEBU[0001] Verified link stat is correct
DEBU[0001] Verified link aux/ip6tables-nft-save is correct
DEBU[0001] Verified link crontab is correct
DEBU[0001] Verified link du is correct
DEBU[0001] Verified link login is correct
DEBU[0001] Verified link lzma is correct
DEBU[0001] Verified link rmdir is correct
DEBU[0001] Verified link killall is correct
DEBU[0001] Verified link mkpasswd is correct
DEBU[0001] Verified link od is correct
DEBU[0001] Verified link wc is correct
DEBU[0001] Verified link whoami is correct
DEBU[0001] Verified link crictl is correct
DEBU[0001] Verified link eject is correct
DEBU[0001] Verified link getty is correct
DEBU[0001] Verified link watch is correct
DEBU[0001] Verified link aux/ebtables-restore is correct
DEBU[0001] Verified link hostname is correct
DEBU[0001] Verified link mim is correct
DEBU[0001] Verified link chmod is correct
DEBU[0001] Verified link hostid is correct
DEBU[0001] Verified link ifconfig is correct
DEBU[0001] Verified link setlogcons is correct
DEBU[0001] Verified link ash is correct
DEBU[0001] Verified link aux/iptables-restore is correct
DEBU[0001] Verified link aux/iptables-restore-translate is correct
DEBU[0001] Verified link ubirename is correct
DEBU[0001] Verified link df is correct
DEBU[0001] Verified link nologin is correct
DEBU[0001] Verified link rmmod is correct
DEBU[0001] Verified link bunzip2 is correct
DEBU[0001] Verified link dc is correct
DEBU[0001] Verified link ln is correct
DEBU[0001] Verified link arp is correct
DEBU[0001] Verified link aux/iptables-translate is correct
DEBU[0001] Verified link vlock is correct
DEBU[0001] Verified link split is correct
DEBU[0001] Verified link unlzma is correct
DEBU[0001] Verified link grep is correct
DEBU[0001] Verified link mkfifo is correct
DEBU[0001] Verified link pwd is correct
DEBU[0001] Verified link reset is correct
DEBU[0001] Verified link gunzip is correct
DEBU[0001] Verified link ipcs is correct
DEBU[0001] Verified link ipneigh is correct
DEBU[0001] Verified link groups is correct
DEBU[0001] Verified link paste is correct
DEBU[0001] Verified link setconsole is correct
DEBU[0001] Verified link aux/arptables-restore is correct
DEBU[0001] Verified link aux/ebtables is correct
DEBU[0001] Verified link fallocate is correct
DEBU[0001] Verified link aux/arptables is correct
DEBU[0001] Verified link awk is correct
DEBU[0001] Verified link init is correct
DEBU[0001] Verified link pr is correct
DEBU[0001] Verified link bc is correct
DEBU[0001] Verified link chroot is correct
DEBU[0001] Verified link deluser is correct
DEBU[0001] Verified link inetd is correct
DEBU[0001] Verified link lsmod is correct
DEBU[0001] Verified link aux/arptables-nft is correct
DEBU[0001] Verified link aux/ip6tables-restore-translate is correct
DEBU[0001] Verified link aux/iptables is correct
DEBU[0001] Verified link runlevel is correct
DEBU[0001] Verified link uuencode is correct
DEBU[0001] Verified link fold is correct
DEBU[0001] Verified link host-local is correct
DEBU[0001] Verified link portmap is correct
DEBU[0001] Verified link ar is correct
DEBU[0001] Verified link chown is correct
DEBU[0001] Verified link flannel is correct
DEBU[0001] Verified link nice is correct
DEBU[0001] Verified link nuke is correct
DEBU[0001] Verified link shuf is correct
DEBU[0001] Verified link uevent is correct
DEBU[0001] Verified link dos2unix is correct
DEBU[0001] Verified link install is correct
DEBU[0001] Verified link kubectl is correct
DEBU[0001] Verified link unlink is correct
DEBU[0001] Verified link unpigz is correct
DEBU[0001] Verified link klogd is correct
DEBU[0001] Verified link pinky is correct
DEBU[0001] Verified link run-parts is correct
DEBU[0001] Verified link fuser is correct
DEBU[0001] Verified link halt is correct
DEBU[0001] Verified link join is correct
DEBU[0001] Verified link last is correct
DEBU[0001] Verified link mesg is correct
DEBU[0001] Verified link aux/ebtables-save is correct
DEBU[0001] Verified link chvt is correct
DEBU[0001] Verified link dnsdomainname is correct
DEBU[0001] Verified link mkdir is correct
DEBU[0001] Verified link top is correct
DEBU[0001] Verified link readlink is correct
DEBU[0001] Verified link aux/arptables-save is correct
DEBU[0001] Verified link aux/ip6tables-nft is correct
DEBU[0001] Verified link echo is correct
DEBU[0001] Verified link less is correct
DEBU[0001] Verified link linux64 is correct
DEBU[0001] Verified link which is correct
DEBU[0001] Verified link bridge is correct
DEBU[0001] Verified link dd is correct
DEBU[0001] Verified link partprobe is correct
DEBU[0001] Verified link route is correct
DEBU[0001] Verified link sha3sum is correct
DEBU[0001] Verified link switch_root is correct
DEBU[0001] Verified link tail is correct
DEBU[0001] Verified link [ is correct
DEBU[0001] Verified link dir is correct
DEBU[0001] Verified link iprule is correct
DEBU[0001] Verified link sha224sum is correct
DEBU[0001] Verified link aux/arptables-nft-restore is correct
DEBU[0001] Verified link aux/ebtables-nft is correct
DEBU[0001] Verified link dnsd is correct
DEBU[0001] Verified link linux32 is correct
DEBU[0001] Verified link netstat is correct
DEBU[0001] Verified link tsort is correct
DEBU[0001] Verified link who is correct
DEBU[0001] Verified link aux/iptables-save is correct
DEBU[0001] Verified link dumpkmap is correct
DEBU[0001] Verified link gzip is correct
DEBU[0001] Verified link iproute is correct
DEBU[0001] Verified link makedevs is correct
DEBU[0001] Verified link nl is correct
DEBU[0001] Verified link k3s-agent is correct
DEBU[0001] Verified link k3s-etcd-snapshot is correct
DEBU[0001] Verified link sha256sum is correct
DEBU[0001] Verified link aux/ip6tables-apply is correct
DEBU[0001] Verified link unix2dos is correct
DEBU[0001] Verified link nohup is correct
DEBU[0001] Verified link aux/ip6tables-translate is correct
DEBU[0001] Verified link cksum is correct
DEBU[0001] Verified link comm is correct
DEBU[0001] Verified link tty is correct
DEBU[0001] Verified link getopt is correct
DEBU[0001] Verified link shred is correct
DEBU[0001] Verified link tac is correct
DEBU[0001] Verified link mke2fs is correct
DEBU[0001] Verified link pidof is correct
DEBU[0001] Verified link test is correct
DEBU[0001] Verified link aux/ip6tables-legacy is correct
DEBU[0001] Verified link ether-wake is correct
DEBU[0001] Verified link lsusb is correct
DEBU[0001] Verified link base64 is correct
DEBU[0001] Verified link factor is correct
DEBU[0001] Verified link loadkmap is correct
DEBU[0001] Verified link lzopcat is correct
DEBU[0001] Verified link sed is correct
DEBU[0001] Verified link arping is correct
DEBU[0001] Verified link aux/iptables-legacy is correct
DEBU[0001] Verified link aux/iptables-nft is correct
DEBU[0001] Verified link su is correct
DEBU[0001] Verified link syslogd is correct
DEBU[0001] Verified link resize is correct
DEBU[0001] Verified link true is correct
DEBU[0001] Verified link aux/ip6tables-restore is correct
DEBU[0001] Verified link diff is correct
DEBU[0001] Verified link realpath is correct
DEBU[0001] Verified link aux/modprobe is correct
DEBU[0001] Verified link reboot is correct
DEBU[0001] Verified link stty is correct
DEBU[0001] Verified link mountpoint is correct
DEBU[0001] Verified link patch is correct
DEBU[0001] Verified link sha384sum is correct
DEBU[0001] Verified link sulogin is correct
DEBU[0001] Verified link xzcat is correct
DEBU[0001] Verified link aux/mount is correct
DEBU[0001] Verified link chcon is correct
DEBU[0001] Verified link killall5 is correct
DEBU[0001] Verified link sha1sum is correct
DEBU[0001] Verified link zcat is correct
DEBU[0001] Verified link aux/arptables-nft-save is correct
DEBU[0001] Verified link cpio is correct
DEBU[0001] Verified link rm is correct
DEBU[0001] Verified link env is correct
DEBU[0001] Verified link loopback is correct
DEBU[0001] Verified link mktemp is correct
DEBU[0001] Verified link ps is correct
DEBU[0001] Verified link uname is correct
DEBU[0001] Verified link aux/ebtables-nft-save is correct
DEBU[0001] Verified link basenc is correct
DEBU[0001] Verified link egrep is correct
DEBU[0001] Verified link uptime is correct
DEBU[0001] Verified link xxd is correct
DEBU[0001] Verified link ipcrm is correct
DEBU[0001] Verified link traceroute is correct
DEBU[0001] Verified link uniq is correct
DEBU[0001] Verified link fdflush is correct
DEBU[0001] Verified link sh is correct
DEBU[0001] Verified link ts is correct
DEBU[0001] Verified link poweroff is correct
DEBU[0001] Verified link setfattr is correct
DEBU[0001] Verified link setkeycodes is correct
DEBU[0001] Verified link sort is correct
DEBU[0001] Verified link users is correct
DEBU[0001] Verified link xz is correct
DEBU[0001] Verified link b2sum is correct
DEBU[0001] Verified link chgrp is correct
DEBU[0001] Verified link pipe_progress is correct
DEBU[0001] Verified link tftp is correct
DEBU[0001] Verified link unzip is correct
DEBU[0001] Verified link cut is correct
DEBU[0001] Verified link ifdown is correct
DEBU[0001] Verified link insmod is correct
DEBU[0001] Verified link kill is correct
DEBU[0001] Verified link logname is correct
DEBU[0001] Verified link lsattr is correct
DEBU[0001] Verified link truncate is correct
DEBU[0001] Verified link base32 is correct
DEBU[0001] Verified link basename is correct
DEBU[0001] Verified link iplink is correct
DEBU[0001] Verified link ifup is correct
DEBU[0001] Verified link ls is correct
DEBU[0001] Verified link printf is correct
DEBU[0001] Verified link loadfont is correct
DEBU[0001] Verified link nproc is correct
DEBU[0001] Verified link openvt is correct
DEBU[0001] Verified link runcon is correct
DEBU[0001] Verified link vdir is correct
DEBU[0001] Verified link aux/xtables-monitor is correct
DEBU[0001] Verified link dircolors is correct
DEBU[0001] Verified link expr is correct
DEBU[0001] Verified link mt is correct
DEBU[0001] Verified link tr is correct
DEBU[0001] Verified link w is correct
DEBU[0001] Verified link chattr is correct
DEBU[0001] Verified link hwclock is correct
DEBU[0001] Verified link lsscsi is correct
DEBU[0001] Verified link k3s is correct
DEBU[0001] Verified link logger is correct
DEBU[0001] Verified link lspci is correct
DEBU[0001] Verified link numfmt is correct
DEBU[0001] Verified link svc is correct
DEBU[0001] Verified link bzcat is correct
DEBU[0001] Verified link cp is correct
DEBU[0001] Verified link date is correct
DEBU[0001] Verified link telnet is correct
DEBU[0001] Verified link iptunnel is correct
DEBU[0001] Verified link passwd is correct
DEBU[0001] Verified link ping is correct
DEBU[0001] Verified link resume is correct
DEBU[0001] Verified link svok is correct
DEBU[0001] Verified link aux/iptables-nft-restore is correct
DEBU[0001] Verified link false is correct
DEBU[0001] Verified link hdparm is correct
DEBU[0001] Verified link unxz is correct
DEBU[0001] Verified link free is correct
DEBU[0001] Verified link lsof is correct
DEBU[0001] Verified link mknod is correct
DEBU[0001] Verified link setarch is correct
DEBU[0001] Verified link start-stop-daemon is correct
DEBU[0001] Verified link aux/ip6tables-legacy-save is correct
DEBU[0001] Verified link aux/ip6tables-save is correct
DEBU[0001] Verified link cmp is correct
DEBU[0001] Verified link seq is correct
DEBU[0001] Verified link touch is correct
DEBU[0001] Verified link i2cset is correct
DEBU[0001] Verified link id is correct
DEBU[0001] Verified link link is correct
DEBU[0001] Verified link umount is correct
DEBU[0001] Verified link unlzop is correct
DEBU[0001] Verified link clear is correct
DEBU[0001] Verified link fgrep is correct
DEBU[0001] Verified link md5sum is correct
DEBU[0001] Verified link hexedit is correct
DEBU[0001] Verified link i2cdump is correct
DEBU[0001] Verified link ipaddr is correct
DEBU[0001] Verified link tar is correct
DEBU[0001] Verified link unexpand is correct
DEBU[0001] Verified link aux/ip6tables-nft-restore is correct
DEBU[0001] Verified link chrt is correct
DEBU[0001] Verified link devmem is correct
DEBU[0001] Asset dir /var/lib/rancher/k3s/data/9d8f9670e1bff08a901bc7bc270202323f7c2c716a89a73d776c363ac1971018
DEBU[0001] Running /var/lib/rancher/k3s/data/9d8f9670e1bff08a901bc7bc270202323f7c2c716a89a73d776c363ac1971018/bin/k3s-server [k3s server --rootless --snapshotter=fuse-overlayfs --debug -v 10]
DEBU[2021-12-01T23:41:49.474793550Z] Running rootless parent
FATA[2021-12-01T23:41:49.475061846Z] expected sysctl value "net.ipv4.ip_forward" to be "1", got "0"; try adding "net.ipv4.ip_forward=1" to /etc/sysctl.conf and running `sudo sysctl --system`
I ran the same command in a workspace with experimentalNetwork: true and got past that point. There's heaps of debug output:
After some small changes I was able to get k3s up & running. Not sure if natively, though. Available here
With the recent cgroup v2 fixes, I figured I'd give this another try. On a machine with cgroup v2 enabled, I ran the k3s server without an agent using the latest release, and started an agent separately using a custom build from https://github.com/k3s-io/k3s/commit/13728058a4e997d8e6168f473299918394f446ef to include the cgroup changes.
This got me a step closer, but it's still not quite working:
./gitpod/k3s agent -d /workspace/k3s_agent --token-file /workspace/k3s/server/token -s https://10.0.2.100:6443 --lb-server-port 6445 --node-ip 10.0.2.100 --with-node-id
INFO[0000] Starting k3s agent dev (HEAD)
INFO[0000] Running load balancer 127.0.0.1:6445 -> [10.0.2.100:6443]
INFO[0000] Module overlay was already loaded
INFO[0000] Module nf_conntrack was already loaded
INFO[0000] Module br_netfilter was already loaded
INFO[0000] Module iptable_nat was already loaded
INFO[0000] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400
ERRO[0000] Failed to set sysctl: open /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established: read-only file system
INFO[0000] Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600
ERRO[0000] Failed to set sysctl: open /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_close_wait: read-only file system
WARN[0000] cgroup v2 controllers are not delegated for rootless. Disabling cgroup.
INFO[0000] Logging containerd to /workspace/k3s_agent/agent/containerd/containerd.log
INFO[0000] Running containerd -c /workspace/k3s_agent/agent/etc/containerd/config.toml -a /run/k3s/containerd/containerd.sock --state /run/k3s/containerd --root /workspace/k3s_agent/agent/containerd
W0209 14:56:08.106424 26302 clientconn.go:1331] [core] grpc: addrConn.createTransport failed to connect to {/run/k3s/containerd/containerd.sock /run/k3s/containerd/containerd.sock <nil> 0 <nil>}. Err: connection error: desc = "transport: Error while dialing dial unix /run/k3s/containerd/containerd.sock: connect: no such file or directory". Reconnecting...
INFO[0001] Containerd is now running
INFO[0001] Connecting to proxy url="wss://10.0.2.100:6443/v1-k3s/connect"
INFO[0001] Running kubelet --address=0.0.0.0 --anonymous-auth=false --authentication-token-webhook=true --authorization-mode=Webhook --cgroup-driver=cgroupfs --client-ca-file=/workspace/k3s_agent/agent/client-ca.crt --cloud-provider=external --cluster-dns=10.43.0.10 --cluster-domain=cluster.local --cni-bin-dir=/workspace/k3s_original/data/current/bin --cni-conf-dir=/workspace/k3s_agent/agent/etc/cni/net.d --container-runtime-endpoint=unix:///run/k3s/containerd/containerd.sock --container-runtime=remote --containerd=/run/k3s/containerd/containerd.sock --eviction-hard=imagefs.available<5%,nodefs.available<5% --eviction-minimum-reclaim=imagefs.available=10%,nodefs.available=10% --fail-swap-on=false --feature-gates=DevicePlugins=false --healthz-bind-address=127.0.0.1 --hostname-override=gitpodio-gitpod-zgryzv745nv-0c1b976c --kubeconfig=/workspace/k3s_agent/agent/kubelet.kubeconfig --node-labels= --pod-manifest-path=/workspace/k3s_agent/agent/pod-manifests --read-only-port=0 --resolv-conf=/etc/resolv.conf --serialize-image-pulls=false --tls-cert-file=/workspace/k3s_agent/agent/serving-kubelet.crt --tls-private-key-file=/workspace/k3s_agent/agent/serving-kubelet.key
Flag --cloud-provider has been deprecated, will be removed in 1.24 or later, in favor of removing cloud provider code from Kubelet.
Flag --cni-bin-dir has been deprecated, will be removed along with dockershim.
Flag --cni-conf-dir has been deprecated, will be removed along with dockershim.
Flag --containerd has been deprecated, This is a cadvisor flag that was mistakenly registered with the Kubelet. Due to legacy concerns, it will follow the standard CLI deprecation timeline before being removed.
I0209 14:56:09.132082 26302 server.go:442] "Kubelet version" kubeletVersion="v1.23.3-k3s1"
W0209 14:56:09.133468 26302 manager.go:159] Cannot detect current cgroup on cgroup v2
I0209 14:56:09.133598 26302 dynamic_cafile_content.go:156] "Starting controller" name="client-ca-bundle::/workspace/k3s_agent/agent/client-ca.crt"
INFO[0001] Running kube-proxy --cluster-cidr=10.42.0.0/16 --conntrack-max-per-core=0 --conntrack-tcp-timeout-close-wait=0s --conntrack-tcp-timeout-established=0s --healthz-bind-address=127.0.0.1 --hostname-override=gitpodio-gitpod-zgryzv745nv-0c1b976c --kubeconfig=/workspace/k3s_agent/agent/kubeproxy.kubeconfig --proxy-mode=iptables
I0209 14:56:09.137456 26302 server.go:225] "Warning, all flags other than --config, --write-config-to, and --cleanup are deprecated, please begin using a config file ASAP"
E0209 14:56:09.138079 26302 proxier.go:647] "Failed to read builtin modules file, you can ignore this message when kube-proxy is running inside container without mounting /lib/modules" err="open /lib/modules/5.13.0-1013-gcp/modules.builtin: no such file or directory" filePath="/lib/modules/5.13.0-1013-gcp/modules.builtin"
I0209 14:56:09.138444 26302 proxier.go:657] "Failed to load kernel module with modprobe, you can ignore this message when kube-proxy is running inside container without mounting /lib/modules" moduleName="ip_vs"
I0209 14:56:09.138686 26302 proxier.go:657] "Failed to load kernel module with modprobe, you can ignore this message when kube-proxy is running inside container without mounting /lib/modules" moduleName="ip_vs_rr"
I0209 14:56:09.138928 26302 proxier.go:657] "Failed to load kernel module with modprobe, you can ignore this message when kube-proxy is running inside container without mounting /lib/modules" moduleName="ip_vs_wrr"
I0209 14:56:09.139164 26302 proxier.go:657] "Failed to load kernel module with modprobe, you can ignore this message when kube-proxy is running inside container without mounting /lib/modules" moduleName="ip_vs_sh"
I0209 14:56:09.139414 26302 proxier.go:657] "Failed to load kernel module with modprobe, you can ignore this message when kube-proxy is running inside container without mounting /lib/modules" moduleName="nf_conntrack"
WARN[0001] Running modprobe ip_vs failed with message: ``, error: exec: "modprobe": executable file not found in $PATH
E0209 14:56:09.149361 26302 node.go:152] Failed to retrieve node info: nodes "gitpodio-gitpod-zgryzv745nv-0c1b976c" not found
I0209 14:56:09.214120 26302 server.go:693] "--cgroups-per-qos enabled, but --cgroup-root was not specified. defaulting to /"
I0209 14:56:09.214343 26302 container_manager_linux.go:281] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
I0209 14:56:09.214415 26302 container_manager_linux.go:286] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:remote CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:cgroupfs KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerPolicyOptions:map[] ExperimentalTopologyManagerScope:container ExperimentalCPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none}
I0209 14:56:09.214495 26302 topology_manager.go:133] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
I0209 14:56:09.214514 26302 container_manager_linux.go:321] "Creating device plugin manager" devicePluginEnabled=false
I0209 14:56:09.214536 26302 state_mem.go:36] "Initialized new in-memory state store"
I0209 14:56:09.415249 26302 server.go:799] "Failed to ApplyOOMScoreAdj" err="write /proc/self/oom_score_adj: permission denied"
I0209 14:56:09.418599 26302 kubelet.go:416] "Attempting to sync node with API server"
I0209 14:56:09.418655 26302 kubelet.go:278] "Adding static pod path" path="/workspace/k3s_agent/agent/pod-manifests"
I0209 14:56:09.418687 26302 kubelet.go:289] "Adding apiserver pod source"
I0209 14:56:09.418730 26302 apiserver.go:42] "Waiting for node sync before watching apiserver pods"
I0209 14:56:09.419770 26302 kuberuntime_manager.go:248] "Container runtime initialized" containerRuntime="containerd" version="v1.5.9-k3s1" apiVersion="v1alpha2"
E0209 14:56:10.264664 26302 node.go:152] Failed to retrieve node info: nodes "gitpodio-gitpod-zgryzv745nv-0c1b976c" not found
I0209 14:56:10.419015 26302 apiserver.go:52] "Watching apiserver"
E0209 14:56:12.503021 26302 node.go:152] Failed to retrieve node info: nodes "gitpodio-gitpod-zgryzv745nv-0c1b976c" not found
E0209 14:56:17.072636 26302 node.go:152] Failed to retrieve node info: nodes "gitpodio-gitpod-zgryzv745nv-0c1b976c" not found
E0209 14:56:25.649130 26302 node.go:152] Failed to retrieve node info: nodes "gitpodio-gitpod-zgryzv745nv-0c1b976c" not found
E0209 14:56:44.739768 26302 node.go:152] Failed to retrieve node info: nodes "gitpodio-gitpod-zgryzv745nv-0c1b976c" not found
I0209 14:56:44.739803 26302 server.go:843] "Can't determine this node's IP, assuming 127.0.0.1; if this is incorrect, please set the --bind-address flag"
I0209 14:56:44.739814 26302 server_others.go:138] "Detected node IP" address="127.0.0.1"
I0209 14:56:44.746675 26302 server_others.go:199] "kube-proxy running in single-stack mode, this ipFamily is not supported" ipFamily=IPv6
I0209 14:56:44.746701 26302 server_others.go:206] "Using iptables Proxier"
I0209 14:56:44.747061 26302 server.go:656] "Version info" version="v1.23.3-k3s1"
I0209 14:56:44.948918 26302 config.go:317] "Starting service config controller"
I0209 14:56:44.949046 26302 shared_informer.go:240] Waiting for caches to sync for service config
I0209 14:56:44.948919 26302 config.go:226] "Starting endpoint slice config controller"
I0209 14:56:44.949227 26302 shared_informer.go:240] Waiting for caches to sync for endpoint slice config
I0209 14:56:45.049227 26302 shared_informer.go:247] Caches are synced for service config
I0209 14:56:45.049302 26302 shared_informer.go:247] Caches are synced for endpoint slice config
There were some etcd issues which may be related to the custom k3s build. I started a postgres instance using Docker and used it with:

```shell
export K3S_DATASTORE_ENDPOINT='postgres://postgres:mysecretpassword@10.0.2.100:5432/k3s?sslmode=disable'
```
Also, I ran into https://github.com/k3s-io/k3s/issues/346, which I worked around by adding `${DATA_DIR}/data/current/bin` to `PATH`.
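A sketch of that workaround, assuming `DATA_DIR` is the data directory passed to the k3s agent via `-d` (the `/workspace/k3s_p` value here is illustrative):

```shell
# Hypothetical workaround for k3s-io/k3s#346: prepend the bundled k3s
# binaries to PATH so the agent can find them.
DATA_DIR=/workspace/k3s_p
export PATH="${DATA_DIR}/data/current/bin:${PATH}"
echo "${PATH%%:*}"   # first PATH entry
```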
For good measure I tried to modify `/etc/hosts` to make the workspace hostname resolve to `10.0.2.100` (the workspace `tap0` IP) rather than `127.0.0.1`.
Once everything has settled, `kubectl get node` remains empty :(
Update: using the userspace proxier and setting the bind address of the kube-proxy made the error messages go away:

```shell
k3s agent -d /workspace/k3s_p --token-file /workspace/k3s_o/server/node-token -s https://localhost:6443 --lb-server-port 6445 --kube-proxy-arg proxy-mode=userspace --kube-proxy-arg bind-address=10.0.2.100 --debug
```
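Whatever proxy mode is chosen, it helps to know whether iptables is usable at all from inside the workspace. The probe below is a hypothetical sketch (it is not how kube-proxy itself detects support):

```shell
# Probe whether we can talk to netfilter at all; kube-proxy's default
# iptables mode needs this to program service rules.
if iptables -L -n >/dev/null 2>&1; then IPT=usable; else IPT=unavailable; fi
echo "iptables: $IPT"
```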
We're now in a state where the kubelet comes up and the node registers correctly. However, containers don't start yet, most likely because the kubelet reports `InvalidDiskCapacity`.
To make this happen, we had to:
- create `/dev/kmsg`
- make the `/sys/fs/cgroup` mount writable and accessible to UID `33333`: we `nsenter`'ed the ring2 namespaces from the node, mounted `cgroup2` in a tmp directory, `chmod`'ed the cgroup hierarchy and rbind-mounted it over `/sys/fs/cgroup`
- enable controllers in `cgroup.subtree_control`: to this end we created a cgroup `y` alongside the actual container cgroup `x`, evacuated all processes of the container into that new cgroup, enabled the subtree controller in `x`, created a child cgroup `x/ring2`, and moved the processes from `y` to `x/ring2`. Now we have a cgroup of type `domain` with `pids` and other threaded controllers enabled.

Note: the `"Failed to ensure state" containerName="/k3s" err="failed to apply oom score -999 to PID 21789: write /proc/21789/oom_score_adj: permission denied"` messages seem inconsequential. When we write the corresponding `oom_score_adj` file, the messages stop but scheduling does not improve.
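The `oom_score_adj` behaviour can be checked by hand: any process may raise its own score, but lowering it (as the kubelet attempts with `-999`) requires CAP_SYS_RESOURCE, which is consistent with the permission error above. A minimal check against the current shell:

```shell
# Raising our own oom_score_adj is always allowed; lowering it below the
# current value requires CAP_SYS_RESOURCE, hence the kubelet's EPERM when
# it tries to write -999.
echo 500 > "/proc/$$/oom_score_adj"
cat "/proc/$$/oom_score_adj"
```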
Notes for future implementation:
- cgroup evacuation: ring1 asks ws-daemon to prepare the cgroup structure, which creates:
  - `<container-group>/workspace`
  - `<container-group>/workspace/user` - ring2 stuff and below
  - `<container-group>/workspace/gitpod` - ring1
- ring1 moves itself into `<container-group>/workspace/gitpod`, ring2 gets moved to `<container-group>/workspace/user`
- chown `<container-group>/workspace` to gitpod:gitpod and `<container-group>/workspace/gitpod` to root:root
- enable controllers in `<container-group>/workspace`. `nsdelegate` means we don't have to `chown`
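The proposed layout can be sketched as a dry run, with a temp directory standing in for the container's cgroup under `/sys/fs/cgroup` (so this runs unprivileged); the steps that need the real cgroupfs and root privileges are left as comments:

```shell
# Dry-run sketch of the proposed cgroup layout. CGROOT stands in for the
# container's cgroup; here it's a temp dir so the structure can be built
# without privileges.
CGROOT="$(mktemp -d)"
mkdir -p "$CGROOT/workspace/user"    # ring2 stuff and below
mkdir -p "$CGROOT/workspace/gitpod"  # ring1
# On the real cgroupfs, ws-daemon would additionally:
#   chown gitpod:gitpod "$CGROOT/workspace"
#   chown root:root     "$CGROOT/workspace/gitpod"
#   echo "+pids" > "$CGROOT/workspace/cgroup.subtree_control"
find "$CGROOT" -mindepth 1 -type d | sort
```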
BEWARE: this is pretty (very extremely) far from production ready. It's entirely unclear which features work, which don't and what the caveats are. It's a first important step, but it's just that: a first step.
In addition to the things above, we had to:
- set `nodeName` with the pod spec - there seemed to be no scheduler active
- `nobody`

We ran k3s using:
```shell
# server
./k3s-feb6feeaeccc857a5744ef10efd82b18e8790e78 server --disable-agent -d /workspace/data/server

# agent
sudo ./k3s-feb6feeaeccc857a5744ef10efd82b18e8790e78 agent -d /workspace/k3s_p1 --token-file /workspace/data/server/server/node-token -s https://localhost:6443 --lb-server-port 6445
```
The pod spec we started was:
```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: foo
  name: foo
  namespace: default
  resourceVersion: "801"
spec:
  securityContext:
    runAsUser: 33333
    runAsGroup: 33333
    fsGroup: 33333
  containers:
  - image: docker.io/alpine:latest
    imagePullPolicy: Always
    name: foo
    command: ["/bin/sh", "-c", "--"]
    args: ["while true; do sleep 30; done;"]
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  nodeName: gitpodio-templatetypescr-lkag6us7i8b
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
```
I created a little test to see whether k3d can run in a gitpod workspace:
As of today, this doesn't work yet.
I included a log file for the kubernetes api server when it tried to start. These are the errors I think are most relevant, but you can also see the whole log file:
```
time="2022-03-18T19:03:50Z" level=info msg="Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_close_wait' to 3600"
time="2022-03-18T19:03:50Z" level=error msg="Failed to set sysctl: open /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_close_wait: read-only file system"
time="2022-03-18T19:03:50Z" level=info msg="Set sysctl 'net/netfilter/nf_conntrack_tcp_timeout_established' to 86400"
time="2022-03-18T19:03:50Z" level=error msg="Failed to set sysctl: open /proc/sys/net/netfilter/nf_conntrack_tcp_timeout_established: read-only file system"
```
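These failures happen because `/proc/sys` is mounted read-only inside the workspace, so the write fails regardless of the value being set. A quick probe using the sysctl path from the log above:

```shell
# Check whether the conntrack sysctl k3s wants to set is writable here;
# on a read-only /proc/sys (or if the file doesn't exist) this reports
# "read-only".
SYSCTL=/proc/sys/net/netfilter/nf_conntrack_tcp_timeout_close_wait
if [ -w "$SYSCTL" ]; then STATE=writable; else STATE=read-only; fi
echo "conntrack sysctl is $STATE"
```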
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
@csweichel Is it possible to start a server in Docker without --privileged? I see a lot of error messages like these when I try.
@csweichel any updates on the issue?
I was trying to use Kind but couldn't get it up and running. Then I found this issue referenced elsewhere, suggesting that some people had managed to get k3s working, but it looks like that isn't really the case. I tried to get k3s and Kind running in rootless mode but neither worked :(
Getting some flavor of k8s up and running would be extremely helpful.
Any updates? :)
@raphaeltm @HadesArchitect, have you tried https://github.com/gitpod-io/template-k3s?
Thanks @esigo, that must cover a couple of scenarios! Not all of them, but still a good place to start.
Right now, the only viable option to run k3s in Gitpod is to use emulation to create a VM, as shown here: https://github.com/gitpod-io/template-k3s
I did some analysis of why, even if we are able to run Docker, k3s still does not work.
There are a couple of errors to solve.

Error 1: Snapshotter (:heavy_check_mark:)

Solution: this is already solved, we can use `--snapshotter=fuse-overlayfs` since we fixed fuse support in https://github.com/gitpod-io/gitpod/pull/4594 and https://github.com/gitpod-io/gitpod/pull/4762

Error 2: Privileges (:negative_squared_cross_mark:)
The kubelet can run in rootless mode to avoid us dealing with privileged devices, like we do for Docker.
However, running in that mode
Amazing, we can create the uid and gid maps for that user by editing `/etc/subuid` and `/etc/subgid`. However, after doing that
Looks like the problem is that the current workspace root user cannot write the uid map of any process.
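For context, a process's current mapping can be read back from procfs; writing another process's `/proc/<pid>/uid_map`, however, requires CAP_SETUID in that process's user namespace (or going through the setuid `newuidmap` helper, which consults `/etc/subuid`), which matches what we're seeing:

```shell
# Show the current process's uid map: each line is
#   <uid inside the namespace> <uid outside> <range length>
# Writing uid_map for another process needs CAP_SETUID in that process's
# user namespace, which the workspace "root" user doesn't have.
cat /proc/self/uid_map
```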