youki-dev / youki

A container runtime written in Rust
https://youki-dev.github.io/youki/
Apache License 2.0

Using youki and cri-o to run Kubernetes #584

Closed utam0k closed 2 years ago

utam0k commented 2 years ago

Goals

refs https://github.com/containers/youki/issues/78#issuecomment-999882807

utam0k commented 2 years ago

The status of the latest main branch.

Thank you for your answer, now it gives a different outcome

@gattytto The name of the interface file was wrong. Fixed with #579.

logs:

################################
################################
###kubectl describe pod/rustest#
################################
################################

kubectl describe pod/rustest
Name:         rustest
Namespace:    default
Priority:     0
Node:         driven-lizard/2001:----:----:----:----:----:----:1bda
Start Time:   Fri, 31 Dec 2021 14:43:16 -0300
Labels:       name=rust
Annotations:  cni.projectcalico.org/containerID: c269c49be507d20f07dc7ecdecd78db2b382cb6d4d16cfd16114d2d09b10a795
              cni.projectcalico.org/podIP: 1100:200::3e:2340/128
              cni.projectcalico.org/podIPs: 1100:200::3e:2340/128
Status:       Running
IP:           1100:200::3e:2340
IPs:
  IP:  1100:200::3e:2340
Containers:
  rust:
    Container ID:   cri-o://e06d3df94150a466707beca53cb3840d8b0b3373eba66bfbb092cb76601ccd0b
    Image:          quay.io/gattytto/rst:29c8045
    Image ID:       quay.io/gattytto/rst@sha256:c3aac85ed499108dbbed0f6c297d7f766b984c2367c5588e49ab60a3a5b44b62
    Port:           <none>
    Host Port:      <none>
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    255
      Started:      Fri, 31 Dec 2021 14:43:37 -0300
      Finished:     Fri, 31 Dec 2021 14:43:37 -0300
    Ready:          False
    Restart Count:  1
    Limits:
      cpu:     1
      memory:  128Mi
    Requests:
      cpu:        1
      memory:     64Mi
    Environment:  <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xdwtg (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-xdwtg:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  35s                default-scheduler  Successfully assigned default/rustest to driven-lizard
  Normal   Pulling    34s                kubelet            Pulling image "quay.io/gattytto/rst:29c8045"
  Normal   Pulled     15s                kubelet            Successfully pulled image "quay.io/gattytto/rst:29c8045" in 19.361145627s
  Normal   Created    14s (x2 over 14s)  kubelet            Created container rust
  Normal   Pulled     14s                kubelet            Container image "quay.io/gattytto/rst:29c8045" already present on machine
  Normal   Started    13s (x2 over 14s)  kubelet            Started container rust
  Warning  BackOff    12s (x2 over 13s)  kubelet            Back-off restarting failed container

################################
################################
####kubectl logs pod/rustpod####
################################
################################

[DEBUG crates/youki/src/main.rs:92] 2021-12-31T14:43:55.223858793-03:00 started by user 0 with ArgsOs { inner: ["/usr/bin/youki", "--root=/run/youki", "create", "--bundle", "/run/containers/storage/overlay-containers/cfc87eda48734d53b66b9a7bbe44dab1cd435f8a88d1f17fc27769fa821be601/userdata", "--pid-file", "/run/containers/storage/overlay-containers/cfc87eda48734d53b66b9a7bbe44dab1cd435f8a88d1f17fc27769fa821be601/userdata/pidfile", "cfc87eda48734d53b66b9a7bbe44dab1cd435f8a88d1f17fc27769fa821be601"] }
[DEBUG crates/libcontainer/src/container/init_builder.rs:94] 2021-12-31T14:43:55.231029905-03:00 container directory will be "/run/youki/cfc87eda48734d53b66b9a7bbe44dab1cd435f8a88d1f17fc27769fa821be601"
[DEBUG crates/libcontainer/src/container/container.rs:191] 2021-12-31T14:43:55.231082509-03:00 Save container status: Container { state: State { oci_version: "v1.0.2", id: "cfc87eda48734d53b66b9a7bbe44dab1cd435f8a88d1f17fc27769fa821be601", status: Creating, pid: None, bundle: "/run/containers/storage/overlay-containers/cfc87eda48734d53b66b9a7bbe44dab1cd435f8a88d1f17fc27769fa821be601/userdata", annotations: Some({}), created: None, creator: None, use_systemd: None }, root: "/run/youki/cfc87eda48734d53b66b9a7bbe44dab1cd435f8a88d1f17fc27769fa821be601" } in "/run/youki/cfc87eda48734d53b66b9a7bbe44dab1cd435f8a88d1f17fc27769fa821be601"
[DEBUG crates/libcontainer/src/rootless.rs:50] 2021-12-31T14:43:55.231328266-03:00 This is NOT a rootless container
[INFO crates/libcgroups/src/common.rs:207] 2021-12-31T14:43:55.231384218-03:00 cgroup manager V2 will be used
[DEBUG crates/libcontainer/src/container/builder_impl.rs:87] 2021-12-31T14:43:55.231434156-03:00 Set OOM score to 978
[WARN crates/libcgroups/src/v2/util.rs:41] 2021-12-31T14:43:55.232052462-03:00 Controller rdma is not yet implemented.
[WARN crates/libcgroups/src/v2/util.rs:41] 2021-12-31T14:43:55.232102612-03:00 Controller misc is not yet implemented.
[DEBUG crates/libcgroups/src/v2/hugetlb.rs:16] 2021-12-31T14:43:55.248156252-03:00 Apply hugetlb cgroup v2 config
[DEBUG crates/libcgroups/src/v2/io.rs:21] 2021-12-31T14:43:55.248258705-03:00 Apply io cgroup v2 config
[DEBUG crates/libcgroups/src/v2/pids.rs:17] 2021-12-31T14:43:55.248288706-03:00 Apply pids cgroup v2 config
[WARN crates/libcgroups/src/v2/util.rs:41] 2021-12-31T14:43:55.248364965-03:00 Controller rdma is not yet implemented.
[WARN crates/libcgroups/src/v2/util.rs:41] 2021-12-31T14:43:55.248397320-03:00 Controller misc is not yet implemented.
[DEBUG crates/libcontainer/src/namespaces.rs:65] 2021-12-31T14:43:55.248437761-03:00 unshare or setns: LinuxNamespace { typ: Pid, path: None }
[DEBUG crates/libcontainer/src/process/channel.rs:52] 2021-12-31T14:43:55.248673957-03:00 sending init pid (Pid(31045))
[DEBUG crates/libcontainer/src/namespaces.rs:65] 2021-12-31T14:43:55.249624262-03:00 unshare or setns: LinuxNamespace { typ: Uts, path: Some("/var/run/utsns/8d7fc62f-718c-4ab6-acf3-e8089764ac3c") }
[DEBUG crates/libcontainer/src/namespaces.rs:65] 2021-12-31T14:43:55.249697505-03:00 unshare or setns: LinuxNamespace { typ: Ipc, path: Some("/var/run/ipcns/8d7fc62f-718c-4ab6-acf3-e8089764ac3c") }
[DEBUG crates/libcontainer/src/namespaces.rs:65] 2021-12-31T14:43:55.249723808-03:00 unshare or setns: LinuxNamespace { typ: Network, path: Some("/var/run/netns/8d7fc62f-718c-4ab6-acf3-e8089764ac3c") }
[DEBUG crates/libcontainer/src/namespaces.rs:65] 2021-12-31T14:43:55.249744874-03:00 unshare or setns: LinuxNamespace { typ: Mount, path: None }
[DEBUG crates/libcontainer/src/namespaces.rs:65] 2021-12-31T14:43:55.249813376-03:00 unshare or setns: LinuxNamespace { typ: Cgroup, path: None }
[DEBUG crates/libcontainer/src/rootfs/rootfs.rs:38] 2021-12-31T14:43:55.249834864-03:00 Prepare rootfs: "/etc/containers/storage/driven-lizard/overlay/f445b1472b515a65e86ddd97b4cdb7068d7d7f310078669559206d032721ff6b/merged"
[DEBUG crates/libcontainer/src/rootfs/rootfs.rs:59] 2021-12-31T14:43:55.252266089-03:00 mount root fs "/etc/containers/storage/driven-lizard/overlay/f445b1472b515a65e86ddd97b4cdb7068d7d7f310078669559206d032721ff6b/merged"
[DEBUG crates/libcontainer/src/rootfs/mount.rs:50] 2021-12-31T14:43:55.252309015-03:00 Mounting Mount { destination: "/proc", typ: Some("proc"), source: Some("proc"), options: Some(["nosuid", "noexec", "nodev"]) }
[DEBUG crates/libcontainer/src/rootfs/mount.rs:50] 2021-12-31T14:43:55.252452830-03:00 Mounting Mount { destination: "/dev", typ: Some("tmpfs"), source: Some("tmpfs"), options: Some(["nosuid", "strictatime", "mode=755", "size=65536k"]) }
[DEBUG crates/libcontainer/src/rootfs/mount.rs:50] 2021-12-31T14:43:55.252556693-03:00 Mounting Mount { destination: "/dev/pts", typ: Some("devpts"), source: Some("devpts"), options: Some(["nosuid", "noexec", "newinstance", "ptmxmode=0666", "mode=0620", "gid=5"]) }
[DEBUG crates/libcontainer/src/rootfs/mount.rs:50] 2021-12-31T14:43:55.252638548-03:00 Mounting Mount { destination: "/dev/mqueue", typ: Some("mqueue"), source: Some("mqueue"), options: Some(["nosuid", "noexec", "nodev"]) }
[DEBUG crates/libcontainer/src/rootfs/mount.rs:50] 2021-12-31T14:43:55.252689048-03:00 Mounting Mount { destination: "/sys", typ: Some("sysfs"), source: Some("sysfs"), options: Some(["nosuid", "noexec", "nodev", "ro"]) }
[DEBUG crates/libcontainer/src/rootfs/mount.rs:50] 2021-12-31T14:43:55.252758427-03:00 Mounting Mount { destination: "/sys/fs/cgroup", typ: Some("cgroup"), source: Some("cgroup"), options: Some(["nosuid", "noexec", "nodev", "relatime", "ro"]) }
[DEBUG crates/libcontainer/src/rootfs/mount.rs:266] 2021-12-31T14:43:55.252795374-03:00 Mounting cgroup v2 filesystem
[DEBUG crates/libcontainer/src/rootfs/mount.rs:274] 2021-12-31T14:43:55.252813100-03:00 Mount { destination: "/sys/fs/cgroup", typ: Some("cgroup2"), source: Some("cgroup"), options: Some([]) }
[DEBUG crates/libcontainer/src/rootfs/mount.rs:50] 2021-12-31T14:43:55.252863028-03:00 Mounting Mount { destination: "/dev/shm", typ: Some("bind"), source: Some("/run/containers/storage/overlay-containers/c269c49be507d20f07dc7ecdecd78db2b382cb6d4d16cfd16114d2d09b10a795/userdata/shm"), options: Some(["rw", "bind"]) }
[DEBUG crates/libcontainer/src/rootfs/mount.rs:50] 2021-12-31T14:43:55.252932483-03:00 Mounting Mount { destination: "/etc/resolv.conf", typ: Some("bind"), source: Some("/run/containers/storage/overlay-containers/c269c49be507d20f07dc7ecdecd78db2b382cb6d4d16cfd16114d2d09b10a795/userdata/resolv.conf"), options: Some(["rw", "bind", "nodev", "nosuid", "noexec"]) }
[DEBUG crates/libcontainer/src/rootfs/mount.rs:50] 2021-12-31T14:43:55.253056981-03:00 Mounting Mount { destination: "/etc/hostname", typ: Some("bind"), source: Some("/run/containers/storage/overlay-containers/c269c49be507d20f07dc7ecdecd78db2b382cb6d4d16cfd16114d2d09b10a795/userdata/hostname"), options: Some(["rw", "bind"]) }
[DEBUG crates/libcontainer/src/rootfs/mount.rs:50] 2021-12-31T14:43:55.253142886-03:00 Mounting Mount { destination: "/etc/hosts", typ: Some("bind"), source: Some("/var/lib/kubelet/pods/278b7de9-61d0-430f-aa67-1e0f88a860b9/etc-hosts"), options: Some(["rw", "rbind", "rprivate", "bind"]) }
[DEBUG crates/libcontainer/src/rootfs/mount.rs:50] 2021-12-31T14:43:55.253224861-03:00 Mounting Mount { destination: "/dev/termination-log", typ: Some("bind"), source: Some("/var/lib/kubelet/pods/278b7de9-61d0-430f-aa67-1e0f88a860b9/containers/rust/17b3e2ac"), options: Some(["rw", "rbind", "rprivate", "bind"]) }
[DEBUG crates/libcontainer/src/rootfs/mount.rs:50] 2021-12-31T14:43:55.253297279-03:00 Mounting Mount { destination: "/var/run/secrets/kubernetes.io/serviceaccount", typ: Some("bind"), source: Some("/var/lib/kubelet/pods/278b7de9-61d0-430f-aa67-1e0f88a860b9/volumes/kubernetes.io~projected/kube-api-access-xdwtg"), options: Some(["ro", "rbind", "rprivate", "bind"]) }
[DEBUG crates/libcontainer/src/process/container_init_process.rs:124] 2021-12-31T14:43:55.253806957-03:00 readonly path "/proc/bus" mounted
[DEBUG crates/libcontainer/src/process/container_init_process.rs:124] 2021-12-31T14:43:55.253836447-03:00 readonly path "/proc/fs" mounted
[DEBUG crates/libcontainer/src/process/container_init_process.rs:124] 2021-12-31T14:43:55.253857709-03:00 readonly path "/proc/irq" mounted
[DEBUG crates/libcontainer/src/process/container_init_process.rs:124] 2021-12-31T14:43:55.253877810-03:00 readonly path "/proc/sys" mounted
[DEBUG crates/libcontainer/src/process/container_init_process.rs:124] 2021-12-31T14:43:55.253899506-03:00 readonly path "/proc/sysrq-trigger" mounted
[WARN crates/libcontainer/src/process/container_init_process.rs:140] 2021-12-31T14:43:55.253968378-03:00 masked path "/proc/latency_stats" not exist
[WARN crates/libcontainer/src/process/container_init_process.rs:140] 2021-12-31T14:43:55.253995193-03:00 masked path "/proc/timer_stats" not exist
[WARN crates/libcontainer/src/process/container_init_process.rs:140] 2021-12-31T14:43:55.254012268-03:00 masked path "/proc/sched_debug" not exist
[DEBUG crates/libcontainer/src/capabilities.rs:128] 2021-12-31T14:43:55.254107018-03:00 reset all caps
[DEBUG crates/libcontainer/src/capabilities.rs:128] 2021-12-31T14:43:55.254162817-03:00 reset all caps
[DEBUG crates/libcontainer/src/capabilities.rs:135] 2021-12-31T14:43:55.254201415-03:00 dropping bounding capabilities to Some({DacOverride, Setuid, NetBindService, Kill, Fsetid, Fowner, Setgid, Chown, Setpcap})
[WARN crates/libcontainer/src/syscall/linux.rs:139] 2021-12-31T14:43:55.254266346-03:00 CAP_BPF is not supported.
[WARN crates/libcontainer/src/syscall/linux.rs:139] 2021-12-31T14:43:55.254290257-03:00 CAP_CHECKPOINT_RESTORE is not supported.
[WARN crates/libcontainer/src/syscall/linux.rs:139] 2021-12-31T14:43:55.254301861-03:00 CAP_PERFMON is not supported.
[DEBUG crates/libcontainer/src/process/container_main_process.rs:90] 2021-12-31T14:43:55.254581233-03:00 init pid is Pid(31045)
[DEBUG crates/libcontainer/src/container/container.rs:191] 2021-12-31T14:43:55.254632621-03:00 Save container status: Container { state: State { oci_version: "v1.0.2", id: "cfc87eda48734d53b66b9a7bbe44dab1cd435f8a88d1f17fc27769fa821be601", status: Created, pid: Some(31045), bundle: "/run/containers/storage/overlay-containers/cfc87eda48734d53b66b9a7bbe44dab1cd435f8a88d1f17fc27769fa821be601/userdata", annotations: Some({"io.kubernetes.container.terminationMessagePolicy": "File", "io.kubernetes.cri-o.ResolvPath": "/run/containers/storage/overlay-containers/c269c49be507d20f07dc7ecdecd78db2b382cb6d4d16cfd16114d2d09b10a795/userdata/resolv.conf", "io.kubernetes.cri-o.TTY": "false", "io.kubernetes.container.terminationMessagePath": "/dev/termination-log", "io.kubernetes.cri-o.Stdin": "false", "io.kubernetes.container.name": "rust", "io.kubernetes.container.hash": "8200690c", "io.kubernetes.cri-o.ImageRef": "d61b000cca08f105c6675916613dc295c707965b75c2f7880615b47a1fbee4dd", "io.kubernetes.cri-o.IP.0": "1100:200::3e:2340", "io.kubernetes.cri-o.MountPoint": "/etc/containers/storage/driven-lizard/overlay/f445b1472b515a65e86ddd97b4cdb7068d7d7f310078669559206d032721ff6b/merged", "io.kubernetes.cri-o.Annotations": "{\"io.kubernetes.container.hash\":\"8200690c\",\"io.kubernetes.container.restartCount\":\"2\",\"io.kubernetes.container.terminationMessagePath\":\"/dev/termination-log\",\"io.kubernetes.container.terminationMessagePolicy\":\"File\",\"io.kubernetes.pod.terminationGracePeriod\":\"30\"}", "io.kubernetes.cri-o.SandboxName": "k8s_rustest_default_278b7de9-61d0-430f-aa67-1e0f88a860b9_0", "io.kubernetes.cri-o.SeccompProfilePath": "", "io.kubernetes.cri-o.StdinOnce": "false", "io.kubernetes.cri-o.Name": "k8s_rust_rustest_default_278b7de9-61d0-430f-aa67-1e0f88a860b9_2", "io.kubernetes.cri-o.Labels": "{\"io.kubernetes.container.name\":\"rust\",\"io.kubernetes.pod.name\":\"rustest\",\"io.kubernetes.pod.namespace\":\"default\",\"io.kubernetes.pod.uid\":\"278b7de9-61d0-430f-aa67-1e0f88a860b9\"}", "io.kubernetes.cri-o.LogPath": "/var/log/pods/default_rustest_278b7de9-61d0-430f-aa67-1e0f88a860b9/rust/2.log", "io.kubernetes.cri-o.SandboxID": "c269c49be507d20f07dc7ecdecd78db2b382cb6d4d16cfd16114d2d09b10a795", "io.kubernetes.pod.namespace": "default", "io.container.manager": "cri-o", "io.kubernetes.cri-o.ContainerType": "container", "io.kubernetes.cri-o.Image": "d61b000cca08f105c6675916613dc295c707965b75c2f7880615b47a1fbee4dd", "io.kubernetes.pod.name": "rustest", "kubernetes.io/config.seen": "2021-12-31T14:43:16.285819888-03:00", "kubernetes.io/config.source": "api", "io.kubernetes.container.restartCount": "2", "io.kubernetes.cri-o.Volumes": "[{\"container_path\":\"/etc/hosts\",\"host_path\":\"/var/lib/kubelet/pods/278b7de9-61d0-430f-aa67-1e0f88a860b9/etc-hosts\",\"readonly\":false},{\"container_path\":\"/dev/termination-log\",\"host_path\":\"/var/lib/kubelet/pods/278b7de9-61d0-430f-aa67-1e0f88a860b9/containers/rust/17b3e2ac\",\"readonly\":false},{\"container_path\":\"/var/run/secrets/kubernetes.io/serviceaccount\",\"host_path\":\"/var/lib/kubelet/pods/278b7de9-61d0-430f-aa67-1e0f88a860b9/volumes/kubernetes.io~projected/kube-api-access-xdwtg\",\"readonly\":true}]", "io.kubernetes.pod.terminationGracePeriod": "30", "io.kubernetes.cri-o.ContainerID": "cfc87eda48734d53b66b9a7bbe44dab1cd435f8a88d1f17fc27769fa821be601", "io.kubernetes.cri-o.Metadata": "{\"name\":\"rust\",\"attempt\":2}", "kubectl.kubernetes.io/last-applied-configuration": 
"{\"apiVersion\":\"v1\",\"kind\":\"Pod\",\"metadata\":{\"annotations\":{},\"labels\":{\"name\":\"rust\"},\"name\":\"rustest\",\"namespace\":\"default\"},\"spec\":{\"containers\":[{\"image\":\"quay.io/gattytto/rst:29c8045\",\"name\":\"rust\",\"resources\":{\"limits\":{\"cpu\":1,\"memory\":\"128Mi\"},\"requests\":{\"cpu\":1,\"memory\":\"64Mi\"}}}],\"runtimeClassName\":\"youki\"}}\n", "io.kubernetes.pod.uid": "278b7de9-61d0-430f-aa67-1e0f88a860b9", "io.kubernetes.cri-o.Created": "2021-12-31T14:43:55.178340341-03:00", "io.kubernetes.cri-o.ImageName": "quay.io/gattytto/rst:29c8045"}), created: Some(2021-12-31T17:43:55.254626102Z), creator: Some(0), use_systemd: Some(false) }, root: "/run/youki/cfc87eda48734d53b66b9a7bbe44dab1cd435f8a88d1f17fc27769fa821be601" } in "/run/youki/cfc87eda48734d53b66b9a7bbe44dab1cd435f8a88d1f17fc27769fa821be601"
[DEBUG crates/libcontainer/src/notify_socket.rs:43] 2021-12-31T14:43:55.341628903-03:00 received: start container
[DEBUG crates/libcontainer/src/process/fork.rs:16] 2021-12-31T14:43:55.341786911-03:00 failed to run fork: EACCES: Permission denied
youki version 0.0.1
commit: 0.0.1-0-597a0f0
gattytto commented 2 years ago

Here are reproduction steps for a multi-node cri-o Kubernetes cluster using LXD VMs.

This can also be replicated using minikube, but I use a manual configuration approach and am not aware of how to make minikube use cri-o instead of docker (a docker-less setup).

I have been playing around with integrating other runtimes into cri-o for use with Kubernetes in different scenarios, like single-node clusters, different architectures (like arm), and different underlying host filesystems (zfs/lvm/xfs), and then trying to find the most canonical way to implement them by using templates (lxc, snap, aptitude, and finally yaml for Kubernetes).

I think this solution has 2 parts. Part 1: figure out how to implement the runtime individually in the different situations, for as many architectures and operating systems as possible, for the engines: windows-containers-shim (it's a thing!!), cri-o, containerd, docker.

Part 2: make a cluster-wide solution to be implemented in orchestrators (aws/gce/kubernetes/openshift), as they all comply with the same standard. This part could consist of:

1. A cluster-wide runtimeClass, maybe with tags (selectors in kube-slang) to specify arch/os/distro (dynamically/statically built).
2. A runtime-operator (a CRD plus an operator) that keeps track of and updates pods. Calico (implemented in the reproduction steps) is a good example: set up each node with a pod, and mount a folder in it to inject the runtime binaries and the configuration files from the deployment's yaml (into the node's /opt/cni and /etc/cni folders respectively). This will make it way easier to detect the node's specs from the yaml using $VARS.

The relevant kube yamls:

apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: youki
handler: youki
---
apiVersion: v1
kind: Pod
metadata:
  name: rustest
  labels:
    name: rust
spec:
  runtimeClassName: youki
  containers:
  - name: rust
    image: quay.io/gattytto/rst:latest
    resources:
      requests:
        memory: "64Mi"
        cpu: 1
      limits:
        memory: "128Mi"
        cpu: 1
gattytto commented 2 years ago

@Furisto do you think the "EACCES" error may be the same case as the previous errors, i.e. a wrong symlink or reference to the entrypoints?

Furisto commented 2 years ago

@gattytto The container is successfully created, but when it comes to executing the specified executable the error happens. Looks like it's a permission issue but we are running as root, so that's strange. I do not entirely understand your setup but could it be that youki itself is running in a user namespace created by LXD and it is not actually root but trying to access files in the rootfs that are owned by root?

Can you strace the execution? Maybe that gives more insight. Does the same problem occur with runc?

gattytto commented 2 years ago

@gattytto The container is successfully created, but when it comes to executing the specified executable the error happens. Looks like it's a permission issue but we are running as root, so that's strange. I do not entirely understand your setup but could it be that youki itself is running in a user namespace created by LXD and it is not actually root but trying to access files in the rootfs that are owned by root?

Can you strace the execution? Maybe that gives more insight. Does the same problem occur with runc?

This error is only happening with youki (not runc), and only in this container. I will put a middleman bash script in place to add strace to the youki runtime execution and come back with more details on the error. Thanks for pointing out the possible source of the error.
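
A minimal sketch of such a middleman wrapper, written here in Rust to match the other examples in this thread (a plain bash script doing exec strace ... works just as well). The paths /usr/bin/youki.real (the real binary, renamed) and /tmp/youki-strace.log are assumptions to adjust for the actual node:

use std::env;
use std::os::unix::process::CommandExt;
use std::process::Command;

fn main() {
    // Collect the arguments cri-o passes to the runtime and forward them
    // to the real youki binary, run under strace (-f follows child processes).
    let args: Vec<String> = env::args().skip(1).collect();

    // Note: each invocation overwrites the previous log; good enough for a
    // single failing pod.
    let err = Command::new("strace")
        .args(["-f", "-o", "/tmp/youki-strace.log"])
        .arg("/usr/bin/youki.real")
        .args(&args)
        .exec(); // exec() replaces this process and only returns on failure

    eprintln!("failed to exec strace: {err}");
    std::process::exit(1);
}

Installed in place of the binary that cri-o's youki runtime handler points to, every create/start/delete invocation then leaves a trace file on the node.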

You can see the current setup in the config generator and the LXD base template files I use to automate my setups: basically the cri-o engine using cgroupfs with the "pod" cgroups2 namespace.

gattytto commented 2 years ago

IT WORKS!

It turned out to be an error on my end when using this docker image. I cannot troubleshoot it in detail; I tried with nushell (a shell made with Rust) but it didn't work.

Then I tried just using a basic example as a starting point; it uses musl and it now works, at least for the most basic thing I am testing, which is to "wait" for 5 minutes. The container gets restarted every time it finishes.

Now I see an error when trying to execute commands (apps) inside the container:

 kubectl get pod
NAME      READY   STATUS    RESTARTS        AGE
rustest   1/1     Running   3 (2m55s ago)   18m

root@optimum-mayfly:~# kubectl exec pod/rustest -- /myip
Error: failed to load init spec

Caused by:
    0: failed to load spec
    1: io operation failed
    2: No such file or directory (os error 2)

openat(AT_FDCWD, "/run/youki/183af8ddd214c67e446a4b0f1db7aefddba3a22d4656e2b847233f1ced3bfb27/config.json", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
write(2, "Error: ", 7Error: )                  = 7
write(2, "failed to load init spec", 24failed to load init spec) = 24

root@driven-lizard:/var/log# ls /run/youki/183af8ddd214c67e446a4b0f1db7aefddba3a22d4656e2b847233f1ced3bfb27/
notify.sock  state.json  youki_config.json
electrocucaracha commented 2 years ago

FYI, I have just submitted a PR to add support to the Kubespray project.

gattytto commented 2 years ago

@utam0k the first step of using youki with cri-o for Kubernetes pods is successful: the pod runs and executes the app, then the app (check ip) fails as expected (it was given no internet connectivity). The second test is also successful: running the "wait" command for some minutes, the container shows as restarting itself after the countdown finishes.

More advanced tests could be made, like a /healthz endpoint and Kubernetes API connectivity using the kube library (and use of the Kubernetes env vars provided by the runtime).

utam0k commented 2 years ago

@gattytto 💯 Thanks for your challenge!

I'm sorry, but I don't know a lot about Kubernetes. What will the test look like?

More advanced tests could be made, like a /healthz endpoint and Kubernetes API connectivity using the kube library (and use of the Kubernetes env vars provided by the runtime).

gattytto commented 2 years ago

@gattytto 💯 Thanks for your challenge!

I'm sorry, but I don't know a lot about Kubernetes. What will the test look like?

Yes, I will bring some more info here for everyone who wants to participate in growing the test pod image spec.

Here is a rough diagram that depicts part of that structure: [image]

When the container is started, it is provided with network connectivity and also given some environment variables:

"env": [
                        "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                        "TERM=xterm",
                        "HOSTNAME=rustest",
                        "KUBERNETES_PORT=tcp://[1101:300:1:2::1]:443",
                        "KUBERNETES_PORT_443_TCP=tcp://[1101:300:1:2::1]:443",
                        "KUBERNETES_PORT_443_TCP_PROTO=tcp",
                        "KUBERNETES_PORT_443_TCP_PORT=443",
                        "KUBERNETES_PORT_443_TCP_ADDR=1101:300:1:2::1",
                        "KUBERNETES_SERVICE_HOST=1101:300:1:2::1",
                        "KUBERNETES_SERVICE_PORT=443",
                        "KUBERNETES_SERVICE_PORT_HTTPS=443",
                        "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
                ]

In addition to that, the container is provided with mounts, one of them very important: it contains a "secret" (preshared-key style) that the app will use to communicate with the Kubernetes api-server.

this is the "secret" mount provided to the container:

root@driven-lizard:~# ls /var/lib/kubelet/pods/fa884568-7222-4f36-8ebe-9dd17000444a/volumes/kubernetes.io~projected/kube-api-access-pj2dh
ca.crt  namespace  token
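
As a sketch of how an app inside the pod could assemble those pieces (the standard in-cluster paths plus the env vars listed above) into an API endpoint; the bracketed IPv6 URL form is an assumption matching this IPv6-only cluster:

use std::env;
use std::fs;

fn main() -> std::io::Result<()> {
    // The projected serviceaccount volume provides the credentials.
    let sa = "/var/run/secrets/kubernetes.io/serviceaccount";
    let token = fs::read_to_string(format!("{sa}/token"))?;
    let namespace = fs::read_to_string(format!("{sa}/namespace"))?;
    let ca_cert = fs::read(format!("{sa}/ca.crt"))?;

    // The API server address comes from the env vars injected by the kubelet.
    let host = env::var("KUBERNETES_SERVICE_HOST").expect("set inside a pod");
    let port = env::var("KUBERNETES_SERVICE_PORT").expect("set inside a pod");
    let api = format!("https://[{host}]:{port}"); // brackets because the host is an IPv6 address

    println!("API server:  {api}");
    println!("namespace:   {}", namespace.trim());
    println!("token bytes: {}, ca.crt bytes: {}", token.trim().len(), ca_cert.len());
    // A real client sends `Authorization: Bearer <token>` over TLS verified
    // against ca.crt; the kube crate mentioned below does all of this for you.
    Ok(())
}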

Using the previously provided pieces, the container should have enough data to make a connection to the kube-api server; this can be achieved using the "client" feature of the "kube" cargo library.
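
A minimal sketch of that, assuming the kube and k8s-openapi crates plus tokio (versions below are illustrative); it lists the pods in the pod's own namespace using the in-cluster config:

// Cargo.toml (illustrative versions):
//   kube = "0.87"
//   k8s-openapi = { version = "0.20", features = ["latest"] }
//   tokio = { version = "1", features = ["macros", "rt-multi-thread"] }

use k8s_openapi::api::core::v1::Pod;
use kube::{api::ListParams, Api, Client};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // try_default() detects the in-cluster environment: the serviceaccount
    // token/ca.crt mount and the KUBERNETES_SERVICE_HOST/PORT env vars.
    let client = Client::try_default().await?;

    // Act on pods in this pod's own namespace (taken from the mounted `namespace` file).
    let pods: Api<Pod> = Api::default_namespaced(client);
    for p in pods.list(&ListParams::default()).await? {
        println!("found pod: {}", p.metadata.name.unwrap_or_default());
    }
    Ok(())
}

Note that the default serviceaccount usually has no list/create rights, so a test like this also exercises the RBAC side: a Role/RoleBinding has to grant the access.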

Through this API the app can query/create/edit Kubernetes datastore objects (be they standard Kubernetes ones or custom objects created using CRDs, custom resource definitions) and do many more things (if it is given the rights to), like starting other pods or services, or querying the state of other pods or services.

For the HEALTHZ endpoint, what the app should do is set up a web service that responds with "ok" or whatever is required, so Kubernetes can check the status of the app running inside the container. HERE I found some guide around it.
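
A minimal /healthz responder needs nothing beyond the standard library; this sketch answers every request with "ok" on port 8080 (an arbitrary choice that the probe in the pod spec would have to match):

use std::io::{Read, Write};
use std::net::TcpListener;

fn main() -> std::io::Result<()> {
    // Bind on all addresses; [::] also accepts IPv4-mapped traffic on most hosts.
    let listener = TcpListener::bind("[::]:8080")?;
    for stream in listener.incoming() {
        let mut stream = stream?;
        let mut buf = [0u8; 1024];
        let _ = stream.read(&mut buf); // ignore the request details, always answer OK

        let body = "ok";
        let response = format!(
            "HTTP/1.1 200 OK\r\nContent-Type: text/plain\r\nContent-Length: {}\r\n\r\n{}",
            body.len(),
            body
        );
        stream.write_all(response.as_bytes())?;
    }
    Ok(())
}

The pod spec would then add a livenessProbe/readinessProbe with an httpGet on path /healthz, port 8080, so the kubelet calls it periodically.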

Another thing that should be tested is the ability of the app to write logs to both stdout and log files in some folder, so they can later be collected by log collection services present in Kubernetes.
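
A trivial way to exercise both paths at once; the log directory /var/log/app is a made-up location that would have to be a writable mount in the pod spec:

use std::fs::{self, OpenOptions};
use std::io::Write;

fn main() -> std::io::Result<()> {
    fs::create_dir_all("/var/log/app")?;
    let mut file = OpenOptions::new()
        .create(true)
        .append(true)
        .open("/var/log/app/test.log")?;

    for i in 0..5 {
        let line = format!("test log line {i}");
        println!("{line}");        // stdout: captured by cri-o, visible via `kubectl logs`
        writeln!(file, "{line}")?; // file: picked up by a node- or sidecar-level collector
    }
    Ok(())
}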

This is the describe output of a pod set up using the pod yaml in the earlier message of this thread:

root@optimum-mayfly:~# kubectl describe pod/rustest
Name:         rustest
Namespace:    default
Priority:     0
Node:         driven-lizard/2001:----:----:----:----:----:----:1bda
Start Time:   Wed, 12 Jan 2022 13:56:56 -0300
Labels:       name=rust
Annotations:  cni.projectcalico.org/containerID: 4d072e67434531635bbb993da9ed09086de2748d8b715de4ecde4b805400e69b
              cni.projectcalico.org/podIP: 1100:200::3e:2364/128
              cni.projectcalico.org/podIPs: 1100:200::3e:2364/128
Status:       Running
IP:           1100:200::3e:2364
IPs:
  IP:  1100:200::3e:2364
Containers:
  rust:
    Container ID:   cri-o://ac0f62e65526fea7083af03d834450cf2f2c8792e6c40c041990c720a87a72bf
    Image:          quay.io/gattytto/rst:latest
    Image ID:       quay.io/gattytto/rst@sha256:9fe146aafa680be662b002554cee87ab7ef35bddfbe3a5b516ee457a215fb79a
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Wed, 12 Jan 2022 13:57:18 -0300
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-pj2dh (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
Volumes:
  kube-api-access-pj2dh:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type    Reason     Age    From               Message
  ----    ------     ----   ----               -------
  Normal  Scheduled  3m58s  default-scheduler  Successfully assigned default/rustest to driven-lizard
  Normal  Pulling    3m57s  kubelet            Pulling image "quay.io/gattytto/rst:latest"
  Normal  Pulled     3m36s  kubelet            Successfully pulled image "quay.io/gattytto/rst:latest" in 20.701607356s
  Normal  Created    3m36s  kubelet            Created container rust
  Normal  Started    3m36s  kubelet            Started container rust

For the last and most difficult part, here are some docs on Kubernetes API unit tests, which may be useful to test the API communication part of the app using docker.

electrocucaracha commented 2 years ago

FYI, I have just submitted a PR to add support to the Kubespray project.

@gattytto can you help me to support this use case? I'm not aware of the current status of the project.

gattytto commented 2 years ago

FYI, I have just submitted a PR to add support to the Kubespray project.

@gattytto can you help me to support this use case? I'm not aware of the current status of the project.

will do!

gattytto commented 2 years ago

I will try to replicate this for youki. Thanks for bringing it up, @utam0k: https://github.com/containers/crun/tree/main/tests/cri-o