cri-o / cri-o

Open Container Initiative-based implementation of Kubernetes Container Runtime Interface
https://cri-o.io
Apache License 2.0

Pod creation fails with CRI-O on kata-qemu runtime #8322

Open visheshtanksale opened 3 days ago

visheshtanksale commented 3 days ago

What happened?

I set up Kata using kata-deploy on CRI-O. When creating a test pod, I get the error below:

Jun 21 11:51:02 ipp1-1848 crio[613259]: time="2024-06-21T11:51:02.647812257Z" level=error msg="createContainer failed" error="rpc error: code = Internal desc = the file /bin/bash was not found" name=containerd-shim-v2 pid=614221 

If I try to bring up any other container using the kata-qemu runtime, I get a similar error: the command that is the container's entrypoint is not found.
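To find every such failure in a longer crio journal, the `key=value` fields of lines like the one above can be parsed and the missing path extracted. This is a minimal sketch, assuming the logrus-style text format shown in the log line and Python 3.9+; the helper names `parse_fields` and `missing_entrypoint` are hypothetical and not part of CRI-O or Kata.

```python
import re

# Matches key=value and key="quoted value" pairs (logrus-style text output).
_PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')

# The log line from this report, used as sample input.
SAMPLE = (
    'Jun 21 11:51:02 ipp1-1848 crio[613259]: '
    'time="2024-06-21T11:51:02.647812257Z" level=error '
    'msg="createContainer failed" '
    'error="rpc error: code = Internal desc = the file /bin/bash was not found" '
    'name=containerd-shim-v2 pid=614221'
)


def parse_fields(line: str) -> dict[str, str]:
    """Return the key=value fields of one log line as a dict."""
    fields = {}
    for key, raw in _PAIR.findall(line):
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            raw = raw[1:-1]  # drop surrounding quotes; escapes are left as-is
        fields[key] = raw
    return fields


def missing_entrypoint(line: str):
    """Return the path reported as not found, or None if the line does not match."""
    error = parse_fields(line).get("error", "")
    match = re.search(r"the file (\S+) was not found", error)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(missing_entrypoint(SAMPLE))
```

Running `missing_entrypoint` over each line of the attached log would show whether every kata-qemu container fails on its own entrypoint, which would be consistent with the container rootfs not being visible to the runtime.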

Attached crio log here.
Attached kata log here.

QEMU and Kata versions are below:

[Hypervisor]
  MachineType = "q35"
  Version = "QEMU emulator version 7.2.0 (kata-static)\nCopyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers"
  Path = "/opt/kata/bin/qemu-system-x86_64"
  BlockDeviceDriver = "virtio-scsi"
  EntropySource = "/dev/urandom"
  SharedFS = "virtio-fs"
  VirtioFSDaemon = "/opt/kata/libexec/virtiofsd"
  SocketPath = ""
  Msize9p = 8192
  MemorySlots = 10
  HotPlugVFIO = "no-port"
  ColdPlugVFIO = "no-port"
  PCIeRootPort = 0
  PCIeSwitchPort = 0
  Debug = true
  [Hypervisor.SecurityInfo]
    Rootless = false
    DisableSeccomp = false
    GuestHookPath = ""
    EnableAnnotations = ["enable_iommu", "virtio_fs_extra_args", "kernel_params"]
    ConfidentialGuest = false

[Runtime]
  Path = "/opt/kata/bin/kata-runtime"
  GuestSeLinuxLabel = ""
  Debug = true
  Trace = false
  DisableGuestSeccomp = true
  DisableNewNetNs = false
  SandboxCgroupOnly = false
  [Runtime.Config]
    Path = "/opt/kata/share/defaults/kata-containers/configuration-qemu.toml"
  [Runtime.Version]
    OCI = "1.1.0+dev"
    [Runtime.Version.Version]
      Semver = "3.5.0"
      Commit = "cce735a09e7374ee52a3b4f5d4a4923e9af07f73"
      Major = 3
      Minor = 5
      Patch = 0

I opened an issue on kata-containers, where @littlejawa suggested adding the storage overlay config:

[crio]
  storage_option = [
    "overlay.skip_mount_home=true",
  ]

But this doesn't help.

What did you expect to happen?

The pod should come up without error

How can we reproduce it (as minimally and precisely as possible)?

Anything else we need to know?

No response

CRI-O and Kubernetes version

```console
$ crio --version
crio version 1.31.0
Version: 1.31.0
GitCommit: 004b5dc40823f9bce9b34c6da2a769778725c0f5
GitCommitDate: 2024-06-18T16:24:04Z
GitTreeState: clean
BuildDate: 1970-01-01T00:00:00Z
GoVersion: go1.22.3
Compiler: gc
Platform: linux/amd64
Linkmode: static
BuildTags:
  static
  netgo
  osusergo
  exclude_graphdriver_btrfs
  exclude_graphdriver_devicemapper
  seccomp
  apparmor
  selinux
  exclude_graphdriver_devicemapper
LDFlags: unknown
SeccompEnabled: true
AppArmorEnabled: true
```

```console
$ kubectl version --output=json
{
  "clientVersion": {
    "major": "1",
    "minor": "28",
    "gitVersion": "v1.28.11",
    "gitCommit": "f25b321b9ae42cb1bfaa00b3eec9a12566a15d91",
    "gitTreeState": "clean",
    "buildDate": "2024-06-11T20:20:18Z",
    "goVersion": "go1.21.11",
    "compiler": "gc",
    "platform": "linux/amd64"
  },
  "kustomizeVersion": "v5.0.4-0.20230601165947-6ce0bf390ce3",
  "serverVersion": {
    "major": "1",
    "minor": "28",
    "gitVersion": "v1.28.11",
    "gitCommit": "f25b321b9ae42cb1bfaa00b3eec9a12566a15d91",
    "gitTreeState": "clean",
    "buildDate": "2024-06-11T20:11:29Z",
    "goVersion": "go1.21.11",
    "compiler": "gc",
    "platform": "linux/amd64"
  }
}
```

OS version

```console
# On Linux:
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
$ uname -a
Linux ipp1-1848 5.15.0-101-generic #111-Ubuntu SMP Tue Mar 5 20:16:58 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
```

Additional environment details (AWS, VirtualBox, physical, etc.)

visheshtanksale commented 3 days ago

cc: @zvonkok

haircommander commented 3 days ago

@littlejawa is this something you're helping with or are you looking for reinforcements?

zvonkok commented 3 days ago

@haircommander Yes, he is helping with that, and we're currently out of options and need reinforcements.

haircommander commented 3 days ago

what happens when you create the container with a different oci runtime?

visheshtanksale commented 3 days ago

> what happens when you create the container with a different oci runtime?

Non-Kata containers are created successfully.

littlejawa commented 3 days ago

The symptom is similar to what we saw with kata 3.3.0, where the content of the container's rootfs was not accessible to the runtime. We fixed it in our own CI by adding the flag "storage.overlay.skip_mount_home=true" in crio's config. I'm also fixing it in the same way in the crio CI for kata, in https://github.com/cri-o/cri-o/pull/7958.

In this cluster the flag was not there, so we added it, but it didn't solve the problem. Could crio ignore the flag for some reason? What else could cause the same symptom?