containers / crun

Pod cannot be deleted due to missing container startup command when using crun #1482

Open Bevisy opened 2 weeks ago

Bevisy commented 2 weeks ago

What happened?

I used pod-config.json and container-config-nginx.json to create a pod:

# cat pod-config.json
{
    "metadata": {
        "name": "nginx-sandbox",
        "namespace": "default",
        "attempt": 1,
        "uid": "hdishd83djaidwnduwk28bcsb"
    },
    "log_directory": "/tmp",
    "linux": {
    }
}

# cat container-config-nginx.json
{
  "metadata": {
      "name": "nginx-0"
  },
  "image":{
      "image": "docker.io/library/nginx:latest"
  },
  "command": [
      "top"
  ],
  "linux": {
  }
}

Container creation then fails:

# crictl run container-config-nginx.json pod-config.json
FATA[0012] running container: creating container failed: rpc error: code = Unknown desc = create container: create result: internal/proto/conmon.capnp:Conmon.createContainer: Failed: child command exited with: 1: executable file `top` not found in $PATH: No such file or directory
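The error above says that `top` could not be resolved against `$PATH` inside the container's root filesystem. As a rough illustration of that kind of lookup (not crun's actual implementation), the sketch below resolves a command name against a colon-separated path list relative to a rootfs directory. The names `resolve_in_rootfs`, `make_demo_rootfs`, and `DEFAULT_PATH`, as well as the demo directory layout, are hypothetical and chosen only for this example.

```python
import os
import tempfile
from typing import Optional

# Assumed search path for the demo; a real image may set a different PATH.
DEFAULT_PATH = "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"


def resolve_in_rootfs(rootfs: str, command: str, path_env: str) -> Optional[str]:
    """Return the in-container path of `command`, or None if it is not found.

    Simplification: symlinks are not resolved relative to the rootfs.
    """
    if "/" in command:
        # A command containing a slash is used as given, without a PATH search.
        candidates = [command]
    else:
        candidates = [os.path.join(d, command) for d in path_env.split(":") if d]
    for candidate in candidates:
        host_path = os.path.join(rootfs, candidate.lstrip("/"))
        if os.path.isfile(host_path) and os.access(host_path, os.X_OK):
            return candidate
    return None


def make_demo_rootfs() -> str:
    """Create a throwaway rootfs that contains only /usr/sbin/nginx."""
    rootfs = tempfile.mkdtemp(prefix="demo-rootfs-")
    sbin = os.path.join(rootfs, "usr", "sbin")
    os.makedirs(sbin)
    binary = os.path.join(sbin, "nginx")
    with open(binary, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(binary, 0o755)
    return rootfs


if __name__ == "__main__":
    rootfs = make_demo_rootfs()
    for name in ("nginx", "top"):
        print(name, "->", resolve_in_rootfs(rootfs, name, DEFAULT_PATH))
```

In the demo rootfs, `nginx` resolves and `top` does not, which mirrors the failure reported by `crictl run`.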

At this point, the container process on the node becomes a zombie process, and the pod cannot be deleted.

      1   15487   15486    2552 pts/1      11037 Sl       0   0:00 /usr/bin/crio-conmonrs --runtime /usr/bin/crio-crun --runtime-dir /var/lib/containers/storage/overlay-containers/7d46c4f2908be02f02465923ca1aca87295e8872231dae236287fe69209fdec9/userdata --runtime-root /run/crun --log-level info --log-driver systemd --cgroup-manager systemd
  15487   15496   15496   15496 ?             -1 Ss       0   0:00  \_ /pause
  15487   15509   15486    2552 pts/1      11037 Z        0   0:00  \_ [3] <defunct>

However, this issue does not occur when using runc:

      1    9191    9190    2127 pts/1       9081 Sl       0   0:00 /usr/bin/crio-conmonrs --runtime /usr/bin/crio-runc --runtime-dir /var/lib/containers/storage/overlay-containers/408c6d69af793e8a90489a61b250741f0b47d6cce8a140e28b4b604e06cae0f0/userdata --runtime-root /run/runc --log-level info --log-driver systemd --cgroup-manager systemd
   9191    9209    9209    9209 ?             -1 Ss       0   0:00  \_ /pause
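To check a listing like the crun one above without reading it by eye, a small helper can pick out defunct children of a given parent. This is only a sketch. It assumes the columns are in the order PPID, PID, PGID, SID, TTY, TPGID, STAT, UID, TIME, COMMAND (the layout `ps axjf` prints on Linux). `ProcRow`, `parse_ps_rows`, and `zombie_children` are hypothetical names, and the conmon-rs command line in the sample is shortened.

```python
from typing import List, NamedTuple


class ProcRow(NamedTuple):
    ppid: int
    pid: int
    stat: str
    command: str


def parse_ps_rows(text: str) -> List[ProcRow]:
    """Parse process rows; the header and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        # Split into at most 10 fields so COMMAND keeps its spaces.
        fields = line.split(None, 9)
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        rows.append(ProcRow(int(fields[0]), int(fields[1]), fields[6], fields[9]))
    return rows


def zombie_children(text: str, parent_pid: int) -> List[ProcRow]:
    """Return children of `parent_pid` whose STAT starts with 'Z' (zombie)."""
    return [
        row
        for row in parse_ps_rows(text)
        if row.ppid == parent_pid and row.stat.startswith("Z")
    ]


# Rows taken from the crun listing; the first command line is shortened.
SAMPLE = r"""
      1   15487   15486    2552 pts/1      11037 Sl       0   0:00 /usr/bin/crio-conmonrs --runtime /usr/bin/crio-crun
  15487   15496   15496   15496 ?             -1 Ss       0   0:00  \_ /pause
  15487   15509   15486    2552 pts/1      11037 Z        0   0:00  \_ [3] <defunct>
"""

if __name__ == "__main__":
    for row in zombie_children(SAMPLE, 15487):
        print(row.pid, row.stat, row.command)
```

A process stays in state `Z` until its parent collects its exit status, so the helper reports PID 15509 as an unreaped child of the conmon-rs process (PID 15487).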

So, what could be the reason for this?

What did you expect to happen?

I expect the container process to exit normally instead of becoming a zombie process.

How can we reproduce it (as minimally and precisely as possible)?

See "What happened?" above.

Anything else we need to know?

No response

CRI-O and Kubernetes version

```console
$ crio --version
crio version 1.31.0
Version:        1.31.0
GitCommit:      a51dfb336a1d3847415dfa871e81d003e4ef79ae
GitCommitDate:  2024-05-21T07:18:21Z
GitTreeState:   dirty
GoVersion:      go1.22.3
Compiler:       gc
Platform:       linux/amd64
Linkmode:       dynamic
BuildTags:
  containers_image_ostree_stub
  libdm_no_deferred_remove
  seccomp
  selinux
LDFlags:          unknown
SeccompEnabled:   true
AppArmorEnabled:  false
$ crio-conmonrs v0.6.3
```

OS version

```console
# On Linux:
$ cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
$ uname -a
Linux lima-crio 6.1.0-21-cloud-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.90-1 (2024-05-03) x86_64 GNU/Linux
```

Additional environment details (AWS, VirtualBox, physical, etc.)

Nothing else.

saschagrunert commented 2 weeks ago

The fact that it works with runc and conmon-rs makes me wonder what crun does differently.

OTOH the above use case works pretty well when using:

giuseppe commented 2 weeks ago

what is the command line used by conmon-rs to run crun?

saschagrunert commented 2 weeks ago

I added a debug statement to conmon-rs, and it runs:

crun \
    --root=/run/runc \
    --systemd-cgroup \
    create \
    --bundle /run/containers/storage/overlay-containers/dc31d87ed3b6530f23411374f44d4d84b4da8812d8af9b0e90258e04eb2ad03f/userdata \
    --pid-file /run/containers/storage/overlay-containers/dc31d87ed3b6530f23411374f44d4d84b4da8812d8af9b0e90258e04eb2ad03f/userdata/pidfile \
    dc31d87ed3b6530f23411374f44d4d84b4da8812d8af9b0e90258e04eb2ad03f

giuseppe commented 2 weeks ago

Thanks. The command line looks correct.