containerd / containerd

An open and reliable container runtime
https://containerd.io
Apache License 2.0

Rpc error Status failed "invalid UUID length: 0: unknown" #10491

Closed m1dugh closed 2 months ago

m1dugh commented 2 months ago

Description

I have installed containerd on NixOS on a Raspberry Pi, and it constantly logs the following errors:

Jul 22 09:27:03 cluster-master-1 containerd[1026]: time="2024-07-22T09:27:03.545909051Z" level=error msg="Status failed" error="invalid UUID length: 0: unknown"
Jul 22 09:27:03 cluster-master-1 containerd[1026]: time="2024-07-22T09:27:03.657185766Z" level=error msg="Status failed" error="invalid UUID length: 0: unknown"
Jul 22 09:27:03 cluster-master-1 containerd[1026]: time="2024-07-22T09:27:03.768858588Z" level=error msg="Status failed" error="invalid UUID length: 0: unknown"
Jul 22 09:27:03 cluster-master-1 containerd[1026]: time="2024-07-22T09:27:03.880564225Z" level=error msg="Status failed" error="invalid UUID length: 0: unknown"
Jul 22 09:27:04 cluster-master-1 containerd[1026]: time="2024-07-22T09:27:04.004382141Z" level=error msg="Status failed" error="invalid UUID length: 0: unknown"
Jul 22 09:27:04 cluster-master-1 containerd[1026]: time="2024-07-22T09:27:04.115368876Z" level=error msg="Status failed" error="invalid UUID length: 0: unknown"
Jul 22 09:27:04 cluster-master-1 containerd[1026]: time="2024-07-22T09:27:04.229788479Z" level=error msg="Status failed" error="invalid UUID length: 0: unknown"
Jul 22 09:27:04 cluster-master-1 containerd[1026]: time="2024-07-22T09:27:04.340334811Z" level=error msg="Status failed" error="invalid UUID length: 0: unknown"
Jul 22 09:27:04 cluster-master-1 containerd[1026]: time="2024-07-22T09:27:04.450108801Z" level=error msg="Status failed" error="invalid UUID length: 0: unknown"
Jul 22 09:27:04 cluster-master-1 containerd[1026]: time="2024-07-22T09:27:04.561611845Z" level=error msg="Status failed" error="invalid UUID length: 0: unknown"

My containerd instance is configured for k8s.

Steps to reproduce the issue

  1. Install NixOS on arm64
  2. Add the following config

    
    virtualisation.containerd = {
        enable = true;
        settings.plugins = {
            "io.containerd.internal.v1.opt".path = "/var/lib/containerd/opt";
            "io.containerd.grpc.v1.cri" = {
                sandbox_image = "registry.k8s.io/pause:3.9";
    
                containerd = {
                    snapshotter = "overlayfs";
                    runtimes.runc.options.SystemdCgroup = true;
                };
            };
        };
    };

Although my issue arises on NixOS, I have no idea whether it is related to NixOS or not.

Describe the results you received and expected

I expected containerd to work properly. ctr can pull images and run containers; however, the kubelet fails with sanity check errors.

What version of containerd are you using?

containerd github.com/containerd/containerd v1.7.16 v1.7.16

Any other relevant information

    $ runc --version
    runc version 1.1.12
    spec: 1.0.2-dev
    go: go1.22.4
    libseccomp: 2.5.5

    $ crictl info
    E0722 09:31:32.586917 1731 remote_runtime.go:633] "Status from runtime service failed" err="rpc error: code = Unknown desc = invalid UUID length: 0: unknown"
    FATA[0000] getting status of runtime: rpc error: code = Unknown desc = invalid UUID length: 0: unknown

    $ uname -a
    Linux cluster-master-1 6.1.63 #1-NixOS SMP Tue Jan 1 00:00:00 UTC 1980 aarch64 GNU/Linux


Show configuration if it is related to CRI plugin.

    oom_score = 0
    root = "/var/lib/containerd"
    state = "/run/containerd"
    version = 2

    [grpc]
    address = "/run/containerd/containerd.sock"

    [plugins]

    [plugins."io.containerd.grpc.v1.cri"]
    sandbox_image = "registry.k8s.io/pause:3.9"

    [plugins."io.containerd.grpc.v1.cri".cni]
    bin_dir = "/opt/cni/bin"
    max_conf_num = 0

    [plugins."io.containerd.grpc.v1.cri".containerd]
    snapshotter = "overlayfs"

    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
    runtime_type = "io.containerd.runc.v2"

    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true

    [plugins."io.containerd.internal.v1.opt"]
    path = "/var/lib/containerd/opt"

m1dugh commented 2 months ago

Here is the output of ctr plugin list:

$ ctr plugin list
WARN[0000] Failed to check deprecations                  error="invalid UUID length: 0: unknown"
TYPE                                   ID                       PLATFORMS         STATUS
io.containerd.snapshotter.v1           aufs                     linux/arm64/v8    skip
io.containerd.event.v1                 exchange                 -                 ok
io.containerd.internal.v1              opt                      -                 ok
io.containerd.warning.v1               deprecations             -                 ok
io.containerd.snapshotter.v1           blockfile                linux/arm64/v8    skip
io.containerd.snapshotter.v1           btrfs                    linux/arm64/v8    skip
io.containerd.snapshotter.v1           devmapper                linux/arm64/v8    error
io.containerd.snapshotter.v1           native                   linux/arm64/v8    ok
io.containerd.snapshotter.v1           overlayfs                linux/arm64/v8    ok
io.containerd.snapshotter.v1           zfs                      linux/arm64/v8    skip
io.containerd.content.v1               content                  -                 ok
io.containerd.metadata.v1              bolt                     -                 ok
io.containerd.gc.v1                    scheduler                -                 ok
io.containerd.differ.v1                walking                  linux/arm64/v8    ok
io.containerd.lease.v1                 manager                  -                 ok
io.containerd.streaming.v1             manager                  -                 ok
io.containerd.runtime.v1               linux                    linux/arm64/v8    ok
io.containerd.monitor.v1               cgroups                  linux/arm64/v8    ok
io.containerd.runtime.v2               task                     linux/arm64/v8    ok
io.containerd.runtime.v2               shim                     -                 ok
io.containerd.sandbox.store.v1         local                    -                 ok
io.containerd.sandbox.controller.v1    local                    -                 ok
io.containerd.service.v1               containers-service       -                 ok
io.containerd.service.v1               content-service          -                 ok
io.containerd.service.v1               diff-service             -                 ok
io.containerd.service.v1               images-service           -                 ok
io.containerd.service.v1               introspection-service    -                 ok
io.containerd.service.v1               namespaces-service       -                 ok
io.containerd.service.v1               snapshots-service        -                 ok
io.containerd.service.v1               tasks-service            -                 ok
io.containerd.grpc.v1                  containers               -                 ok
io.containerd.grpc.v1                  content                  -                 ok
io.containerd.grpc.v1                  diff                     -                 ok
io.containerd.grpc.v1                  events                   -                 ok
io.containerd.grpc.v1                  images                   -                 ok
io.containerd.grpc.v1                  introspection            -                 ok
io.containerd.grpc.v1                  leases                   -                 ok
io.containerd.grpc.v1                  namespaces               -                 ok
io.containerd.grpc.v1                  sandbox-controllers      -                 ok
io.containerd.grpc.v1                  sandboxes                -                 ok
io.containerd.grpc.v1                  snapshots                -                 ok
io.containerd.grpc.v1                  streaming                -                 ok
io.containerd.grpc.v1                  tasks                    -                 ok
io.containerd.transfer.v1              local                    -                 ok
io.containerd.grpc.v1                  transfer                 -                 ok
io.containerd.grpc.v1                  version                  -                 ok
io.containerd.internal.v1              restart                  -                 ok
io.containerd.tracing.processor.v1     otlp                     -                 skip
io.containerd.internal.v1              tracing                  -                 skip
io.containerd.grpc.v1                  healthcheck              -                 ok
io.containerd.nri.v1                   nri                      -                 ok
io.containerd.grpc.v1                  cri                      linux/arm64/v8    ok

m1dugh commented 2 months ago

After rolling back to the previous version, it looks like this bug does not occur on containerd 1.7.13.

zouyee commented 2 months ago

https://github.com/google/uuid/issues/114

m1dugh commented 2 months ago

I understand that the error comes from the UUID handling; my main question is why containerd is failing in the first place.

leiqi96 commented 2 months ago

  1. The difference between v1.7.16 and v1.7.13 is in cmd/ctr/commands/client.go: ctr v1.7.13 does not trigger the introspection service, while v1.7.16 does.
     v1.7.16: https://github.com/containerd/containerd/blob/v1.7.16/cmd/ctr/commands/client.go#L78
     v1.7.13 (does not have this line): https://github.com/containerd/containerd/blob/v1.7.13/cmd/ctr/commands/client.go#L62

    if !suppressDeprecationWarnings {
        resp, err := client.IntrospectionService().Server(ctx, &ptypes.Empty{})
        if err != nil {
            log.L.WithError(err).Warn("Failed to check deprecations")
        } else {
  2. Then check the path below; the content of this file may be the root cause.

    /var/lib/containerd/io.containerd.grpc.v1.introspection/uuid

    If the path does not exist, containerd will create it.

m1dugh commented 2 months ago

This file is indeed empty, so that might be the cause.
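To illustrate why an empty file would surface as "invalid UUID length: 0", here is a minimal sketch using only the Go standard library. It is not containerd's code: `checkUUIDText` is a hypothetical helper that only checks for the canonical 36-character form and imitates the wording of the logged error, and trimming whitespace before checking is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// uuidPath is the file discussed in this thread.
const uuidPath = "/var/lib/containerd/io.containerd.grpc.v1.introspection/uuid"

// checkUUIDText is a hypothetical helper. It only verifies that the text has
// the canonical 36-character length and imitates the error wording seen in
// the logs. A real UUID parser accepts more forms and validates the digits.
func checkUUIDText(s string) error {
	s = strings.TrimSpace(s) // assumption: surrounding whitespace is ignored
	if len(s) != 36 {
		return fmt.Errorf("invalid UUID length: %d", len(s))
	}
	return nil
}

func main() {
	data, err := os.ReadFile(uuidPath)
	if err != nil {
		fmt.Println("cannot read uuid file:", err)
		return
	}
	if err := checkUUIDText(string(data)); err != nil {
		// An empty file ends up here with "invalid UUID length: 0".
		fmt.Println("uuid file is not usable:", err)
		return
	}
	fmt.Println("uuid file looks fine:", strings.TrimSpace(string(data)))
}
```

Reading files under /var/lib/containerd will probably require root.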

leiqi96 commented 2 months ago

When containerd starts, /var/lib/containerd/io.containerd.grpc.v1.introspection is not created. Introspection is a component of containerd. The path /var/lib/containerd/io.containerd.grpc.v1.introspection/uuid is only created when the introspection service is triggered over gRPC.

You can test this theory by manually populating the file /var/lib/containerd/io.containerd.grpc.v1.introspection/uuid.

leiqi96 commented 2 months ago

Pay attention to the difference between ctr v1.7.13 and v1.7.16.

estesp commented 2 months ago

@samuelkarp ^ the above sounds interesting; is it possible that something is not fully initialized re: introspection and the deprecation checks?

leiqi96 commented 2 months ago

Can you check the permissions of the path /var/lib/containerd/io.containerd.grpc.v1.introspection/uuid? My guess is that the introspection service fails to write to it.
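One way to run that check is sketched below: it stats the directory and the uuid file and prints their size and permission bits. `describe` is a hypothetical helper introduced only for this illustration, and the program assumes it is run as a user that can read /var/lib/containerd (most likely root).

```go
package main

import (
	"fmt"
	"os"
)

// dir is the introspection plugin directory named in this thread.
const dir = "/var/lib/containerd/io.containerd.grpc.v1.introspection"

// describe is a hypothetical helper that formats the size and permission
// bits of a file and flags whether it is empty.
func describe(size int64, perm uint32) string {
	return fmt.Sprintf("size=%d mode=%04o empty=%t", size, perm, size == 0)
}

func main() {
	for _, p := range []string{dir, dir + "/uuid"} {
		info, err := os.Stat(p)
		if err != nil {
			// Covers both "does not exist" and "permission denied".
			fmt.Printf("%s: %v\n", p, err)
			continue
		}
		fmt.Printf("%s: %s\n", p, describe(info.Size(), uint32(info.Mode().Perm())))
	}
}
```

For the uuid file, `empty=true` would match the symptom reported above. The size reported for the directory is not meaningful; only its mode is of interest.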

samuelkarp commented 2 months ago

@estesp Nothing changed with deprecations except that we started invoking the introspection service on every ctr command. It's not clear why /var/lib/containerd/io.containerd.grpc.v1.introspection/uuid is empty on @m1dugh's Raspberry Pi. We can regenerate the UUID file if the length is 0 fairly easily, but if containerd can't write into the file there's likely a deeper problem somewhere else.

@m1dugh As a workaround, try deleting /var/lib/containerd/io.containerd.grpc.v1.introspection/uuid and see if that fixes it. Meanwhile I've opened https://github.com/containerd/containerd/pull/10503 to regenerate if this situation recurs.
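The "regenerate if the length is 0" idea can be sketched as follows. This is not the code from the pull request above; `newUUIDv4` and `loadOrRegenerate` are hypothetical helpers written with the standard library only, and the demo runs against a scratch directory rather than /var/lib/containerd.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// newUUIDv4 builds a random (version 4) UUID string from 16 random bytes.
func newUUIDv4() (string, error) {
	var b [16]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	b[6] = (b[6] & 0x0f) | 0x40 // set version 4
	b[8] = (b[8] & 0x3f) | 0x80 // set RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", b[0:4], b[4:6], b[6:8], b[8:10], b[10:16]), nil
}

// loadOrRegenerate returns the UUID stored at path. If the file is missing
// or empty, it writes a fresh UUID and returns that instead.
func loadOrRegenerate(path string) (string, error) {
	data, err := os.ReadFile(path)
	if err != nil && !os.IsNotExist(err) {
		return "", err
	}
	if id := strings.TrimSpace(string(data)); id != "" {
		return id, nil
	}
	id, err := newUUIDv4()
	if err != nil {
		return "", err
	}
	if err := os.MkdirAll(filepath.Dir(path), 0700); err != nil {
		return "", err
	}
	// Write to a temporary file, then rename, so readers never observe a
	// partially written uuid file.
	tmp := path + ".tmp"
	if err := os.WriteFile(tmp, []byte(id), 0600); err != nil {
		return "", err
	}
	if err := os.Rename(tmp, path); err != nil {
		return "", err
	}
	return id, nil
}

func main() {
	scratch, err := os.MkdirTemp("", "uuid-demo")
	if err != nil {
		fmt.Println("cannot create scratch dir:", err)
		return
	}
	defer os.RemoveAll(scratch)

	path := filepath.Join(scratch, "uuid")
	// Simulate the reported state: the file exists but is empty.
	if err := os.WriteFile(path, nil, 0600); err != nil {
		fmt.Println("cannot create empty file:", err)
		return
	}
	id, err := loadOrRegenerate(path)
	fmt.Println("uuid:", id, "err:", err)
}
```

As noted above, if the daemon cannot write the file at all, regeneration only hides a deeper problem, so a write error is returned to the caller rather than ignored.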