dragonflyoss / nydus

Nydus - the Dragonfly image service, providing fast, secure and easy access to container images.
https://nydus.dev/
Apache License 2.0

failed to create snapshot: missing parent bucket: not found #151

Open 844700118 opened 3 years ago

844700118 commented 3 years ago

1. The question

Running crictl runp container-config.yaml pod-config.yaml fails with the following error:

FATA[0000] run pod sandbox: rpc error: code = NotFound desc = failed to create containerd container: failed to create snapshot: missing parent "k8s.io/2/sha256:ba0dae6243cc9fa2890df40a625721fdbea5c94ca6da897acdd814d710149770" bucket: not found

2. configuration information

vi /etc/containerd/config.toml

version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
[proxy_plugins]
  [proxy_plugins.nydus]
    type = "snapshot"
    address = "/run/containerd/containerd-nydus-grpc.sock"
......
[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2"
    systemd_cgroup = true
    [plugins."io.containerd.grpc.v1.cri".containerd]
      snapshotter = "nydus"
      disable_snapshot_annotations = false
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          runtime_type = "io.containerd.runtime.v1.linux"
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            SystemdCgroup = true
......

vi /etc/nydusd-config.json

{
  "device": {
    "backend": {
      "type": "registry",
      "config": {
        "scheme": "http",
        "host": "192.168.1.130:8099",
        "auth": "YWRtaW46SGFyYm9yMTIzNDU=",
        "timeout": 5,
        "connect_timeout": 5,
        "retry_limit": 0
      }
    },
    "cache": {
      "type": "blobcache",
      "compressed": true,
      "config": {
        "work_dir": "cache"
      }
    }
  },
  "mode": "direct",
  "digest_validate": false,
  "iostats_files": true,
  "enable_xattr": false,
  "fs_prefetch": {
    "enable": true,
    "threads_count": 10,
    "bandwidth_rate": 1048576
  }
}

3. version information

runc -v
runc version 1.0.0
commit: v1.0.0-0-g84113eef
spec: 1.0.2-dev
go: go1.15.14
libseccomp: 2.3.1

containerd -v
containerd github.com/containerd/containerd v1.4.8 7eba5930496d9bbe375fdf71603e610ad737d2b2

4. service status

systemctl status containerd -l
● containerd.service - containerd container runtime
   Loaded: loaded (/etc/systemd/system/containerd.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2021-08-21 15:37:40 CST; 21s ago
     Docs: https://containerd.io
  Process: 466524 ExecStartPre=/sbin/modprobe overlay (code=exited, status=1/FAILURE)
 Main PID: 466528 (containerd)
    Tasks: 14
   Memory: 21.8M
   CGroup: /system.slice/containerd.service
           └─466528 /usr/local/bin/containerd

Aug 21 15:37:40 k8s-master containerd[466528]: time="2021-08-21T15:37:40.810949181+08:00" level=warning msg="The image sha256:495471c7203f56f5444fb79029e0b5cd72709d3533b136b17044d7dcd86b3fef is not unpacked."
Aug 21 15:37:40 k8s-master containerd[466528]: time="2021-08-21T15:37:40.811664147+08:00" level=warning msg="The image sha256:7ce0143dee376bfd2937b499a46fb110bda3c629c195b84b1cf6e19be1a9e23b is not unpacked."
Aug 21 15:37:40 k8s-master containerd[466528]: time="2021-08-21T15:37:40.814468869+08:00" level=warning msg="The image sha256:80d28bedfe5dec59da9ebf8e6260224ac9008ab5c11dbbe16ee3ba3e4439ac2c is not unpacked."
Aug 21 15:37:40 k8s-master containerd[466528]: time="2021-08-21T15:37:40.815782268+08:00" level=warning msg="The image sha256:911b9682e03b53bd2bf63bb212163dc9edc2fcab7999028936f74672ae0740fb is not unpacked."
Aug 21 15:37:40 k8s-master containerd[466528]: time="2021-08-21T15:37:40.816301301+08:00" level=info msg="Start event monitor"
Aug 21 15:37:40 k8s-master containerd[466528]: time="2021-08-21T15:37:40.816334106+08:00" level=info msg="Start snapshots syncer"
Aug 21 15:37:40 k8s-master containerd[466528]: time="2021-08-21T15:37:40.816344412+08:00" level=info msg="Start cni network conf syncer"
Aug 21 15:37:40 k8s-master containerd[466528]: time="2021-08-21T15:37:40.816368754+08:00" level=info msg="Start streaming server"
Aug 21 15:37:58 k8s-master containerd[466528]: time="2021-08-21T15:37:58.150714933+08:00" level=info msg="RunPodsandbox for &PodSandboxMetadata{Name:nydus-container,Uid:,Namespace:,Attempt:0,}"
Aug 21 15:37:58 k8s-master containerd[466528]: time="2021-08-21T15:37:58.339257042+08:00" level=error msg="RunPodSandbox for &PodSandboxMetadata{Name:nydus-container,Uid:,Namespace:,Attempt:0,} failed, error" error="rpc error: code = NotFound desc = failed to create containerd container: failed to create snapshot: missing parent \"k8s.io/2/sha256:ba0dae6243cc9fa2890df40a625721fdbea5c94ca6da897acdd814d710149770\" bucket: not found"
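When comparing this error against the snapshotter logs, it can help to pull the missing parent key out of the message and split it into its parts. The sketch below assumes the key has the shape namespace/id/key, which matches both the error above and the keys in the snapshotter log; the names SnapshotKey and parse_missing_parent are made up for illustration and are not part of containerd or nydus.

```python
import re
from typing import NamedTuple, Optional


class SnapshotKey(NamedTuple):
    """Parts of a snapshot key, assuming the form <namespace>/<id>/<key>."""
    namespace: str
    snapshot_id: int
    key: str


# Matches: missing parent "<namespace>/<id>/<key>" bucket: not found
# The quotes may be backslash-escaped when the message is embedded in a
# containerd log line, so an optional backslash is allowed before each quote.
_MISSING_PARENT = re.compile(
    r'missing parent \\?"([^/"\\]+)/(\d+)/([^"\\]+)\\?" bucket: not found'
)


def parse_missing_parent(message: str) -> Optional[SnapshotKey]:
    """Return the missing parent key found in message, or None if absent."""
    match = _MISSING_PARENT.search(message)
    if match is None:
        return None
    return SnapshotKey(match.group(1), int(match.group(2)), match.group(3))
```

For the error in this report, the function would yield the namespace k8s.io, the id 2, and the sha256:ba0dae... chain ID, which is the same digest that appears in the snapshotter's "prepare key" log entries.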

5. snapshotter containerd-nydus-grpc

containerd-nydus-grpc --nydusd-path /usr/bin/nydusd --config-path /etc/nydusd-config.json --log-level trace --root /var/lib/containerd/io.containerd.snapshotter.v1.nydus --address /run/containerd/containerd-nydus-grpc.sock

{"level":"info","msg":"found 0 daemons running","time":"2021-08-21T15:37:34.860829122+08:00"}
{"dir":"/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/new-206497482","level":"info","msg":"cleanupSnapshotDirectory /var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/new-206497482","time":"2021-08-21T15:37:58.229247885+08:00"}
{"dir":"/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/new-206497482","error":"failed to get daemon by snapshotID (new-206497482)","level":"error","msg":"failed to unmount","time":"2021-08-21T15:37:58.229320240+08:00"}
{"level":"info","msg":"umount nydus daemon of id new-206497482, mountpoint /var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/new-206497482","time":"2021-08-21T15:37:58.229383913+08:00"}
{"dir":"/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/new-206497482","error":"failed to get daemon by snapshotID (new-206497482)","level":"error","msg":"failed to unmount","time":"2021-08-21T15:37:58.229437402+08:00"}
{"key":"k8s.io/50/extract-231584148-3QGH sha256:ba0dae6243cc9fa2890df40a625721fdbea5c94ca6da897acdd814d710149770","level":"info","msg":"prepare key k8s.io/50/extract-231584148-3QGH sha256:ba0dae6243cc9fa2890df40a625721fdbea5c94ca6da897acdd814d710149770 parent labels","parent":"","time":"2021-08-21T15:37:58.235293375+08:00"}
{"key":"k8s.io/50/extract-231584148-3QGH sha256:ba0dae6243cc9fa2890df40a625721fdbea5c94ca6da897acdd814d710149770","level":"info","msg":"prepare for container layer k8s.io/50/extract-231584148-3QGH sha256:ba0dae6243cc9fa2890df40a625721fdbea5c94ca6da897acdd814d710149770","parent":"","time":"2021-08-21T15:37:58.235357255+08:00"}
{"level":"info","msg":"id 7 is data layer, continue to check parent layer","time":"2021-08-21T15:37:58.235391750+08:00"}
{"level":"info","msg":"id 7 is data layer, continue to check parent layer","time":"2021-08-21T15:37:58.235433881+08:00"}
{"dir":"/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/new-387506828","level":"info","msg":"cleanupSnapshotDirectory /var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/new-387506828","time":"2021-08-21T15:37:58.268876238+08:00"}
{"dir":"/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/new-387506828","error":"failed to get daemon by snapshotID (new-387506828)","level":"error","msg":"failed to unmount","time":"2021-08-21T15:37:58.268939370+08:00"}
{"level":"info","msg":"umount nydus daemon of id new-387506828, mountpoint /var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/new-387506828","time":"2021-08-21T15:37:58.268949007+08:00"}
{"dir":"/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/new-387506828","error":"failed to get daemon by snapshotID (new-387506828)","level":"error","msg":"failed to unmount","time":"2021-08-21T15:37:58.268954242+08:00"}
{"level":"info","msg":"cleanup: dirs=[/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/7]","time":"2021-08-21T15:37:58.331032998+08:00"}
{"dir":"/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/7","level":"info","msg":"cleanupSnapshotDirectory /var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/7","time":"2021-08-21T15:37:58.331099764+08:00"}
{"dir":"/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/7","error":"failed to get daemon by snapshotID (7)","level":"error","msg":"failed to unmount","time":"2021-08-21T15:37:58.331257899+08:00"}
{"level":"info","msg":"umount nydus daemon of id 7, mountpoint /var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/7","time":"2021-08-21T15:37:58.331278128+08:00"}
{"dir":"/var/lib/containerd/io.containerd.snapshotter.v1.nydus/snapshots/7","error":"failed to get daemon by snapshotID (7)","level":"error","msg":"failed to unmount","time":"2021-08-21T15:37:58.331285608+08:00"}
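The snapshotter log above is a stream of JSON objects, so the error entries can be separated from the info noise with a few lines of Python. This is only a sketch for reading a pasted log: the function names iter_entries and error_entries are made up for illustration, and the field names level, msg and error are taken from the entries shown above.

```python
import json
from typing import Iterator


def iter_entries(text: str) -> Iterator[dict]:
    """Yield each JSON object found in text.

    Objects may be separated by newlines or by plain spaces, so this walks
    the string with raw_decode instead of splitting on line breaks.
    Malformed input raises json.JSONDecodeError.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between objects.
        if text[pos].isspace():
            pos += 1
            continue
        entry, pos = decoder.raw_decode(text, pos)
        if isinstance(entry, dict):
            yield entry


def error_entries(text: str) -> Iterator[dict]:
    """Yield only the entries whose level field is 'error'."""
    for entry in iter_entries(text):
        if entry.get("level") == "error":
            yield entry
```

Applied to the log in this report, the only error entries are the repeated "failed to unmount" messages with "failed to get daemon by snapshotID (...)"; the missing parent error itself is reported by containerd, not by the snapshotter.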

6. Starting an ordinary container without nydus works normally.

ctr run -d 192.168.1.130:8099/test/nginx:alpine nginx-alpine

nerdctl ps

CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

nginx-alpine 192.168.1.130:8099/test/nginx:alpine "/docker-entrypoint.…" 13 seconds ago Up

changweige commented 3 years ago

Hi, are you still suffering from this issue? If yes, please try removing the boltdb store file from the nydus home directory. There might be some inconsistency between containerd and the nydus snapshotter. The db file might be located at $containerd_home/io.containerd.snapshotter.v1.nydus
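A cautious way to try this suggestion is to rename the store file instead of deleting it, so it can be restored if the change does not help. The sketch below rests on several assumptions: the exact file name is not given in this thread, so it simply looks for any *.db file under the directory; containerd and the snapshotter are assumed to be stopped first; and find_db_files and move_aside are hypothetical helper names, not part of any nydus tooling.

```python
from pathlib import Path
from typing import List, Tuple


def find_db_files(snapshotter_root: str) -> List[Path]:
    """List *.db files under the snapshotter root (file name is an assumption)."""
    root = Path(snapshotter_root)
    return sorted(p for p in root.rglob("*.db") if p.is_file())


def move_aside(paths: List[Path], suffix: str = ".bak",
               dry_run: bool = True) -> List[Tuple[Path, Path]]:
    """Rename each file to <name><suffix>; with dry_run only report the plan."""
    planned = []
    for path in paths:
        target = path.with_name(path.name + suffix)
        planned.append((path, target))
        if not dry_run:
            path.rename(target)
    return planned
```

With the root used in this report, a dry run would look like move_aside(find_db_files("/var/lib/containerd/io.containerd.snapshotter.v1.nydus")), which only returns the planned renames; passing dry_run=False performs them.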

zvier commented 1 year ago

In my case, removing only $containerd_home/io.containerd.snapshotter.v1.nydus did not work; the error went away only after I removed $containerd_home itself. But this method can only be used on a test host. A production environment may need a self-recovery strategy.

changweige commented 1 year ago

May I ask which version or commit of nydus-snapshotter you are using?

adamqqqplay commented 1 year ago

Please try ctr -n k8s.io content fetch $pause-image-name, where $pause-image-name is the k8s pause image you are using.

This error occurs when containerd's default snapshotter (usually the overlay snapshotter) was used to pull OCI v1 images before switching to the nydus snapshotter.
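As a rough illustration of that fetch step, the pause image name can be read from the sandbox_image setting in the containerd config and turned into the ctr command. This is a minimal sketch: sandbox_image_from_config and fetch_command are made-up helper names, the regular expression is a shortcut rather than a real TOML parser, and the k8s.io namespace is taken from the command above.

```python
import re
from typing import List, Optional

# Finds: sandbox_image = "<image>" anywhere in the config text.
_SANDBOX_IMAGE = re.compile(r'\bsandbox_image\s*=\s*"([^"]+)"')


def sandbox_image_from_config(config_text: str) -> Optional[str]:
    """Return the sandbox_image value from containerd config text, if present."""
    match = _SANDBOX_IMAGE.search(config_text)
    return match.group(1) if match else None


def fetch_command(image: str, namespace: str = "k8s.io") -> List[str]:
    """Build the argv for: ctr -n <namespace> content fetch <image>."""
    return ["ctr", "-n", namespace, "content", "fetch", image]
```

On the affected host, the resulting list could be handed to subprocess.run(cmd, check=True); for the configuration posted in this issue the image would be registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2.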