docker / cli

The Docker CLI
Apache License 2.0

Containerd snapshotter: with QEMU installed, docker save fails for images of a mismatched arch pulled with "--platform"; appears to be a regression in 27 compared to 24.0.7 #5493

Closed chetanshivaji closed 1 month ago

chetanshivaji commented 1 month ago

Description

When QEMU is installed, docker save fails for images with a mismatched architecture that were pulled with "--platform". This appears to be a regression relative to the older Docker version 24.0.7.

I pulled the image using the command below:

docker pull --platform=linux/arm64 debian:latest

docker images
REPOSITORY   TAG       IMAGE ID       CREATED      SIZE
debian       latest    27586f460943   6 days ago   205MB

/home/ubuntu# docker save -o debian.tar 27586f460943
Error response from daemon: unable to create manifests file: NotFound: content digest sha256:e225d70fafe80791f18c79b8d76afa1d1b4192b3a40a50f1ffd4de84555ebd04: not found

On Ubuntu:

/home/ubuntu# qemu-
qemu-aarch64-static qemu-cris-static qemu-m68k-static qemu-mips64el-static
qemu-or1k-static qemu-riscv64-static qemu-sparc32plus-static qemu-aarch64_be-static
qemu-debootstrap qemu-microblaze-static qemu-mipsel-static qemu-ppc-static
qemu-s390x-static qemu-sparc64-static qemu-alpha-static qemu-hexagon-static
qemu-microblazeel-static qemu-mipsn32-static qemu-ppc64-static qemu-sh4-static
qemu-x86_64-static qemu-arm-static qemu-hppa-static qemu-mips-static
qemu-mipsn32el-static qemu-ppc64le-static qemu-sh4eb-static qemu-xtensa-static
qemu-armeb-static qemu-i386-static qemu-mips64-static qemu-nios2-static
qemu-riscv32-static qemu-sparc-static qemu-xtensaeb-static

docker info|grep -ani storage
WARNING: bridge-nf-call-iptables is disabled
20: Storage Driver: overlayfs
WARNING: bridge-nf-call-ip6tables is disabled

Reproduce

  1. Install QEMU: apt install qemu-user-static
  2. Pull an image with the --platform flag, choosing an architecture that does not match the host.
  3. Run docker save on the pulled image.
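
The steps above can be scripted roughly as follows. This is a minimal sketch, not part of the original report: the function name `repro_save` and the default output file name are made up for illustration, and the sketch saves whatever reference it is given, whereas the report saved by image ID. Only the `docker pull --platform=...` and `docker save -o ...` invocations are taken from the report.

```shell
#!/bin/sh
# Hypothetical helper (name and argument order are assumptions):
# pull an image for a given platform, then try to save it to a tar file.
repro_save() {
    platform="$1"          # e.g. linux/arm64 on an amd64 host
    image="$2"             # e.g. debian:latest
    out="${3:-image.tar}"  # output archive; the default name is arbitrary

    # Step 2: pull the non-native variant.
    docker pull --platform="$platform" "$image" || return 1

    # Step 3: attempt the save and report the outcome.
    if docker save -o "$out" "$image"; then
        echo "save succeeded: $out"
    else
        echo "save failed for $image ($platform)" >&2
        return 2
    fi
}

# Usage, assuming an amd64 host with the containerd image store enabled:
#   repro_save linux/arm64 debian:latest debian.tar
```

A return status of 2 means the pull worked but the save did not, which is the failure described in this issue.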

Expected behavior

docker save should succeed, since QEMU is present.

docker version

docker version
Client: Docker Engine - Community
 Version:           27.3.0
 API version:       1.47
 Go version:        go1.22.7
 Git commit:        e85edf8
 Built:             Thu Sep 19 14:25:59 2024
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          27.3.0
  API version:      1.47 (minimum version 1.24)
  Go version:       go1.22.7
  Git commit:       41ca978
  Built:            Thu Sep 19 14:25:59 2024
  OS/Arch:          linux/amd64
  Experimental:     true
 containerd:
  Version:          1.7.22
  GitCommit:        7f7fdf5fed64eb6a7caf99b3e12efcf9d60e311c
 runc:
  Version:          1.1.14
  GitCommit:        v1.1.14-0-g2c9f560
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

docker info
Client: Docker Engine - Community
 Version:    27.3.0
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.7.1
    Path:     /root/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.29.6
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 4
  Running: 0
  Paused: 0
  Stopped: 4
 Images: 3
 Server Version: 27.3.0
 Storage Driver: overlayfs
  driver-type: io.containerd.snapshotter.v1
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 7f7fdf5fed64eb6a7caf99b3e12efcf9d60e311c
 runc version: v1.1.14-0-g2c9f560
 init version: de40ad0
 Security Options:
  apparmor
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.8.0-1015-aws
 Operating System: Ubuntu 22.04.3 LTS
 OSType: linux
 Architecture: x86_64
 CPUs: 2
 Total Memory: 3.821GiB
 Name: ip-10-82-11-11
 ID: 5d3de098-c37d-4184-9a28-756081fef784
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: true
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Additional Info

docker save succeeds on Docker version 24.0.7, so this is a regression in version 27.

docker pull --platform=linux/arm64 debian:latest
27586f460943: Download complete
4466c6813e1d: Download complete
a2a098df5635: Download complete
6d11c181ebb3: Download complete
docker.io/library/debian:latest

root@ip-10-82-11-11:/home# docker images
REPOSITORY   TAG       IMAGE ID       CREATED         SIZE
debian       latest    27586f460943   8 seconds ago   205MB

root@ip-10-82-11-11:/home# docker save -o deb.tar 27586f460943

root@ip-10-82-11-11:/home# ls deb.tar

root@ip-10-82-11-11:/home# docker --version
Docker version 24.0.7, build afdd53b

root@ip-10-82-11-11:/home# docker info|grep -ani storage
20: Storage Driver: overlayfs

thaJeztah commented 1 month ago

Thanks for reporting; this looks like a duplicate of https://github.com/docker/cli/issues/5476

edit: wrong link

thaJeztah commented 1 month ago

whoops posted the wrong link (fixed the comment above)

chetanshivaji commented 1 month ago

Yes, both are mainly about docker save. This one is slightly different because QEMU is involved, and it is actually a regression.

thaJeztah commented 1 month ago

> This one is slight different with QEMU involvement

Thanks! Yes, I don't think the QEMU part is directly related as QEMU is only used to perform userland emulation when running, but I suspect it's due to how the daemon resolves what to export when the containerd image-store is enabled. Prior to the containerd image-store, multi-arch images would not be preserved as a multi-arch image, so any image in docker images / docker image ls would only have a single platform. This allowed the daemon to unconditionally save "whatever" was referred to.

With multi-arch support that's not always possible; a multi-arch image may be fully downloaded (all platforms present), partially downloaded (e.g., 2 out of 5 platforms have been pulled), or only a single variant. The daemon still has a "default" in various places though; i.e., when you pull an image, it will (as a default) attempt to pull your platform's native architecture; in some cases using fallbacks with a priority (e.g. arm v7 -> arm v6).

My suspicion is that some logic in the daemon currently looks for the default (native architecture) to save/export, and fails because of that.
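
To make the suspected difference concrete, here is a toy model of the two selection strategies described above. It is an illustration only and not the daemon's code: the function name `pick_platform`, the mode names `native-only` and `any-present`, and the error text are all invented for this sketch.

```shell
#!/bin/sh
# Toy model (NOT the daemon's implementation).
# Usage: pick_platform MODE NATIVE PRESENT...
#   MODE=native-only : succeed only if NATIVE is among the locally present platforms
#   MODE=any-present : prefer NATIVE, otherwise fall back to the first present platform
pick_platform() {
    mode="$1"
    native="$2"
    shift 2

    # Prefer the native platform whenever its content is present locally.
    for p in "$@"; do
        if [ "$p" = "$native" ]; then
            echo "$p"
            return 0
        fi
    done

    # Fallback: export "whatever" is present, if the mode allows it.
    if [ "$mode" = "any-present" ] && [ "$#" -gt 0 ]; then
        echo "$1"
        return 0
    fi

    echo "not found: no local content for $native" >&2
    return 1
}

# Usage: an amd64 host where only the arm64 variant was pulled.
#   pick_platform any-present linux/amd64 linux/arm64   # prints linux/arm64
#   pick_platform native-only linux/amd64 linux/arm64   # fails with status 1
```

Under this model, the `native-only` strategy fails in exactly the situation reported here (only a non-native variant is present), while `any-present` would export the variant that was pulled.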

> and actually a regression.

I was curious about that indeed; my first inkling was that this may never have worked correctly with the containerd image-store enabled (per the above), but I can confirm that older releases (with the containerd image store enabled) did handle this:

docker version --format '{{.Server.Version}}'
# 24.0.9

docker info --format '{{ .Driver }} {{ .DriverStatus }}'
# overlayfs [[driver-type io.containerd.snapshotter.v1]]

docker pull --platform=linux/amd64 alpine

docker images
# REPOSITORY   TAG       IMAGE ID       CREATED         SIZE
# alpine       latest    beefdbd8a1da   4 minutes ago   12.1MB

docker save -o foo.tar beefdbd8a1da

docker save beefdbd8a1da | tar -tf-
# blobs/
# blobs/sha256/
# blobs/sha256/33735bd63cf84d7e388d9f6d297d348c523c044410f553bd878c6d7829612735
# blobs/sha256/43c4264eed91be63b206e17d93e75256a6097070ce643c5e8f0379998b44f170
# blobs/sha256/91ef0af61f39ece4d6710e465df5ed6ca12112358344fd51ae6a3b886634148b
# index.json
# manifest.json
# oci-layout

It's still possible that this was an intentional change in behavior (i.e., default to native platform instead of "pick whatever is found"), but at least the error-reporting should be more clear on that if that's the case.

chetanshivaji commented 1 month ago

@thaJeztah When QEMU is present, every supported architecture can be run, so we can start a container for a specific architecture and get shell access to it.

So when pulling an image, Docker should pull all architectures rather than leaving 0B for the architectures that do not match the host. In summary, QEMU, which ships by default with Docker Desktop, should be taken into account in operations on multi-platform containerd images such as pull, save, and load.