containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

[windows/wsl] binfmt does not work in default machine #19961

Open liketechnik opened 1 year ago

liketechnik commented 1 year ago

Issue Description

The default machine (WSL 2, Fedora) created during installation of podman for windows has broken binfmt support.

Steps to reproduce the issue

$ podman.exe run --rm -it --platform linux/arm docker.io/library/hello-world

Describe the results you received

{"msg":"exec container process `/hello`: Exec format error","level":"error","time":"2023-09-13T10:25:25.742476Z"}

Describe the results you expected

The default output of the hello-world image (cf. https://hub.docker.com/_/hello-world, section "Example Output")

podman info output

$ podman.exe info
host:
  arch: amd64
  buildahVersion: 1.29.0
  cgroupControllers:
  - cpuset
  - cpu
  - cpuacct
  - blkio
  - memory
  - devices
  - freezer
  - net_cls
  - perf_event
  - net_prio
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: cgroupfs
  cgroupVersion: v1
  conmon:
    package: conmon-2.1.5-1.fc36.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.5, commit: '
  cpuUtilization:
    idlePercent: 99.44
    systemPercent: 0.22
    userPercent: 0.34
  cpus: 8
  databaseBackend: ""
  distribution:
    distribution: fedora
    variant: container
    version: "36"
  eventLogger: journald
  hostname: <removed>
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.15.90.1-microsoft-standard-WSL2
  linkmode: dynamic
  logDriver: journald
  memFree: 13294120960
  memTotal: 16775049216
  networkBackend: netavark
  networkBackendInfo:
    backend: ""
    dns: {}
  ociRuntime:
    name: crun
    package: crun-1.8.1-1.fc36.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.1
      commit: f8a096be060b22ccd3d5f3ebe44108517fbf6c30
      rundir: /run/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: ""
    package: ""
    version: ""
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.0-0.2.beta.0.fc36.x86_64
    version: |-
      slirp4netns version 1.2.0-beta.0
      commit: 477db14a24ff1a3de3a705e51ca2c4c1fe3dda64
      libslirp: 4.6.1
      SLIRP_CONFIG_VERSION_MAX: 3
      libseccomp: 2.5.3
  swapFree: 4294967296
  swapTotal: 4294967296
  uptime: 4h 34m 39.00s (Approximately 0.17 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /usr/share/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 1081101176832
  graphRootUsed: 13543485440
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 24
  runRoot: /run/containers/storage
  transientStore: false
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.4.1
  Built: 1676629882
  BuiltTime: Fri Feb 17 11:31:22 2023
  GitCommit: ""
  GoVersion: go1.18.10
  Os: linux
  OsArch: linux/amd64
  Version: 4.4.1

Podman in a container

Yes

Privileged Or Rootless

Privileged

Upstream Latest Release

Yes

Additional environment details

$ cat /etc/os-release # inside the default machine
NAME="Fedora Linux"
VERSION="36 (Container Image)"
ID=fedora
VERSION_ID=36
VERSION_CODENAME=""
PLATFORM_ID="platform:f36"
PRETTY_NAME="Fedora Linux 36 (Container Image)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:36"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f36/system-administrators-guide/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=36
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=36
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
SUPPORT_END=2023-05-16
VARIANT="Container Image"
VARIANT_ID=container

Additional information

The problem can be worked around by:

  1. Installing systemd-udev, so that systemd-binfmt and systemd-binfmt.service are available: sudo dnf install systemd-udev
  2. Working around a WSL bug (https://github.com/ubuntu/WSL/issues/334, https://github.com/microsoft/WSL/issues/8952): sudo sh -c 'echo :WSLInterop:M::MZ::/init:PF > /usr/lib/binfmt.d/WSLInterop.conf'
  3. Restarting the systemd-binfmt service to apply the changes: sudo systemctl restart systemd-binfmt.service
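
The three steps above could be collected into a small script. This is only a sketch, assuming it runs inside the podman machine with sudo available. The helper names wslinterop_rule and apply_workaround are made up for illustration; the package name, rule string, and paths are the ones quoted in the steps.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of podman or WSL.

# Print the WSLInterop binfmt rule quoted in step 2.
wslinterop_rule() {
    printf '%s\n' ':WSLInterop:M::MZ::/init:PF'
}

# Run the three workaround steps in order (needs sudo inside the machine).
apply_workaround() {
    # Step 1: systemd-udev provides systemd-binfmt and its service.
    sudo dnf install -y systemd-udev || return 1
    # Step 2: register the WSL interop handler via binfmt.d.
    wslinterop_rule | sudo tee /usr/lib/binfmt.d/WSLInterop.conf > /dev/null || return 1
    # Step 3: restart the service so the rule is applied.
    sudo systemctl restart systemd-binfmt.service
}
```

Sourcing the file and calling apply_workaround performs the steps. The -y flag is added only so that dnf does not prompt.
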

github-actions[bot] commented 11 months ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 11 months ago

@n1hility @mheon PTAL

github-actions[bot] commented 10 months ago

A friendly reminder that this issue had no activity for 30 days.

Aetylus commented 5 months ago

I've been looking into multi-arch support on Windows Podman for a couple of days now, and unfortunately resources on this are sparse, as it doesn't seem to be a common dev environment. From what I've gathered though, this seems to be an issue with the binfmt_misc configuration in the podman machine.

The immediate fixes for this are either using the multiarch/qemu-user-static container:

$ podman run --rm --privileged multiarch/qemu-user-static --reset -p yes

or, since essentially all that container does is run qemu-binfmt-conf.sh from the qemu scripts, doing the same thing manually:

$ podman machine ssh
$ curl https://raw.githubusercontent.com/qemu/qemu/master/scripts/qemu-binfmt-conf.sh > qemu-binfmt-conf.sh
$ chmod +x qemu-binfmt-conf.sh
$ sudo ./qemu-binfmt-conf.sh --qemu-suffix "-static" --qemu-path "/usr/bin" -p yes

However, this doesn't persist across restarts. If you restart your computer or do a podman machine stop and podman machine start, you must redo the binfmt_misc conf by either running the multiarch/qemu-user-static container or running the qemu-binfmt-conf.sh script in your podman machine.
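
To see whether the registration survived a restart, one option is to count the qemu entries in binfmt_misc from inside the machine. This is a rough sketch: count_qemu_handlers is a hypothetical helper name, and it assumes the handlers are registered under names starting with qemu-, which is how qemu-binfmt-conf.sh names them.

```shell
#!/bin/sh
# Sketch only: count_qemu_handlers is a hypothetical helper name.

# Count qemu-* entries in a binfmt_misc directory (default: the kernel's).
count_qemu_handlers() {
    dir="${1:-/proc/sys/fs/binfmt_misc}"
    count=0
    for entry in "$dir"/qemu-*; do
        # An unmatched glob stays literal, so test that the entry exists.
        if [ -e "$entry" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}
```

If count_qemu_handlers prints 0 after a podman machine start, that suggests the registration has to be redone.
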

Unfortunately I've not found a solution that feels more proper and permanent than this. From what I've seen, there are mentions of systemd-binfmt but that doesn't exist in the default podman machine setup on Windows and I am unsure if it's even possible to get (I couldn't find any packages that provide it) or if there is another piece or pieces involved.

None of this is particularly well documented or discussed anywhere when it comes to Windows so any insight would be appreciated.

[Edit] I missed the end and I see that systemd-binfmt is in the systemd-udev package. I do wish if nothing else this problem was documented somewhere for Windows.

[Edit] I was trying the proposed fix and actually remade my podman machine. It appears, however, that on the Podman version I am on (4.9.4), the default podman machine is actually Fedora 39. And on attempting to build a container image for an architecture different from the host machine's, it does seem to work without any changes being made? This isn't the case. I must have confused myself, but even after remaking the podman machine, building for ARM doesn't work without the above fixes. Still unsure about running for ARM.

[Edit] Okay disregard most of this. I think the latest version fixes the issues for building to say ARM, but not for running for ARM. I'm getting the same error when running:

$ podman run --rm -it --platform linux/arm docker.io/library/hello-world

Notably the fix mentioned didn't seem to work for me either.

[Edit] What does seem to work for both building and running is a combination of running qemu-binfmt-conf.sh, installing systemd-udev, and starting the systemd-binfmt service. This appears to allow both running and building for different architectures. I'm actually having difficulty consistently reproducing this. I removed and reinitialized a podman machine a second time, and that seems to work for both building and running now without any changes. The commands I'm using to test are

$ podman run --rm -it --platform linux/arm docker.io/library/hello-world

and

$ podman build --platform linux/arm64/v8 --tag localhost/arm-test .

with the following Containerfile:

FROM oraclelinux:8

RUN echo "Hello world"

CMD ["echo", "Hello world"]

[Edit] I've tried removing the default machine and reinitializing it like six times in a row now after it wasn't working, and now it is consistently working without changes. I'm kind of at a loss as to what makes it work and what doesn't, honestly.

Aetylus commented 5 months ago

Going to post this as a new comment because I think I got lost in the weeds in the last one.

I think the weird inconsistencies I was having were due to either something persisting between different podman machines or caching of some kind. After it was working consistently, it stopped working consistently, and since I can't seem to replicate it either way with confidence, I'm just going to ignore it.

I do notice in the default podman machine (now on Fedora 39), there is a mount defined in /proc/self/mountinfo and systemctl status proc-sys-fs-binfmt_misc.mount implies to me that the directory is mounted but it's not actually? You can try sudo umount /proc/sys/fs/binfmt_misc and will get back

umount: /proc/sys/fs/binfmt_misc: not mounted.
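
One way to look at what mountinfo actually reports for that path is to print the filesystem type of every entry mounted there. This is just a sketch; binfmt_mount_types is a hypothetical helper name, and it relies only on the documented mountinfo layout (mount point in field 5, filesystem type right after the "-" separator).

```shell
#!/bin/sh
# Sketch only: binfmt_mount_types is a hypothetical helper name.

# Print the filesystem type of each mountinfo entry whose mount point is
# /proc/sys/fs/binfmt_misc. Reads /proc/self/mountinfo unless a file is given.
binfmt_mount_types() {
    awk '$5 == "/proc/sys/fs/binfmt_misc" {
        # Optional fields end at a lone "-"; the fs type follows it.
        for (i = 7; i <= NF; i++) {
            if ($i == "-") { print $(i + 1); break }
        }
    }' "${1:-/proc/self/mountinfo}"
}
```

If this prints only autofs, then only systemd's automount point is present and the binfmt_misc filesystem itself is not mounted on top of it. Whether that explains the umount message here is a guess on my part.
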

Concerning systemd-udev, this fix doesn't work for me. The systemd-binfmt service will run fine the first time and seems to temporarily work in mounting and configuring binfmt, but if you stop and start the machine, the service fails with the following:

Failed to check if /proc/sys/fs/binfmt_misc is mounted: Too many levels of symbolic links

Ultimately, the only valid temporary fixes I found for both building and running multi-architecture containers on Windows are using the multiarch/qemu-user-static container or the manual commands listed initially. I think technically the latter should be preceded by a binfmt_misc mount, so it should look something like this:

$ podman machine ssh
$ curl https://raw.githubusercontent.com/qemu/qemu/master/scripts/qemu-binfmt-conf.sh > qemu-binfmt-conf.sh
$ chmod +x qemu-binfmt-conf.sh
$ sudo mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc
$ sudo ./qemu-binfmt-conf.sh --qemu-suffix "-static" --qemu-path "/usr/bin" -p yes

Again, this only works so long as the machine stays up. If it doesn't, you will need to remount and rerun the configuration script. Or if you're using the container method, rerun the container.
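
After running the commands above, the registered interpreter for a given handler can be read back from its binfmt_misc entry. This is a minimal sketch: handler_interpreter is a hypothetical helper name, and the handler name passed in (for example qemu-arm) is an assumption about how the script names its entries.

```shell
#!/bin/sh
# Sketch only: handler_interpreter is a hypothetical helper name.

# Print the interpreter path recorded for one binfmt_misc handler.
# $1: handler name (e.g. qemu-arm), $2: optional binfmt_misc directory.
handler_interpreter() {
    name="$1"
    dir="${2:-/proc/sys/fs/binfmt_misc}"
    # Each entry file contains a line of the form "interpreter <path>".
    sed -n 's/^interpreter //p' "$dir/$name"
}
```

With --qemu-path "/usr/bin" and --qemu-suffix "-static" as used above, I would expect handler_interpreter qemu-arm to print a path like /usr/bin/qemu-arm-static, though I have not confirmed that on every machine version.
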

cdrage commented 4 months ago

We're experiencing this as well on the Podman Desktop side. Looks to be related to qemu, as per @Aetylus's awesome investigation. Running the following:

$ podman machine ssh
$ curl https://raw.githubusercontent.com/qemu/qemu/master/scripts/qemu-binfmt-conf.sh > qemu-binfmt-conf.sh
$ chmod +x qemu-binfmt-conf.sh
$ sudo mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc
$ sudo ./qemu-binfmt-conf.sh --qemu-suffix "-static" --qemu-path "/usr/bin" -p yes

This looks to mitigate the problem for now (until you restart the podman machine).

gajanak commented 4 days ago

For documentation only: another way of fixing the binfmt mounts.

Before:

podman run --rm -it --platform linux/arm docker.io/library/hello-world
{"msg":"exec container process `/hello`: Exec format error","level":"error","time":"2024-09-14T09:36:27.872578Z"}

SSH into your machine with podman machine ssh, or wsl -d...

Fix the binfmt mount options by masking the auto-generated ubuntu services:

systemctl mask proc-sys-fs-binfmt_misc.mount
systemctl mask proc-sys-fs-binfmt_misc.automount
systemctl edit systemd-binfmt.service

In the editor that opens (vi), add the following lines:

[Unit]
ConditionPathIsMountPoint=

[Service]
ExecStartPre=-mount --onlyonce /proc/sys/fs/binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc  -v

After restarting the machine (podman machine stop/start), the binfmt_misc mount under /proc should now be valid, since the service mounts it manually.
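
For anyone who prefers not to paste into an editor, the same override can be written as a drop-in file. This is a sketch under the assumption that systemctl edit stores its override at /etc/systemd/system/systemd-binfmt.service.d/override.conf; write_binfmt_dropin and its optional root argument are hypothetical and only there so the function can be tried against a scratch directory first.

```shell
#!/bin/sh
# Sketch only: write_binfmt_dropin is a hypothetical helper name.

# Write the override for systemd-binfmt.service below an optional root
# directory (empty root means the live system; run as root in that case).
write_binfmt_dropin() {
    root="${1:-}"
    dropin_dir="$root/etc/systemd/system/systemd-binfmt.service.d"
    mkdir -p "$dropin_dir" || return 1
    # Same lines as quoted in the comment above, copied verbatim.
    cat > "$dropin_dir/override.conf" <<'EOF'
[Unit]
ConditionPathIsMountPoint=

[Service]
ExecStartPre=-mount --onlyonce /proc/sys/fs/binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc  -v
EOF
}
```

When written to the live system, systemctl daemon-reload is needed afterwards so that systemd picks up the drop-in.
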

Now the command will work:

podman run --rm -it --platform linux/arm docker.io/library/hello-world

Please comment if anyone has a simpler or cleaner solution, or if this helped you. Perhaps this should go into some Windows WSL FAQ.

PS: I will not blame podman for this. I think this is an incompatibility between the ubuntu images used and WSL.

PS2: As an alternative to resolving DNS failures by adding generateResolvConf = false: in current WSL you can set dnsProxy=false in your %USERPROFILE%/.wslconfig file before doing a podman machine init.