containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Mac M1, Podman build spends a lot of time doing "something" not sure what before the actual build steps. Lots of network usage via gvproxy. Even for a tiny image such as alpine #21802

Closed adrian-moisa closed 6 months ago

adrian-moisa commented 8 months ago

Issue Description

Mac M1, Podman build spends a lot of time doing "something" (not sure what) before the actual build steps. Lots of network usage via gvproxy, even for a tiny image such as alpine. gvproxy shows at least 1GB of traffic for each of these attempts.


I can confirm that this was not a problem for the entire month of February. I did many builds and they were going quite fast, especially when using a multistage build with preloaded module dependencies. It took less than 30s to complete such a build.

At some point I had a walk outside, the mac went into sleep mode and when I returned I found some unusual behaviour in minikube. I thought that maybe I need to reinstall everything. Since the reinstall, all my builds have this minimum 3-4 minutes of hang time.

I tried:

Full uninstall procedure

Eventually I figured out that all I needed to remove was minikube. That's why I sorted the cleanup procedure by category.

# Podman
podman stop -a
podman rm -a
brew uninstall --force podman-desktop
brew uninstall --force podman
rm -rf ~/.local/share/containers
rm -rf ~/.config/containers/
rm ~/.ssh/podman*

# Qemu
brew uninstall --force qemu
rm -rf ~/.qemu
rm -rf ~/.config/qemu

Restart procedure

Note: You have to prepare your .rb files as instructed in the tutorial if you want to roll back.

brew install podman.rb
brew reinstall qemu.rb
brew install podman-desktop

# Double check versions
brew list podman
brew list qemu
brew list podman-desktop

# Init podman
podman machine init
podman machine stop
podman machine set --cpus 2 --memory=8192
podman machine start

Steps to reproduce the issue

Describe the results you received

Describe the results you expected

podman should be working fast right away

podman info output

host:
  arch: arm64
  buildahVersion: 1.32.0
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.8-2.fc39.aarch64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.8, commit: '
  cpuUtilization:
    idlePercent: 97.48
    systemPercent: 1.49
    userPercent: 1.02
  cpus: 4
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: coreos
    version: "39"
  eventLogger: journald
  freeLocks: 2047
  hostname: localhost.localdomain
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
    uidmap:
    - container_id: 0
      host_id: 501
      size: 1
    - container_id: 1
      host_id: 100000
      size: 1000000
  kernel: 6.6.3-200.fc39.aarch64
  linkmode: dynamic
  logDriver: journald
  memFree: 1307017216
  memTotal: 1979428864
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.8.0-1.fc39.aarch64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.8.0
    package: netavark-1.8.0-2.fc39.aarch64
    path: /usr/libexec/podman/netavark
    version: netavark 1.8.0
  ociRuntime:
    name: crun
    package: crun-1.12-1.fc39.aarch64
    path: /usr/bin/crun
    version: |-
      crun version 1.12
      commit: ce429cb2e277d001c2179df1ac66a470f00802ae
      rundir: /run/user/501/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20231119.g4f1709d-1.fc39.aarch64
    version: |
      pasta 0^20231119.g4f1709d-1.fc39.aarch64-pasta
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/501/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.2-1.fc39.aarch64
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 0
  swapTotal: 0
  uptime: 0h 24m 10.00s
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /var/home/core/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 0
    stopped: 1
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /var/home/core/.local/share/containers/storage
  graphRootAllocated: 106769133568
  graphRootUsed: 3025473536
  graphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 2
  runRoot: /run/user/501/containers
  transientStore: false
  volumePath: /var/home/core/.local/share/containers/storage/volumes
version:
  APIVersion: 4.7.2
  Built: 1698762633
  BuiltTime: Tue Oct 31 15:30:33 2023
  GitCommit: ""
  GoVersion: go1.21.1
  Os: linux
  OsArch: linux/arm64
  Version: 4.7.2

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

No response

Additional information

No response

afbjorklund commented 8 months ago

Do you have a big "build context" (the one you pass as the last argument), that needs tarring up before building?

mkdir empty
cp Dockerfile empty
podman build -t test -f Dockerfile empty

https://docs.docker.com/build/building/context/#dockerignore-files
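One rough way to check this before building is to archive the context directory to stdout and count the bytes. This is only a sketch under assumptions: `context_size_bytes` is a hypothetical helper name, it does not apply any .dockerignore or .containerignore rules, and the tarball Podman actually sends to the VM may differ in size.

```shell
# Hypothetical helper: estimate the size in bytes of a build context
# by archiving the directory to stdout and counting the bytes.
# Ignore files are NOT applied, so treat the result as a rough estimate.
context_size_bytes() {
    dir="${1:-.}"
    # -C changes into the directory so the archive holds relative paths;
    # tr strips the padding some wc implementations add.
    tar -cf - -C "$dir" . 2>/dev/null | wc -c | tr -d ' '
}
```

Running `context_size_bytes .` from the directory you normally build from and comparing it with `context_size_bytes empty` should show whether the context is the likely cause.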

adrian-moisa commented 8 months ago

TLDR - podman ran slowly because it was building from the root of the monorepo. Not all heavy folders were ignored in the .dockerignore. I used ncdu to find the heavy ones. Even with the ignore file it still has some lag, but it's way better. It still performs best when building at the individual project folder level instead of the monorepo root. Docker can build at the monorepo root without this delay at all.

Great guess. Indeed, I just did the test, and in an isolated folder the build runs as expected. Near instant for a "FROM alpine".

However there's a catch for me. I'm working in a not so large, not so small monorepo. I decided to run all my builds from the root level of the repo because it's still "work in progress". Most of the Go APIs borrow models from each other. Eventually the shared models will be published to a dedicated artifactory and imported as a dependency, so each Dockerfile could copy only its own API. It's going to be a while until I do this refactor. Until then I need to build from root.

I used ncdu to analyse the folder sizes. The major culprit seems to be the Flutter app folder, especially the build (2GB) and .dart_tool (800MB) folders.
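For anyone without ncdu installed, a similar overview can be approximated with `find`, `du` and `sort`. This is a sketch, not what was run in this thread; `largest_entries` is a hypothetical helper name.

```shell
# Hypothetical helper: list the largest entries directly under a directory,
# biggest first, with sizes in KiB (a non-interactive stand-in for ncdu).
largest_entries() {
    dir="${1:-.}"
    count="${2:-10}"
    # find (rather than a shell glob) so hidden folders such as .dart_tool
    # are included; du -sk prints one KiB total per entry.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk {} + | sort -rn | head -n "$count"
}
```

For example, `largest_entries . 5` run from the monorepo root prints the five heaviest top-level files and folders.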

This also explains why the test on the other Mac was even worse: I ran it from the Desktop folder, where my gf (a non-tech person) stores a lot of files, which means that context was even bigger than my monorepo. Initially I wasn't aware that her Desktop was so full of subfolders.

I tried an improved .dockerignore and it does the trick. It's far faster than no .dockerignore for the build and .dart_tool folders. However I still notice a significant 10-20s of intensive scanning until it figures out what to ignore. With or without logging the ignored files it's still quite some time. Compared with 2-3 minutes prior it's way better. But compared to docker it's way slower. Docker can pull it off in 3-4 seconds from monorepo root. I can also see docker does not do any TCP traffic (no VM i guess, idk).
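A minimal sketch of such an ignore file, written from the monorepo root. Only the `build` and `.dart_tool` folder names come from this thread; the `**/` prefix is an assumption that they can sit at any depth under the root, and the file I actually used had more entries.

```shell
# Write an ignore file at the root of the build context.
# Note: this overwrites any existing .dockerignore in the current directory.
# The two patterns exclude the Flutter build output and tool cache at any depth.
cat > .dockerignore <<'EOF'
**/build
**/.dart_tool
EOF
```

Be aware that `**/build` also excludes any other folder named `build` in the repo, so a narrower path may be safer if the Go projects use that name too.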

It seems that #21627 #20965 both talk about the same issue. I don't recall stumbling on them. Maybe I did but skimmed. Can't say for sure. Tbh also google could do a better job maybe.

Suggestions for improvement: Better logs

I spent 3 days trying to debug this thing. I rarely give up on my debugging. Yesterday I was ready to completely switch to docker. I wouldn't describe myself as an expert in devops, but I'm not a junior either. This problem got the best of me. I never thought of changing the build context. Again, that could be due to the fact that I only have 6 months of running stuff in containers. So my argument is that, despite my past experience, I did not find enough clues in the logs to understand that the build context was the problem.

Side question - Why still stick with podman instead of docker?

This entire experience made me question deeply my choice of using podman. This is not my first fight with podman.

Because this is my first time in the world of containers I chose podman mostly because of the heavy marketing about rootless and of the docker desktop license. Now looking back at the history of choices I wonder if the docker license is really such a big deal. Apparently docker also offers rootless. And anyway, when I deploy my Kubernetes cluster it won't use podman. So basically right now, after all these battle scars I have to question deeply my allegiance to podman. Can you provide a strong reason where podman is still the winner?

Thanks for the support!

eriksjolund commented 8 months ago

Speeding up podman build by using .containerignore or .dockerignore could be a useful performance tip for https://github.com/containers/podman/blob/main/docs/tutorials/performance.md

I started writing such a tip: https://github.com/eriksjolund/podman/blob/add-performance-tip-containerignore/docs/tutorials/performance.md

baude commented 8 months ago

Because this is my first time in the world of containers I chose podman mostly because of the heavy marketing about rootless and of the docker desktop license. Now looking back at the history of choices I wonder if the docker license is really such a big deal. Apparently docker also offers rootless. And anyway, when I deploy my Kubernetes cluster it won't use podman. So basically right now, after all these battle scars I have to question deeply my allegiance to podman. Can you provide a strong reason where podman is still the winner?

@adrian-moisa I read your write-ups. Apparently you had quite an experience. You should feel free to choose your container solution as you see fit; upstream Podman is largely development oriented and as such, you retain the choice. It is no secret that we have had several problems with our Mac implementation; some of which were well out of our control and some of which were bugs, etc. As you mention, or at least I assume you are referring to, upstream QEMU and then subsequently brew migrated to a newer QEMU where the firmware required for aarch64 was missing (or not present ... I forget, frankly). I would say that is out of our hands. As for minikube, that is obviously a different project as well.

I can say that the future for Apple silicon Macs looks bright for podman. We are currently wrapping up development for Podman 5, which no longer uses QEMU and will use Apple's native hypervisor. As such, we are also able to take advantage of virtiofs shares, which are significantly more performant than QEMU and the 9P implementation.

As this issue stands, I'm not sure what you are asking of the Podman community and as such, it could theoretically get closed. I would recommend altering the title of the issue to reflect what you think is wrong or questionable. Also, you took the time to write this issue; perhaps you can take the next step and submit documentation or code that addresses problems? We love community submissions.

Whatever you choose, good luck with your container adventures!

github-actions[bot] commented 7 months ago

A friendly reminder that this issue had no activity for 30 days.

Luap99 commented 6 months ago

I am closing this as a dup of https://github.com/containers/podman/issues/21627

Also, there have been quite a few contributions recently to make sending/extracting the build tarball faster.