containers / buildah

A tool that facilitates building OCI images.
https://buildah.io
Apache License 2.0
7.46k stars 786 forks source link

devcontainer features are not cached #5243

Open likan999 opened 10 months ago

likan999 commented 10 months ago

Description When building devcontainer with features, Podman doesn't use the cached layers for features, which makes it very slow.

Steps to reproduce the issue:

  1. On remote Linux server, install Podman;
  2. On local machine, start VSCode, install Remote Development extensions pack;
  3. Modify VSCode setting so that it uses Podman instead of Docker:
    "dev.containers.dockerPath": "podman",
    "dev.containers.dockerComposePath": "podman-compose",

    Note: if the local server only has Docker installed (e.g. OSX with Docker Desktop), you may need to create shells script named podman that exec docker "$@", and podman-compose that exec docker-compose "$@". I don't know whether this is strictly needed, but I'll include it here just in case you encounter errors. Note: I think it should also reproducible if you just start VSCode on remote server, but I haven't tested this setup.

  4. In VSCode, connect to the remote server (e.g. using the Remote Explorer on the left side panel), select some folder, and then Dev Containers: Add Dev Container Configuration Files..., follow the steps to create a dev container. Make sure add some features when creating it. On my side, I chose 'Ubuntu', 'jammy' and then added 'Go' feature.
  5. Once it is done, in VSCode, do it again Dev Containers: Rebuild Container. And you will observe the layers are rebuilt again and not cached.

I've tried on OSX with Docker Desktop, the same command (except it uses docker) is run and the layers are cached when rebuilding:

podman buildx build --load --build-context dev_containers_feature_content_source=/tmp/devcontainercli-likan/container-features/0.54.1-1703879861696 --build-arg _DEV_CONTAINERS_BASE_IMAGE=mcr.microsoft.com/devcontainers/base:jammy --build-arg _DEV_CONTAINERS_IMAGE_USER=root --build-arg _DEV_CONTAINERS_FEATURE_CONTENT_SOURCE=dev_container_feature_content_temp --target dev_containers_target_stage -t vsc-develop-8218a1ef680c1bc93ab2500826039bac026343eea0536255833c9bb7072f0f37-features -f /tmp/devcontainercli-likan/container-features/0.54.1-1703879861696/Dockerfile.extended /tmp/devcontainercli-likan/empty-folder

The above command is just for reference and will not run on your environment.

Describe the results you received: The feature layer is not cached and needs a rebuild.

Describe the results you expected: The feature layer should be cached and no rebuild is needed.

Output of rpm -q buildah or apt list buildah:

Listing... Done
buildah/unknown 1.31.3-0ubuntu22.04+obs45.12 amd64
buildah/unknown 1.31.3-0ubuntu22.04+obs45.12 arm64
buildah/unknown 1.31.3-0ubuntu22.04+obs45.12 s390x

Output of buildah version:

Command 'buildah' not found, but can be installed with:
sudo apt install buildah

Output of podman version if reporting a podman build issue:

Client:       Podman Engine
Version:      4.6.2
API Version:  4.6.2
Go Version:   go1.18.1
Built:        Wed Dec 31 16:00:00 1969
OS/Arch:      linux/amd64

*Output of `cat /etc/release`:**

DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.3 LTS"
PRETTY_NAME="Ubuntu 22.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.3 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

Output of uname -a:

Linux rs1 6.2.0-39-generic #40~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 16 10:53:04 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf:

# This file is the configuration file for all tools
# that use the containers/storage library. The storage.conf file
# overrides all other storage.conf files. Container engines using the
# container/storage library do not inherit fields from other storage.conf
# files.
#
#  Note: The storage.conf file overrides other storage.conf files based on this precedence:
#      /usr/containers/storage.conf
#      /etc/containers/storage.conf
#      $HOME/.config/containers/storage.conf
#      $XDG_CONFIG_HOME/containers/storage.conf (If XDG_CONFIG_HOME is set)
# See man 5 containers-storage.conf for more information
# The "container storage" table contains all of the server options.
[storage]

# Default Storage Driver, Must be set for proper operation.
driver = "overlay"

# Temporary storage location
runroot = "/run/containers/storage"

# Primary Read/Write location of container storage
# When changing the graphroot location on an SELINUX system, you must
# ensure  the labeling matches the default locations labels with the
# following commands:
# semanage fcontext -a -e /var/lib/containers/storage /NEWSTORAGEPATH
# restorecon -R -v /NEWSTORAGEPATH
graphroot = "/var/lib/containers/storage"

# Storage path for rootless users
#
# rootless_storage_path = "$HOME/.local/share/containers/storage"

# Transient store mode makes all container metadata be saved in temporary storage
# (i.e. runroot above). This is faster, but doesn't persist across reboots.
# transient_store = true

[storage.options]
# Storage options to be passed to underlying storage drivers

# AdditionalImageStores is used to pass paths to additional Read/Only image stores
# Must be comma separated list.
additionalimagestores = [
  "/mnt/mass/containers/podman/storage/root"
]

# Allows specification of how storage is populated when pulling images. This
# option can speed the pulling process of images compressed with format
# zstd:chunked. Containers/storage looks for files within images that are being
# pulled from a container registry that were previously pulled to the host.  It
# can copy or create a hard link to the existing file when it finds them,
# eliminating the need to pull them from the container registry. These options
# can deduplicate pulling of content, disk storage of content and can allow the
# kernel to use less memory when running containers.

# containers/storage supports four keys
#   * enable_partial_images="true" | "false"
#     Tells containers/storage to look for files previously pulled in storage
#     rather then always pulling them from the container registry.
#   * use_hard_links = "false" | "true"
#     Tells containers/storage to use hard links rather then create new files in
#     the image, if an identical file already existed in storage.
#   * ostree_repos = ""
#     Tells containers/storage where an ostree repository exists that might have
#     previously pulled content which can be used when attempting to avoid
#     pulling content from the container registry
pull_options = {enable_partial_images = "false", use_hard_links = "false", ostree_repos=""}

# Remap-UIDs/GIDs is the mapping from UIDs/GIDs as they should appear inside of
# a container, to the UIDs/GIDs as they should appear outside of the container,
# and the length of the range of UIDs/GIDs.  Additional mapped sets can be
# listed and will be heeded by libraries, but there are limits to the number of
# mappings which the kernel will allow when you later attempt to run a
# container.
#
# remap-uids = 0:1668442479:65536
# remap-gids = 0:1668442479:65536

# Remap-User/Group is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid or /etc/subgid file.  Mappings are set up starting
# with an in-container ID of 0 and then a host-level ID taken from the lowest
# range that matches the specified name, and using the length of that range.
# Additional ranges are then assigned, using the ranges which specify the
# lowest host-level IDs first, to the lowest not-yet-mapped in-container ID,
# until all of the entries have been used for maps.
#
# remap-user = "containers"
# remap-group = "containers"

# Root-auto-userns-user is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid and /etc/subgid file.  These ranges will be partitioned
# to containers configured to create automatically a user namespace.  Containers
# configured to automatically create a user namespace can still overlap with containers
# having an explicit mapping set.
# This setting is ignored when running as rootless.
# root-auto-userns-user = "storage"
#
# Auto-userns-min-size is the minimum size for a user namespace created automatically.
# auto-userns-min-size=1024
#
# Auto-userns-max-size is the minimum size for a user namespace created automatically.
# auto-userns-max-size=65536

[storage.options.overlay]
# ignore_chown_errors can be set to allow a non privileged user running with
# a single UID within a user namespace to run containers. The user can pull
# and use any image even those with multiple uids.  Note multiple UIDs will be
# squashed down to the default uid in the container.  These images will have no
# separation between the users in the container. Only supported for the overlay
# and vfs drivers.
#ignore_chown_errors = "false"

# Inodes is used to set a maximum inodes of the container image.
# inodes = ""

# Path to an helper program to use for mounting the file system instead of mounting it
# directly.
#mount_program = "/usr/bin/fuse-overlayfs"

# mountopt specifies comma separated list of extra mount options
mountopt = "nodev,metacopy=on"

# Set to skip a PRIVATE bind mount on the storage home directory.
# skip_mount_home = "false"

# Size is used to set a maximum size of the container image.
# size = ""

# ForceMask specifies the permissions mask that is used for new files and
# directories.
#
# The values "shared" and "private" are accepted.
# Octal permission masks are also accepted.
#
#  "": No value specified.
#     All files/directories, get set with the permissions identified within the
#     image.
#  "private": it is equivalent to 0700.
#     All files/directories get set with 0700 permissions.  The owner has rwx
#     access to the files. No other users on the system can access the files.
#     This setting could be used with networked based homedirs.
#  "shared": it is equivalent to 0755.
#     The owner has rwx access to the files and everyone else can read, access
#     and execute them. This setting is useful for sharing containers storage
#     with other users.  For instance have a storage owned by root but shared
#     to rootless users as an additional store.
#     NOTE:  All files within the image are made readable and executable by any
#     user on the system. Even /etc/shadow within your image is now readable by
#     any user.
#
#   OCTAL: Users can experiment with other OCTAL Permissions.
#
#  Note: The force_mask Flag is an experimental feature, it could change in the
#  future.  When "force_mask" is set the original permission mask is stored in
#  the "user.containers.override_stat" xattr and the "mount_program" option must
#  be specified. Mount programs like "/usr/bin/fuse-overlayfs" present the
#  extended attribute permissions to processes within containers rather than the
#  "force_mask"  permissions.
#
# force_mask = ""

[storage.options.thinpool]
# Storage Options for thinpool

# autoextend_percent determines the amount by which pool needs to be
# grown. This is specified in terms of % of pool size. So a value of 20 means
# that when threshold is hit, pool will be grown by 20% of existing
# pool size.
# autoextend_percent = "20"

# autoextend_threshold determines the pool extension threshold in terms
# of percentage of pool size. For example, if threshold is 60, that means when
# pool is 60% full, threshold has been hit.
# autoextend_threshold = "80"

# basesize specifies the size to use when creating the base device, which
# limits the size of images and containers.
# basesize = "10G"

# blocksize specifies a custom blocksize to use for the thin pool.
# blocksize="64k"

# directlvm_device specifies a custom block storage device to use for the
# thin pool. Required if you setup devicemapper.
# directlvm_device = ""

# directlvm_device_force wipes device even if device already has a filesystem.
# directlvm_device_force = "True"

# fs specifies the filesystem type to use for the base device.
# fs="xfs"

# log_level sets the log level of devicemapper.
# 0: LogLevelSuppress 0 (Default)
# 2: LogLevelFatal
# 3: LogLevelErr
# 4: LogLevelWarn
# 5: LogLevelNotice
# 6: LogLevelInfo
# 7: LogLevelDebug
# log_level = "7"

# min_free_space specifies the min free space percent in a thin pool require for
# new device creation to succeed. Valid values are from 0% - 99%.
# Value 0% disables
# min_free_space = "10%"

# mkfsarg specifies extra mkfs arguments to be used when creating the base
# device.
# mkfsarg = ""

# metadata_size is used to set the `pvcreate --metadatasize` options when
# creating thin devices. Default is 128k
# metadata_size = ""

# Size is used to set a maximum size of the container image.
# size = ""

# use_deferred_removal marks devicemapper block device for deferred removal.
# If the thinpool is in use when the driver attempts to remove it, the driver
# tells the kernel to remove it as soon as possible. Note this does not free
# up the disk space, use deferred deletion to fully remove the thinpool.
# use_deferred_removal = "True"

# use_deferred_deletion marks thinpool device for deferred deletion.
# If the device is busy when the driver attempts to delete it, the driver
# will attempt to delete device every 30 seconds until successful.
# If the program using the driver exits, the driver will continue attempting
# to cleanup the next time the driver is used. Deferred deletion permanently
# deletes the device and all data stored in device will be lost.
# use_deferred_deletion = "True"

# xfs_nospace_max_retries specifies the maximum number of retries XFS should
# attempt to complete IO when ENOSPC (no space) error is returned by
# underlying storage device.
# xfs_nospace_max_retries = "0"
flouthoc commented 10 months ago

@likan999 The command which you have pasted in description, does it produces build without layers or is it only happening when vscode is trying to reach podman ?

likan999 commented 10 months ago

@flouthoc thanks for your quick response. I've attached the log file when I tried to re-run the command on remote server (but with --log-level=trace in case it helps). It looks like all previous steps are hit in the cache, but started to not use the cache for this command

[2/2] STEP 9/13: RUN --mount=type=bind,from=dev_containers_feature_content_source,source=node_0,target=/tmp/build-features-src/node_0     cp -ar /tmp/build-features-src/node_0 /tmp/dev-container-features  && chmod -R 0755 /tmp/dev-container-features/node_0  && cd /tmp/dev-container-features/node_0  && chmod +x ./devcontainer-features-install.sh  && ./devcontainer-features-install.sh  && rm -rf /tmp/dev-container-features/node_0

On OSX Docker Desktop the same RUN command is cached.

podman.log

github-actions[bot] commented 9 months ago

A friendly reminder that this issue had no activity for 30 days.

morgaesis commented 8 months ago

Can confirm that the same line is never cached for me.

### podman version
Client:       Podman Engine
Version:      4.8.3
API Version:  4.8.3
Go Version:   go1.20.12
Built:        Wed Jan  3 14:11:31 2024
OS/Arch:      linux/amd64

### /etc/*release
Fedora release 38 (Thirty Eight)
NAME="Fedora Linux"
VERSION="38.20240124.0 (Silverblue)"
ID=fedora
VERSION_ID=38
VERSION_CODENAME=""
PLATFORM_ID="platform:f38"
PRETTY_NAME="Fedora Linux 38.20240124.0 (Silverblue)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:38"
DEFAULT_HOSTNAME="fedora"
HOME_URL="https://silverblue.fedoraproject.org"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora-silverblue/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://github.com/fedora-silverblue/issue-tracker/issues"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=38
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=38
SUPPORT_END=2024-05-14
VARIANT="Silverblue"
VARIANT_ID=silverblue
OSTREE_VERSION='38.20240124.0'
Fedora release 38 (Thirty Eight)
Fedora release 38 (Thirty Eight)
aaronlehmann commented 7 months ago

I wonder if #5445 helps with this.

zentasumu commented 5 months ago

5445 does not appear to fix the issue.

The feature layer is still built from scratch.

### /etc/*release

Fedora release 40 (Forty)
NAME="Fedora Linux"
VERSION="40.20240614.0 (Sway Atomic)"
ID=fedora
VERSION_ID=40
VERSION_CODENAME=""
PLATFORM_ID="platform:f40"
PRETTY_NAME="Fedora Linux 40.20240614.0 (Sway Atomic)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:40"
DEFAULT_HOSTNAME="fedora"
HOME_URL="https://fedoraproject.org/atomic-desktops/sway/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora-sericea/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://gitlab.com/fedora/sigs/sway/SIG/-/issues"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=40
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=40
SUPPORT_END=2025-05-13
VARIANT="Sway Atomic"
VARIANT_ID=sway-atomic
OSTREE_VERSION='40.20240614.0'
Fedora release 40 (Forty)
Fedora release 40 (Forty)
### podman version

Client:       Podman Engine
Version:      5.1.0
API Version:  5.1.0
Go Version:   go1.22.3
Built:        Wed May 29 08:00:00 2024
OS/Arch:      linux/amd64
### podman info

host:
  arch: amd64
  buildahVersion: 1.36.0
  cgroupControllers:
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.10-1.fc40.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.10, commit: '
  cpuUtilization:
    idlePercent: 97.55
    systemPercent: 0.85
    userPercent: 1.6
  cpus: 16
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: sway-atomic
    version: "40"
  eventLogger: journald
  freeLocks: 2046
  hostname: fedora
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 524288
      size: 65536
  kernel: 6.8.11-300.fc40.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 2070827008
  memTotal: 16471666688
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.10.0-1.fc40.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.10.0
    package: netavark-1.10.3-3.fc40.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.10.3
  ociRuntime:
    name: crun
    package: crun-1.15-1.fc40.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.15
      commit: e6eacaf4034e84185fd8780ac9262bbf57082278
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240510.g7288448-1.fc40.x86_64
    version: |
      pasta 0^20240510.g7288448-1.fc40.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.2-2.fc40.x86_64
    version: |-
      slirp4netns version 1.2.2
      commit: 0ee2d87523e906518d34a6b423271e4826f71faf
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.5
  swapFree: 8588095488
  swapTotal: 8589930496
  uptime: 1h 43m 33.00s (Approximately 0.04 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io

...

version:
  APIVersion: 5.1.0
  Built: 1716940800
  BuiltTime: Wed May 29 08:00:00 2024
  GitCommit: ""
  GoVersion: go1.22.3
  Os: linux
  OsArch: linux/amd64
  Version: 5.1.0
samsaket7 commented 4 months ago

Yup. Same for me. The feature layer is built from scratch.

zentasumu commented 3 days ago

I have instead been using DevPod's implementation of the devcontainer specification and it does re-use the cache with a podman provider setup when doing a workspace rebuild/reset.

That may be a possible trail to look into, just unsure at this point if the issue should be under buildah or the devcontainer extension/cli.

I will test again with the latest version of the devcontainer extension/cli and podman as it has been a while since I used such setup.