containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

podman excessively reads content of every file in the context directory on Windows. #20965

Closed gavenkoa closed 9 months ago

gavenkoa commented 9 months ago

Issue Description

When I run podman build on Windows 10 from a directory containing other projects, podman recursively reads all the subdirectories. With a large hierarchy this leads to minutes of delay.

bash# time podman build -t test -f test/Containerfile .
STEP 1/6: FROM node:20.9-slim
STEP 2/6: USER root
--> Using cache 08b6013ca03e34c15ade8d8dee0df20a86850333b70d4310ef3e38dd2df59d0f
--> 08b6013ca03e
STEP 3/6: ENTRYPOINT ["/bin/bash"]
--> Using cache eb883179fddac5d90ec68240c20aef19d738947b6b13bb73e44df9bbec7f0536
--> eb883179fdda
STEP 4/6: RUN mkdir /work
--> Using cache 6de7fba8c872c0e52f450772fd23861adf6da2537cabcc7d8f6061a81c48a86b
--> 6de7fba8c872
STEP 5/6: WORKDIR /work
--> Using cache 0e86de17aaed2c7c7a91be47ca08d6841cbf2efece46668f7788cda7e650bd58
--> 0e86de17aaed
STEP 6/6: COPY proj .
COMMIT test
--> 1475d186d920
Successfully tagged localhost/test:latest
1475d186d920fdf7b30a53a8cbfd5155a41dc018c9c7a9206223b6a8b9b9aedc

real    7m36.431s
user    0m0.016s
sys     0m0.031s

I see the recursive behavior in procmon.exe. For every file (and similarly for every directory) it performs:

IRP_MJ_CREATE   C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_param.cjs SUCCESS Desired Access: Generic Read, Disposition: Open, Options: Synchronous IO Non-Alert, Attributes: R, ShareMode: Read, Write, AllocationSize: n/a, OpenResult: Opened  "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
IRP_MJ_QUERY_INFORMATION    C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_param.cjs SUCCESS Type: QueryBasicInformationFile, CreationTime: 2023-10-12 10:50:01, LastAccessTime: 2023-12-10 15:23:26, LastWriteTime: 2023-10-12 10:50:01, ChangeTime: 2023-10-12 10:50:01, FileAttributes: A "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
IRP_MJ_QUERY_INFORMATION    C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_param.cjs SUCCESS Type: QueryStandardInformationFile, AllocationSize: 80, EndOfFile: 73, NumberOfLinks: 1, DeletePending: False, Directory: False "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
FASTIO_READ C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_param.cjs SUCCESS Offset: 0, Length: 73   "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
FASTIO_READ C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_param.cjs SUCCESS Offset: 0, Length: 73   "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
FASTIO_READ C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_param.cjs END OF FILE Offset: 73, Length: 32,768  "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
IRP_MJ_QUERY_INFORMATION    C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_param.cjs SUCCESS Type: QueryBasicInformationFile, CreationTime: 2023-10-12 10:50:01, LastAccessTime: 2023-12-10 15:23:26, LastWriteTime: 2023-10-12 10:50:01, ChangeTime: 2023-10-12 10:50:01, FileAttributes: A "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
IRP_MJ_QUERY_INFORMATION    C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_param.cjs SUCCESS Type: QueryStandardInformationFile, AllocationSize: 80, EndOfFile: 73, NumberOfLinks: 1, DeletePending: False, Directory: False "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
FASTIO_READ C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_param.cjs SUCCESS Offset: 0, Length: 73   "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
IRP_MJ_CLEANUP  C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_param.cjs SUCCESS     "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
IRP_MJ_CREATE   C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_values.cjs    SUCCESS Desired Access: Generic Read, Disposition: Open, Options: Synchronous IO Non-Alert, Attributes: R, ShareMode: Read, Write, AllocationSize: n/a, OpenResult: Opened  "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
IRP_MJ_QUERY_INFORMATION    C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_values.cjs    SUCCESS Type: QueryBasicInformationFile, CreationTime: 2023-10-12 10:50:01, LastAccessTime: 2023-12-10 15:23:26, LastWriteTime: 2023-10-12 10:50:01, ChangeTime: 2023-10-12 10:50:01, FileAttributes: A "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
IRP_MJ_QUERY_INFORMATION    C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_values.cjs    SUCCESS Type: QueryStandardInformationFile, AllocationSize: 80, EndOfFile: 75, NumberOfLinks: 1, DeletePending: False, Directory: False "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
FASTIO_READ C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_values.cjs    SUCCESS Offset: 0, Length: 75   "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
FASTIO_READ C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_values.cjs    SUCCESS Offset: 0, Length: 75   "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
FASTIO_READ C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_values.cjs    END OF FILE Offset: 75, Length: 32,768  "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
IRP_MJ_QUERY_INFORMATION    C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_values.cjs    SUCCESS Type: QueryBasicInformationFile, CreationTime: 2023-10-12 10:50:01, LastAccessTime: 2023-12-10 15:23:26, LastWriteTime: 2023-10-12 10:50:01, ChangeTime: 2023-10-12 10:50:01, FileAttributes: A "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
IRP_MJ_QUERY_INFORMATION    C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_values.cjs    SUCCESS Type: QueryStandardInformationFile, AllocationSize: 80, EndOfFile: 75, NumberOfLinks: 1, DeletePending: False, Directory: False "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
FASTIO_READ C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_values.cjs    SUCCESS Offset: 0, Length: 75   "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .
IRP_MJ_CLEANUP  C:\Users\user\work\proj\node_modules\@swc\helpers\cjs\_ts_values.cjs    SUCCESS     "C:\opt\scoop\apps\podman\current\podman.exe"  build -t test -f test/Containerfile .

Steps to reproduce the issue

Run podman build -f some/Containerfile . from a directory with lots of files & directories (node_modules is a good candidate xD).

Describe the results you received

Slow podman build execution.

Describe the results you expected

I expect podman build to be nearly instantaneous.

podman info output

host:
  arch: amd64
  buildahVersion: 1.32.0
  cgroupControllers:
  - cpuset
  - cpu
  - io
  - memory
  - hugetlb
  - pids
  - rdma
  - misc
  cgroupManager: cgroupfs
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.7-2.fc38.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.7, commit: '
  cpuUtilization:
    idlePercent: 99.65
    systemPercent: 0.23
    userPercent: 0.13
  cpus: 3
  databaseBackend: boltdb
  distribution:
    distribution: fedora
    variant: container
    version: "38"
  eventLogger: journald
  freeLocks: 2033
  hostname: msi
  idMappings:
    gidmap: null
    uidmap: null
  kernel: 5.15.123.1-microsoft-standard-WSL2
  linkmode: dynamic
  logDriver: journald
  memFree: 2013302784
  memTotal: 3054526464
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.8.0-1.fc38.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.8.0
    package: netavark-1.8.0-2.fc38.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.8.0
  ociRuntime:
    name: crun
    package: crun-1.9.2-1.fc38.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.9.2
      commit: 35274d346d2e9ffeacb22cc11590b0266a23d634
      rundir: /run/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20230908.g05627dc-1.fc38.x86_64
    version: |
      pasta 0^20230908.g05627dc-1.fc38.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: false
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.2.1-1.fc38.x86_64
    version: |-
      slirp4netns version 1.2.1
      commit: 09e31e92fa3d2a1d3ca261adaeb012c8d75a8194
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.3
  swapFree: 1072156672
  swapTotal: 1073741824
  uptime: 3h 4m 10.00s (Approximately 0.12 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - docker.io
store:
  configFile: /usr/share/containers/storage.conf
  containerStore:
    number: 10
    paused: 0
    running: 0
    stopped: 10
  graphDriverName: overlay
  graphOptions:
    overlay.mountopt: nodev,metacopy=on
  graphRoot: /var/lib/containers/storage
  graphRootAllocated: 269427478528
  graphRootUsed: 8990408704
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "true"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 47
  runRoot: /run/containers/storage
  transientStore: false
  volumePath: /var/lib/containers/storage/volumes
version:
  APIVersion: 4.7.0
  Built: 1695839078
  BuiltTime: Wed Sep 27 21:24:38 2023
  GitCommit: ""
  GoVersion: go1.20.8
  Os: linux
  OsArch: linux/amd64
  Version: 4.7.0

Podman in a container

No

Privileged Or Rootless

None

Upstream Latest Release

Yes

Additional environment details

Win 10 + WSL 2.

Additional information

I'm facing #18840, so to include files external to the build directory I moved the context up a level, and now unrelated directories are scanned by podman...

My final step is STEP 6/6: COPY test . - so podman shouldn't read any non "test" files.

gavenkoa commented 9 months ago

I want to restate that podman scans the entire context directory hierarchy before it moves to any STEP! This is visible in procmon.exe. The steps are printed to the screen much later (~5 min), including the COPY command, which is the one that might be expected to trigger the scan for files.
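A rough way to gauge how much the client has to walk before the first STEP is to count the regular files under the context directory. This is only a sketch: count_context_files is a hypothetical helper name, not a podman command, and the count says nothing about how podman itself reads the files.

```shell
# count_context_files DIR: print the number of regular files under DIR.
# Hypothetical helper for illustration; podman does not provide it.
count_context_files() {
  find "$1" -type f | wc -l | tr -d ' '
}

# Report the count for the current directory, i.e. the "." passed to podman build.
echo "files in context: $(count_context_files .)"
```

Running it from the directory passed as the build context gives a quick idea of whether node_modules or sibling projects dominate the file count.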

rhatdan commented 9 months ago

It is actually copying the entire context directory into the VM to be able to build the container. It cannot figure out what is and is not needed in the context directory until the Containerfile is processed. I believe you can use .containerignore or .dockerignore to cut down on the number of files copied.
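A minimal sketch of such an ignore file, written from the shell at the root of the build context. The two patterns are assumptions chosen to match the directories discussed in this thread (.git and node_modules), and the sketch assumes that .containerignore follows the same pattern syntax as .dockerignore, including ** for "any depth".

```shell
# Create .containerignore in the current directory (the build context root).
# Patterns are illustrative; adjust them to the project layout.
cat > .containerignore <<'EOF'
.git
**/node_modules
EOF

# Show what was written.
cat .containerignore
```

Whether this removes the delay entirely depends on how the client applies the patterns while walking the tree, so it is worth timing the build again afterwards.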

rhatdan commented 9 months ago

It would be much faster if you just moved all of the content required for the build into the test directory and then just ran the build from there.
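One possible way to follow this suggestion without moving the project by hand is to stage a trimmed copy of it and use that copy as the context. The sketch below assumes a tar that supports --exclude (GNU tar and bsdtar both do); stage_context and the directory names are hypothetical.

```shell
# stage_context SRC DEST: copy SRC into DEST, skipping .git and node_modules,
# so that DEST can serve as a small build context.
# Hypothetical helper for illustration only.
stage_context() {
  src=$1
  dest=$2
  mkdir -p "$dest"
  # Stream a tar archive from SRC straight into DEST; no intermediate file is written.
  tar -cf - --exclude=.git --exclude=node_modules -C "$src" . | tar -xf - -C "$dest"
}
```

After something like stage_context proj test/proj, the build could be run with the test directory as the context, so only the staged files are sent to the VM.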

gavenkoa commented 9 months ago

I suspected such behavior, as I saw intensive TCP communication.

The spec https://docs.docker.com/build/building/context/#filesystem-contexts

What is a build context? When your build context is a local directory, a remote Git repository, or a tar file, then that becomes the set of files that the builder can access during the build. A filesystem build context is processed recursively: When you specify a local directory or a tarball, all subdirectories are included.

Local context: This makes files and directories in the current working directory available to the builder. The builder loads the files it needs from the build context when needed.

omits a description of this behavior. The wording "makes files and directories in the current working directory available to the builder" is evasive, leaving room for further optimization, such as not copying the data prematurely and taking only what is referenced in COPY / ADD...

gavenkoa commented 9 months ago

Anyway I faced the issue only because #18840 is not yet fixed.

As a workaround I'll use cp or even tar, as .containerignore is an extra complication: I'd need to understand the syntax, and it is quite a weak tool, while I am familiar with tar...

Does it make sense to update the docs? It looks like copying the context dir to a VM is a WSL 2 specific operation... Will I face this enormous delay on Linux?

gavenkoa commented 9 months ago

The ADD command understands tar archives, so the best way is to place a tar file in the build context:

tar -zcf my.tar.gz --exclude=.git --exclude=node_modules -C ../ui-proj .
podman build --tag test .

and in Containerfile:

RUN mkdir -p /work
WORKDIR /work
ADD my.tar.gz .

The Docker documentation also mentions that every file from the build context is copied to the Docker server...