containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Stuck on Copying blob #8387

Closed · xinredhat closed this issue 3 years ago

xinredhat commented 3 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description: I am using podman 2.1.1 on my Mac laptop; podman pull gets stuck at "Copying blob".

Steps to reproduce the issue:

  1. podman pull brew.registry.redhat.io/rhmtc/openshift-migration-rhel7-operator@sha256:9b90ee8ad3c39df9f698164accc93720da9009db6a5f54be7aece258e353d578 --log-level=debug

Describe the results you received:

INFO[0000] podman filtering at log level debug
DEBU[0000] Called pull.PersistentPreRunE(podman pull brew.registry.redhat.io/rhmtc/openshift-migration-rhel7-operator@sha256:9b90ee8ad3c39df9f698164accc93720da9009db6a5f54be7aece258e353d578 --log-level=debug)
DEBU[0000] public key signer enabled for identity "/Users/jiangjax/ocp/fedora-box/.vagrant/machines/default/virtualbox/private_key"
DEBU[0000] Found SSH_AUTH_SOCK "/private/tmp/com.apple.launchd.BrdMUXnYzX/Listeners", ssh-agent signer enabled
DEBU[0001] public key signer enabled for identity "/Users/jiangjax/ocp/fedora-box/.vagrant/machines/default/virtualbox/private_key"
DEBU[0001] Found SSH_AUTH_SOCK "/private/tmp/com.apple.launchd.BrdMUXnYzX/Listeners", ssh-agent signer enabled
INFO[0001] Setting parallel job count to 13
Trying to pull brew.registry.redhat.io/rhmtc/openshift-migration-rhel7-operator@sha256:9b90ee8ad3c39df9f698164accc93720da9009db6a5f54be7aece258e353d578...
Getting image source signatures
Copying blob sha256:dc0ebc51f8157103985596897d0652db6d185120487213322ace8466605919c3
Copying blob sha256:64dc44d45bc603b07167e2f08247bd15077c8d76c344919e0375ad2eb9a93892
Copying blob sha256:2bd25ca124579d6fce8668ff5d4ed83866d7e7438cb561a51ddde8cc40272822
Copying blob sha256:1323a241cc068f2816dd88f00168be73339471d6dc6eb2e6c761b63b734501b6
Copying blob sha256:ad2b36da9b5725e7721c2613087e97b28a6c7d6b487df8104d32e453c93f8b15
Copying blob sha256:e68bdcb9ee3fd3039b05f94d93dd48bd62f4eb9c87a9237bce27af00b2c535f3

Describe the results you expected: I would expect it to pull down the image.

Additional information you deem important (e.g. issue happens only occasionally): In order to use podman, I installed Vagrant to bring up a virtual machine as the podman server (the client-to-VM connection setup is sketched after the Vagrantfile).

$ cat Vagrantfile
Vagrant.configure("2") do |config|
  config.vm.box = "fedora/32-cloud-base"

  config.vm.provider "virtualbox" do |vb|
    vb.memory = "2048"
  end

  config.vm.provision "shell", privileged: false, inline: <<-SHELL
    sudo yum install -y podman
    systemctl enable --user podman.socket
    systemctl start --user podman.socket
  SHELL
end
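
A typical way to point the macOS remote client at the user socket inside such a Vagrant VM looks roughly like the following. The connection name and the forwarded SSH port (2222) are placeholders for a default Vagrant/VirtualBox setup, not taken from this report; the identity path matches the one in the debug log above.

$ # on the Mac: register the VM's user socket as an ssh connection for the remote client
$ podman system connection add vagrant-vm \
    --identity ~/ocp/fedora-box/.vagrant/machines/default/virtualbox/private_key \
    ssh://vagrant@127.0.0.1:2222/run/user/1000/podman/podman.sock
$ podman system connection list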

Output of podman version:

$ podman version
Client:
Version:      2.1.1
API Version:  2.0.0
Go Version:   go1.15.5
Built:        Mon Nov 16 05:22:13 2020
OS/Arch:      darwin/amd64

Server:
Version:      2.1.1
API Version:  2.0.0
Go Version:   go1.14.9
Built:        Thu Oct  1 03:31:11 2020
OS/Arch:      linux/amd64

Output of podman info --debug:

$ podman info --debug
host:
  arch: amd64
  buildahVersion: 1.16.1
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.21-2.fc32.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.21, commit: 81d18b6c3ffc266abdef7ca94c1450e669a6a388'
  cpus: 1
  distribution:
    distribution: fedora
    version: "32"
  eventLogger: journald
  hostname: ibm-p8-kvm-03-guest-02.virt.pnr.lab.eng.rdu2.redhat.com
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.6.6-300.fc32.x86_64
  linkmode: dynamic
  memFree: 1526751232
  memTotal: 2075815936
  ociRuntime:
    name: crun
    package: crun-0.15.1-1.fc32.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.15.1
      commit: eb0145e5ad4d8207e84a327248af76663d4e50dd
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.4-1.fc32.x86_64
    version: |-
      slirp4netns version 1.1.4
      commit: b66ffa8e262507e37fca689822d23430f3357fe8
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 2
  swapFree: 0
  swapTotal: 0
  uptime: 11m 30.67s
registries:
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
store:
  configFile: /home/vagrant/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.2.0-1.fc32.x86_64
      Version: |-
        fusermount3 version: 3.9.1
        fuse-overlayfs: version 1.1.0
        FUSE library version 3.9.1
        using FUSE kernel interface version 7.31
  graphRoot: /home/vagrant/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 1
  runRoot: /run/user/1000/containers
  volumePath: /home/vagrant/.local/share/containers/storage/volumes
version:
  APIVersion: 2.0.0
  Built: 1601494271
  BuiltTime: Wed Sep 30 19:31:11 2020
  GitCommit: ""
  GoVersion: go1.14.9
  OsArch: linux/amd64
  Version: 2.1.1

Package info (e.g. output of rpm -q podman or apt list podman):

(paste your output here)

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide?

Yes

Additional environment details (AWS, VirtualBox, physical, etc.):

topas-rec commented 3 years ago

Is this a duplicate of https://github.com/containers/podman/issues/7963?

xinredhat commented 3 years ago

No, this did not run into a pull loop, and the output never showed "Writing manifest to image destination". It seems the "Copying blob" step never finishes.
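
(One quick way to check whether the blobs are still downloading at all is to watch the server's storage grow inside the VM; the graphRoot path is taken from the podman info output below, the rest is just a sketch:)

$ vagrant ssh
$ # blob data lands under the server's graphRoot while the pull is in flight;
$ # re-run this a few times -- if the size stops growing, the copy has stalled
$ du -sh ~/.local/share/containers/storage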

mheon commented 3 years ago

@vrothberg @mtrmac PTAL

vrothberg commented 3 years ago

Thanks for reaching out, @XinRedhat.

Can you share the debug logs from the podman server? If you do a podman images in another terminal, do you see the image?
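
(For reference, one way to capture server-side debug logs inside the VM; podman.service here refers to the user unit behind podman.socket, which may be named differently on other setups:)

$ vagrant ssh
$ # follow the socket-activated API service's log while re-running the pull
$ journalctl --user -fu podman.service

$ # or stop the socket-activated service and run the API service by hand with debug logging
$ systemctl --user stop podman.service
$ podman system service --log-level=debug --time=0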

xinredhat commented 3 years ago

@vrothberg I saw nothing in another terminal:

$ podman images                                                                                                                                                                                                         
REPOSITORY  TAG     IMAGE ID  CREATED  SIZE

The debug logs from the podman server:

$ vagrant ssh
Last login: Mon Nov 16 11:18:48 2020 from 10.0.2.2
[vagrant@ibm-p8-kvm-03-guest-02 ~]$ podman info --debug
host:
  arch: amd64
  buildahVersion: 1.16.1
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.0.21-2.fc32.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.21, commit: 81d18b6c3ffc266abdef7ca94c1450e669a6a388'
  cpus: 1
  distribution:
    distribution: fedora
    version: "32"
  eventLogger: journald
  hostname: ibm-p8-kvm-03-guest-02.virt.pnr.lab.eng.rdu2.redhat.com
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 5.6.6-300.fc32.x86_64
  linkmode: dynamic
  memFree: 114708480
  memTotal: 1020424192
  ociRuntime:
    name: crun
    package: crun-0.15.1-1.fc32.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.15.1
      commit: eb0145e5ad4d8207e84a327248af76663d4e50dd
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  rootless: true
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns-1.1.4-1.fc32.x86_64
    version: |-
      slirp4netns version 1.1.4
      commit: b66ffa8e262507e37fca689822d23430f3357fe8
      libslirp: 4.3.1
      SLIRP_CONFIG_VERSION_MAX: 2
  swapFree: 0
  swapTotal: 0
  uptime: 72h 17m 32.45s (Approximately 3.00 days)
registries:
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
store:
  configFile: /home/vagrant/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-1.2.0-1.fc32.x86_64
      Version: |-
        fusermount3 version: 3.9.1
        fuse-overlayfs: version 1.1.0
        FUSE library version 3.9.1
        using FUSE kernel interface version 7.31
  graphRoot: /home/vagrant/.local/share/containers/storage
  graphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  imageStore:
    number: 0
  runRoot: /run/user/1000/containers
  volumePath: /home/vagrant/.local/share/containers/storage/volumes
version:
  APIVersion: 2.0.0
  Built: 1601494271
  BuiltTime: Wed Sep 30 19:31:11 2020
  GitCommit: ""
  GoVersion: go1.14.9
  OsArch: linux/amd64
  Version: 2.1.1

mtrmac commented 3 years ago

This might require a (Golang) stack trace, or possibly even a system call trace. Apparently sending SIGQUIT to the process could trigger an abort with a stack dump; that would be a start.
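
(For the record, a rough way to get that Go stack dump from the stuck process; the grep pattern is only a guess at the process name, and <PID> is a placeholder:)

$ # on the machine where the stuck process runs
$ pgrep -af podman    # find the PID of the hung podman process
$ kill -QUIT <PID>    # the Go runtime prints all goroutine stacks and aborts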

github-actions[bot] commented 3 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 3 years ago

@XinRedhat Could you give @mtrmac the trace information?

xinredhat commented 3 years ago

@rhatdan could you please let me know how to print out the trace information?

rhatdan commented 3 years ago

@mtrmac ^^

mtrmac commented 3 years ago

Apparently sending SIGQUIT to the process could trigger an abort with a stack dump; that would be a start.

Also, see strace / dtruss.


Wait, this is a Mac remote client to a Linux server? In that case I have no idea how to trace the server/client configuration to determine which one has hung; that's a Podman question.
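
(To narrow down which side is hanging, a rough approach; the PIDs are placeholders and the unit/process names are assumptions about this setup:)

$ # inside the Vagrant VM: system-call trace of the server process handling the pull
$ pgrep -af podman
$ sudo strace -f -p <SERVER_PID> -o /tmp/podman-server.strace

$ # on the Mac: trace the remote client (dtruss needs root and may require relaxing SIP)
$ sudo dtruss -f -p <CLIENT_PID>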

vrothberg commented 3 years ago

@XinRedhat, can you try with a more recent version of Podman (e.g., v2.2.1)?
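
(If it helps, upgrading both sides would look roughly like this; the Homebrew client install is an assumption, since the report does not say how the Mac client was installed:)

$ # macOS client (assuming it came from Homebrew)
$ brew upgrade podman

$ # Fedora 32 server inside the Vagrant VM
$ vagrant ssh -c 'sudo dnf update -y podman'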

vrothberg commented 3 years ago

Closing. Feel free to reopen when we have more information.