containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

ansible + podman not working #4813

Closed dschier-wtd closed 4 years ago

dschier-wtd commented 4 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

I am using ansible to interact with podman via the ansible podman connector. When using a rootless container, I am not able to execute any command via ansible and get the error shown below. The command that should be executed is a simple cat /etc/hostname.

This may not be 100% related to podman, but maybe to the ansible code (podman connector), since podman cp works in general.

Errors:

Error: cannot copy into running rootless container with pause set - pass --pause=false to force copying

Steps to reproduce the issue:

  1. Start a container as a regular (non-root) user
podman run -d --rm --name instance fedora:31 sleep 300
  2. Run an ansible command against it
ansible instance -c podman -m setup -i inventory

instance = name of the container
inventory file = a plain file with the word instance in it

Describe the results you received:

Error: cannot copy into running rootless container with pause set - pass --pause=false to force copying

Describe the results you expected:

Either a working execution or some hints on how to manipulate the "pause" behaviour.

Output of podman version:

Version:            1.6.2
RemoteAPI Version:  1
Go Version:         go1.13.1
OS/Arch:            linux/amd64

Output of podman info --debug:

debug:
  compiler: gc
  git commit: ""
  go version: go1.13.1
  podman version: 1.6.2
host:
  BuildahVersion: 1.11.3
  CgroupVersion: v2
  Conmon:
    package: conmon-2.0.2-1.fc31.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.2, commit: 186a550ba0866ce799d74006dab97969a2107979'
  Distribution:
    distribution: fedora
    version: "31"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  MemFree: 9151815680
  MemTotal: 16602112000
  OCIRuntime:
    name: crun
    package: crun-0.10.6-1.fc31.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.10.6
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  SwapFree: 8371826688
  SwapTotal: 8371826688
  arch: amd64
  cpus: 8
  eventlogger: journald
  hostname: nb01
  kernel: 5.4.7-200.fc31.x86_64
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: slirp4netns-0.4.0-20.1.dev.gitbbd6f25.fc31.x86_64
    Version: |-
      slirp4netns version 0.4.0-beta.3+dev
      commit: bbd6f25c70d5db2a1cd3bfb0416a8db99a75ed7e
  uptime: 21h 3m 30.47s (Approximately 0.88 days)
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - quay.io
store:
  ConfigFile: /var/home/dschier/.config/containers/storage.conf
  ContainerStore:
    number: 4
  GraphDriverName: overlay
  GraphOptions:
    overlay.mount_program:
      Executable: /usr/bin/fuse-overlayfs
      Package: fuse-overlayfs-0.7.2-2.fc31.x86_64
      Version: |-
        fusermount3 version: 3.6.2
        fuse-overlayfs: version 0.7.2
        FUSE library version 3.6.2
        using FUSE kernel interface version 7.29
  GraphRoot: /var/home/dschier/.local/share/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 10
  RunRoot: /run/user/1000
  VolumePath: /var/home/dschier/.local/share/containers/storage/volumes

Package info (e.g. output of rpm -q podman or apt list podman):

podman-manpages-1.6.2-2.fc31.noarch

dschier-wtd commented 4 years ago

I have also opened a bug for the ansible devs

https://github.com/ansible/ansible/issues/66263

sshnaidm commented 4 years ago

@daniel-wtd I think it should have been fixed in https://github.com/containers/libpod/commit/d40b450afdc9784a3dcf0d5b95712f4ad8a46cc0. I can't currently reproduce it; for me it works with the same podman and ansible versions. Our builds differ a little (https://linediff.com/?id=5e15a11d687f4b2e6e8b4567), but the difference doesn't seem critical. I will look again at what could be wrong.

sshnaidm commented 4 years ago

@daniel-wtd please paste your "ansible --version"

dschier-wtd commented 4 years ago

Thanks for having a look. Below you can find my ansible version

ansible 2.9.2
  config file = None
  configured module search path = ['/var/home/dschier/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /var/home/dschier/.venv-python3/lib64/python3.7/site-packages/ansible
  executable location = /var/home/dschier/.venv-python3/bin/ansible
  python version = 3.7.5 (default, Dec 15 2019, 17:54:26) [GCC 9.2.1 20190827 (Red Hat 9.2.1-1)]

sshnaidm commented 4 years ago

@daniel-wtd thanks, mine is the same. Could you try it on a fresh system? I'll try to do the same today on f31.

dschier-wtd commented 4 years ago

Yep, I will do this this evening (CET) and report back to you.

dschier-wtd commented 4 years ago

I tested it on 2 fresh installations a minute ago:

and got the same result.

dschier-wtd commented 4 years ago

minor addition: on the same host running the exact same stuff as root is working perfectly fine.

sshnaidm commented 4 years ago

OK, I reproduced it on the fresh fedora31 system:

[fedora@localhost ~]$ podman run -d --rm --name instance fedora:31 sleep 1h
9f894952cb740f2d7aae0ccb5e4e688dfe256f04835693acbbf86e103c3bc716
[fedora@localhost ~]$ podman cp /home/fedora/myfile instance:/root/
Error: cannot copy into running rootless container with pause set - pass --pause=false to force copying
[fedora@localhost ~]$ podman cp --pause=false /home/fedora/myfile instance:/root/
[fedora@localhost ~]$ podman version
Version:            1.6.2
RemoteAPI Version:  1
Go Version:         go1.13.1
OS/Arch:            linux/amd64

@mheon @rhatdan any ideas why it works on fedora30 with https://github.com/containers/libpod/commit/d40b450afdc9784a3dcf0d5b95712f4ad8a46cc0 but doesn't work on fedora31? Both run the same podman version, 1.6.2.

mheon commented 4 years ago

CGroups v2, maybe? Pause requires CGroups which for rootless containers are only available on v2.
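
For anyone who wants to check which cgroup version a host is running, here is a minimal Python sketch. It relies on the fact that a cgroup v2 (unified) mount exposes a cgroup.controllers file at its root, while v1 hierarchies do not. The function name uses_cgroup_v2 and the default path /sys/fs/cgroup are illustrative assumptions and are not taken from podman's code; podman info already reports the same information as CgroupVersion.

```python
from pathlib import Path


def uses_cgroup_v2(cgroup_root: str = "/sys/fs/cgroup") -> bool:
    """Return True when cgroup_root looks like a cgroup v2 (unified) mount.

    The unified hierarchy has a cgroup.controllers file at the root of
    the mount; cgroup v1 hierarchies do not have this file.
    """
    return (Path(cgroup_root) / "cgroup.controllers").is_file()


if __name__ == "__main__":
    # Hypothetical usage: report the layout found at the default path.
    print("cgroup v2" if uses_cgroup_v2() else "cgroup v1 or hybrid")
```

On a Fedora 31 host like the one in this report, the check would be expected to return True, but that is an assumption about the default setup rather than something verified here.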

sshnaidm commented 4 years ago

@mheon yeah, it seems to be cgroups: it works fine with cgroups v1 but doesn't work with v2. I'm still not sure why it ignores the default setting from https://github.com/containers/libpod/commit/d40b450afdc9784a3dcf0d5b95712f4ad8a46cc0. Should we set "--pause false" explicitly in the ansible plugin? https://github.com/ansible/ansible/blob/devel/lib/ansible/plugins/connection/podman.py#L165-L166
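
A rough sketch of what passing the flag explicitly could look like when a caller assembles the podman cp command line. This is not the actual code of the ansible connection plugin; build_podman_cp_cmd and its parameters are hypothetical names used only for illustration. The only podman detail it relies on is the --pause=false flag named in the error message above.

```python
from typing import List, Optional


def build_podman_cp_cmd(src: str, container: str, dest: str,
                        pause: Optional[bool] = None,
                        podman: str = "podman") -> List[str]:
    """Build an argument list for copying a local file into a container.

    pause=None leaves the flag out, so podman applies its own default.
    pause=True or pause=False passes --pause=true or --pause=false.
    """
    cmd = [podman, "cp"]
    if pause is not None:
        cmd.append("--pause=" + ("true" if pause else "false"))
    cmd += [src, f"{container}:{dest}"]
    return cmd


if __name__ == "__main__":
    # Mirrors the manual workaround shown earlier in this thread.
    print(build_podman_cp_cmd("/home/fedora/myfile", "instance", "/root/",
                              pause=False))
```

Keeping the flag optional means a caller can stop passing it once podman's default behaves as expected, which matters if hard-coding it turns out to be a problem on other podman versions.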

mheon commented 4 years ago

Might be a good idea.

mheon commented 4 years ago

@giuseppe We expect pause to work rootless on v2, correct?

giuseppe commented 4 years ago

> @giuseppe We expect pause to work rootless on v2, correct?

yes, pause should work on cgroup v2 as rootless as well

sshnaidm commented 4 years ago

As far as I can see in the code, pause=false is the default only with cgroups v1, and that is what causes problems on fedora31. Is there a specific reason for it? https://github.com/containers/libpod/blob/f85b3a01f050e4f8af8471f396013f18647d9241/cmd/podman/cp.go#L496 @giuseppe @mheon ^^

mheon commented 4 years ago

It sounds like we expect pause to work for CGroups v2, so the fact that it doesn't is probably a bug.

sshnaidm commented 4 years ago

@mheon so should we stop requiring pause=False on cgroups v2 for rootless containers? If I now set --pause false explicitly in the ansible module, will it break some higher/future versions? It would maybe be better to have it solved in podman.

giuseppe commented 4 years ago

Whether pause can be performed in rootless mode depends on whether we are using cgroups v2 and the systemd manager, and on whether the freezer controller is available (it wasn't on older kernels for cgroup v2). The last condition can fail for root as well.

I think we should just drop the check and let the OCI runtime complain if pause cannot be performed.
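
To make the conditions above easier to follow, here is a small Python sketch that encodes them as a truth function. It only restates this comment and is not podman's implementation; the function and parameter names are hypothetical, and the caller is assumed to have determined each input by other means.

```python
def pause_expected_to_work(rootless: bool, cgroups_v2: bool,
                           systemd_manager: bool,
                           freezer_available: bool) -> bool:
    """Restate the conditions under which pausing a container should work.

    The freezer controller is needed in every case, including root.
    Rootless containers additionally need cgroups v2 and the systemd
    cgroup manager.
    """
    if not freezer_available:
        return False
    if rootless:
        return cgroups_v2 and systemd_manager
    return True


if __name__ == "__main__":
    # Rootless on cgroups v1: pause is not expected to work.
    print(pause_expected_to_work(True, False, True, True))
```

Dropping the check, as proposed here, would mean podman no longer tries to predict this outcome and simply reports whatever the OCI runtime returns.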

giuseppe commented 4 years ago

PR here: https://github.com/containers/libpod/pull/4828

sshnaidm commented 4 years ago

@giuseppe thanks! Will it be backported to 1.6.2 or only land in a later release? If I change the ansible plugin to use "--pause false" now, might it break with a later podman? Or am I missing something?

giuseppe commented 4 years ago

> @giuseppe thanks! Will it be ported to 1.6.2 or later? Because if I change ansible plugin to use "--pause false" now, it might be broken for later podman? Or I miss something?

I don't think this will be backported, as it is not a security problem. @mheon or could we backport it?

mheon commented 4 years ago

Probably not. Anything not long term should move rapidly to 1.7.0, while long-term releases are updated only on a critical security or bugfix basis.

rhatdan commented 4 years ago

Yes, no backporting unless absolutely necessary.

samdoran commented 4 years ago

I merged the fix for this in Ansible (ansible/ansible#66583) and opened a backport to stable-2.9. It should be in the next 2.9 release.

dschier-wtd commented 4 years ago

with 1.8.0 on fedora, everything is working fine :)

rhatdan commented 4 years ago

Awesome.