
Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Containers of new user cannot nslookup other containers after joining their networks #21360

Closed mzarnowski closed 9 months ago

mzarnowski commented 9 months ago

Issue Description

I have created a dedicated user for running podman containers; my setup is captured in the playbook below.

When I connect the gateway container to the internal container's network using my default ansible user, everything works. When I run the same ubuntu containers and networks as the dedicated user, nslookup fails with:

;; communications error to 10.89.1.1#53: connection refused
;; communications error to 10.89.1.1#53: connection refused
;; communications error to 10.89.1.1#53: connection refused
;; communications error to 10.89.0.1#53: connection refused
;; communications error to 10.89.1.1#53: connection refused
;; communications error to 10.89.0.1#53: connection refused
;; communications error to 10.89.1.1#53: connection refused
;; communications error to 10.89.0.1#53: connection refused
Server:         192.168.0.1
Address:        192.168.0.1#53

Non-authoritative answer:
;; communications error to 10.89.1.1#53: connection refused
;; communications error to 10.89.1.1#53: timed out
;; communications error to 10.89.1.1#53: connection refused
;; communications error to 10.89.0.1#53: connection refused
*** Can't find internal-container: No answer

The message varies depending on the container; e.g., on one based on `alpine`, it is:

nslookup: write to '10.89.1.1': Connection refused
nslookup: write to '10.89.0.1': Connection refused
Server:         192.168.0.1
Address:        192.168.0.1:53

** server can't find internal-container.dns.podman: NXDOMAIN

** server can't find internal-container.home: NXDOMAIN

** server can't find internal-container.dns.podman: NXDOMAIN

** server can't find internal-container.home: NXDOMAIN

I am running on RaspberryPi: Linux hostname 6.1.0-rpi7-rpi-v8 #1 SMP PREEMPT Debian 1:6.1.63-1+rpt1 (2023-11-24) aarch64 GNU/Linux

```
$ podman version
Client:       Podman Engine
Version:      4.3.1
API Version:  4.3.1
Go Version:   go1.19.8
Built:        Thu Jan  1 01:00:00 1970
OS/Arch:      linux/arm64
```

### Steps to reproduce the issue

I have captured the issue in the following playbook. By switching the `user` variable from `PODMAN_USER` to `ansible_user`, I can observe the two different outcomes.
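
For readers not using Ansible, the same reproduction can be sketched directly with podman commands. This is a rough sketch, not a verified transcript: `poduser`, the network names, and the image follow the playbook below, and `dnsutils` must first be installed inside the gateway container for `nslookup` to exist.

```shell
# Run these as the dedicated user, e.g. via: sudo machinectl shell poduser@
podman network create gateway-net                 # external (bridge) network for the gateway
podman network create --internal internal-net     # internal-only network
podman run -d -t --name gateway-container --network gateway-net \
    --cap-add NET_RAW docker.io/library/ubuntu:latest
podman run -d -t --name internal-container --hostname internal-container \
    --network internal-net docker.io/library/ubuntu:latest
podman network connect internal-net gateway-container
# after installing dnsutils in gateway-container:
podman exec gateway-container nslookup internal-container
```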

```yml
---
- name: Check if dedicated user's container can use dns to reach other containers
  hosts: dedicated_host
  vars:
    PODMAN_USER: poduser
    user: "{{ PODMAN_USER }}"
    # user: "{{ ansible_user }}"
  tasks:
    - name: Setup Podman
      block:
        - name: Create Podman user
          register: podman_user
          become: yes
          user:
            name: "{{ PODMAN_USER }}"
            create_home: yes
            password: "!" # disable password
            password_lock: yes
            state: present

        - name: Install Podman
          become: yes
          register: podman
          package:
            name: podman
            state: present
            update_cache: yes

    - name: Setup Gateway Container
      block:
        - name: Create Gateway Network
          # Default network doesn't allow us to connect to other containers' networks
          become: yes
          become_user: "{{ user }}"
          containers.podman.podman_network:
            name: gateway-net
            internal: false
            state: present

        - name: Start Gateway Container
          become: yes
          become_user: "{{ user }}"
          containers.podman.podman_container:
            state: started
            image: "docker.io/library/ubuntu:latest"

            name:    gateway-container
            network: gateway-net
            cap_add: NET_RAW

            # keep the container running
            detach: true
            tty: true

    - name: Setup Internal Container
      block:
        - name: Create Internal Network
          # Default network doesn't allow us to connect to other containers' networks
          become: yes
          become_user: "{{ user }}"
          containers.podman.podman_network:
            name: internal-net
            internal: true
            state: present

        - name: Start Internal Container
          become: yes
          become_user: "{{ user }}"
          register: internal_container # for checking if it is reachable via IP
          containers.podman.podman_container:
            state: started
            image: "docker.io/library/ubuntu:latest"

            name:     internal-container
            network:  internal-net
            hostname: internal-container

            # keep the container running
            detach: true
            tty: true

    - name: Attach
      block:
        - name: Attach Gateway Container to Internal Network
          become: yes
          become_user: "{{ user }}"

          command:
            cmd: "podman network connect internal-net gateway-container"

        - name: Install ping and nslookup
          become: yes
          become_user: "{{ user }}"

          containers.podman.podman_container_exec:
            name: gateway-container
            command: "{{ item }}"
          loop:
            - "apt-get update"
            - "apt-get install iputils-ping dnsutils -y"

        - name: ping Internal Container
          become: yes
          become_user: "{{ user }}"
          register: ping

          containers.podman.podman_container_exec:                 
            name: gateway-container
            command: "ping -c 5 {{ internal_container.container.NetworkSettings.Networks['internal-net'].IPAddress }}"

        - name: nslookup Internal Container
          become: yes
          become_user: "{{ user }}"
          register: nslookup

          containers.podman.podman_container_exec:                 
            name: gateway-container
            command: "nslookup {{ internal_container.container.Config.Hostname }}"

        - name: Print output
          debug:
            msg: "{{ item }}"
          loop:
            - "{{ ping.podman_command }}"
            - "{{ ping.stdout_lines | join('\n') }}"
            - "{{ nslookup.podman_command }}"
            - "{{ nslookup.stdout_lines | join('\n') }}"
```

Describe the results you received

When running as ansible_user I get:

$ ping 10.89.3.2
PING 10.89.3.2 (10.89.3.2) 56(84) bytes of data.
64 bytes from 10.89.3.2: icmp_seq=1 ttl=64 time=0.167 ms
64 bytes from 10.89.3.2: icmp_seq=2 ttl=64 time=0.101 ms
64 bytes from 10.89.3.2: icmp_seq=3 ttl=64 time=0.059 ms
64 bytes from 10.89.3.2: icmp_seq=4 ttl=64 time=0.056 ms
64 bytes from 10.89.3.2: icmp_seq=5 ttl=64 time=0.136 ms

--- 10.89.3.2 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4098ms
rtt min/avg/max/mdev = 0.056/0.103/0.167/0.043 ms
$ nslookup internal-container
Server:           10.89.3.1
Address:        10.89.3.1#53

Non-authoritative answer:
Name:   internal-container.dns.podman
Address: 10.89.3.2

When running as PODMAN_USER I get:

$ ping 10.89.1.2
PING 10.89.1.2 (10.89.1.2) 56(84) bytes of data.
64 bytes from 10.89.1.2: icmp_seq=1 ttl=64 time=0.193 ms
64 bytes from 10.89.1.2: icmp_seq=2 ttl=64 time=0.099 ms
64 bytes from 10.89.1.2: icmp_seq=3 ttl=64 time=0.067 ms
64 bytes from 10.89.1.2: icmp_seq=4 ttl=64 time=0.088 ms
64 bytes from 10.89.1.2: icmp_seq=5 ttl=64 time=0.117 ms

--- 10.89.1.2 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4085ms
rtt min/avg/max/mdev = 0.067/0.112/0.193/0.043 ms
$ nslookup internal-container
;; communications error to 10.89.1.1#53: connection refused
;; communications error to 10.89.1.1#53: connection refused
;; communications error to 10.89.1.1#53: connection refused
;; communications error to 10.89.0.1#53: connection refused
;; communications error to 10.89.1.1#53: connection refused
;; communications error to 10.89.0.1#53: connection refused
;; communications error to 10.89.1.1#53: connection refused
;; communications error to 10.89.0.1#53: connection refused
Server:         192.168.0.1
Address:        192.168.0.1#53

Non-authoritative answer:
;; communications error to 10.89.1.1#53: connection refused
;; communications error to 10.89.1.1#53: timed out
;; communications error to 10.89.1.1#53: connection refused
;; communications error to 10.89.0.1#53: connection refused
*** Can't find internal-container: No answer

Describe the results you expected

I would like the PODMAN_USER's output to be like the ansible_user's :)

I expected the gateway container to be able to nslookup the internal-container

podman info output

host:
  arch: arm64
  buildahVersion: 1.28.2
  cgroupControllers:
  - cpu
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon_2.1.6+ds1-1_arm64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.6, commit: unknown'
  cpuUtilization:
    idlePercent: 99.97
    systemPercent: 0.02
    userPercent: 0.01
  cpus: 4
  distribution:
    codename: bookworm
    distribution: debian
    version: "12"
  eventLogger: journald
  hostname: hostname
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
  kernel: 6.1.0-rpi7-rpi-v8
  linkmode: dynamic
  logDriver: journald
  memFree: 954097664
  memTotal: 1937920000
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: crun_1.8.1-1+deb12u1_arm64
    path: /usr/bin/crun
    version: |-
      crun version 1.8.1
      commit: f8a096be060b22ccd3d5f3ebe44108517fbf6c30
      rundir: /run/user/1000/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  os: linux
  remoteSocket:
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: slirp4netns_1.2.0-1_arm64
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.4
  swapFree: 104853504
  swapTotal: 104853504
  uptime: 71h 31m 21.00s (Approximately 2.96 days)
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries: {}
store:
  configFile: /home/myuser/.config/containers/storage.conf
  containerStore:
    number: 2
    paused: 0
    running: 0
    stopped: 2
  graphDriverName: vfs
  graphOptions: {}
  graphRoot: /home/myuser/.local/share/containers/storage
  graphRootAllocated: 61715562496
  graphRootUsed: 14081503232
  graphStatus: {}
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 5
  runRoot: /run/user/1000/containers
  volumePath: /home/myuser/.local/share/containers/storage/volumes
version:
  APIVersion: 4.3.1
  Built: 0
  BuiltTime: Thu Jan  1 01:00:00 1970
  GitCommit: ""
  GoVersion: go1.19.8
  Os: linux
  OsArch: linux/arm64
  Version: 4.3.1

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

No

Additional environment details

No response

Additional information

No response

mzarnowski commented 9 months ago

It turns out that a user created this way is not a lingering one, so the DNS service is not running between ssh sessions. Once I added the following task, it started working:

```yml
- name: Enable lingering
  # Required for dns service to keep running without an active ssh session
  command: "loginctl enable-linger {{ PODMAN_USER }}"
```

Feel free to close if not actionable, although it would be nice to include this requirement in some FAQ.
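
For anyone hitting the same symptom, lingering can also be checked and enabled directly from a shell. A small sketch; `poduser` is the example user name from the playbook above:

```shell
# Check whether lingering is enabled for the rootless user
# (prints Linger=yes or Linger=no)
loginctl show-user poduser --property=Linger

# Enable lingering so the user's services (including the rootless
# DNS resolver) keep running after the ssh session ends
sudo loginctl enable-linger poduser
```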

Luap99 commented 9 months ago

Maybe something changed but I would expect the containers to die as well when you log out: https://github.com/containers/podman/blob/main/troubleshooting.md#17-rootless-containers-exit-once-the-user-session-exits

So yes, I would say this behaviour is normal: you have to stay logged in or use user lingering, otherwise systemd just kills the user processes.