containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0
24.02k stars 2.43k forks source link

stdout missing when running podman from systemd #4720

Closed bfallik closed 4 years ago

bfallik commented 4 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

I was experimenting converting an existing systemd+podman container to the approach described in Running containers with Podman and shareable systemd services.

I can start and stop the container and confirm the service is behaving as normal except for one gap. Logs written to STDOUT within the container are not being propagated to journald on the host.

Steps to reproduce the issue:

  1. systemctl start xxx.service

  2. journalctl -u xxx.service -n 100

Describe the results you received:

Dec 17 00:41:14 groot.lan.bfallik.net systemd[1]: Starting CoreDNS container...
Dec 17 00:41:14 groot.lan.bfallik.net podman[67500]: 2019-12-17 00:41:14.473225778 -0500 EST m=+0.139632601 container create e409db10a835719c26244d51996357c564128feddcfc633549d88401e91a9a60 (image=docker.io/coredns/coredns:latest, name=brave_hofstadter)
Dec 17 00:41:14 groot.lan.bfallik.net podman[67500]: 2019-12-17 00:41:14.685247813 -0500 EST m=+0.351654594 container init e409db10a835719c26244d51996357c564128feddcfc633549d88401e91a9a60 (image=docker.io/coredns/coredns:latest, name=brave_hofstadter)
Dec 17 00:41:14 groot.lan.bfallik.net podman[67500]: 2019-12-17 00:41:14.701066617 -0500 EST m=+0.367473397 container start e409db10a835719c26244d51996357c564128feddcfc633549d88401e91a9a60 (image=docker.io/coredns/coredns:latest, name=brave_hofstadter)
Dec 17 00:41:14 groot.lan.bfallik.net podman[67500]: e409db10a835719c26244d51996357c564128feddcfc633549d88401e91a9a60
Dec 17 00:41:14 groot.lan.bfallik.net systemd[1]: Started CoreDNS container.

Describe the results you expected:

Dec 16 20:37:06 xxx podman[1335]: [INFO] 10.100.10.149:14897 - 30903 "A IN prometheus.io. udp 31 false 512" NOERROR qr,rd,ra 89 0.104431243s

I can see the logs in podman:

# podman logs e409db10a835
.:53
CoreDNS-1.6.5
linux/amd64, go1.13.4, c2fd1b2
[INFO] 10.100.10.5:53466 - 20110 "A IN google.com. udp 51 false 4096" NOERROR qr,rd,ra 65 0.134158373s

Additional information you deem important (e.g. issue happens only occasionally):

I don't know if this is related by the article claims that podman starts several processes within the cgroup, but systemctl status only shows the conmon main process.

# systemctl status coredns-container.service
● coredns-container.service - CoreDNS container
   Loaded: loaded (/etc/systemd/system/coredns-container.service; enabled; vend>
   Active: active (running) since Tue 2019-12-17 00:41:14 EST; 1min 47s ago
 Main PID: 67591 (conmon)
    Tasks: 0 (limit: 9009)
   Memory: 2.5M
      CPU: 387ms
   CGroup: /system.slice/coredns-container.service
           ‣ 67591 /usr/bin/conmon --api-version 1 -s -c e409db10a835719c26244d>

Dec 17 00:41:14 xxx systemd[1]: Starting CoreDNS container...
Dec 17 00:41:14 xxx podman[67500]: 2019-12-17 00:41:14.473225>
Dec 17 00:41:14 xxx podman[67500]: 2019-12-17 00:41:14.685247>
Dec 17 00:41:14 xxx podman[67500]: 2019-12-17 00:41:14.701066>
Dec 17 00:41:14 xxx podman[67500]: e409db10a835719c26244d5199>
Dec 17 00:41:14 xxx systemd[1]: Started CoreDNS container.

Below is my unit file

[Unit]
Description=CoreDNS container

[Service]
Restart=on-failure
ExecStartPre=/usr/bin/rm -f /%t/%n-pid /%t/%n-cid
ExecStart=/usr/bin/podman run --conmon-pidfile /%t/%n-pid --cidfile /%t/%n-cid --volume=/var/lib/coredns:/root -p 10.100.10.5:53:53/udp --privileged --workdir /root --detach coredns/coredns -conf /root/Corefile
ExecStop=/usr/bin/sh -c "/usr/bin/podman rm -f `cat /%t/%n-cid`"
KillMode=none
Type=forking
PIDFile=/%t/%n-pid

[Install]
WantedBy=multi-user.target

Output of podman version:

# podman version
Version:            1.6.2
RemoteAPI Version:  1
Go Version:         go1.13.1
OS/Arch:            linux/amd64

Output of podman info --debug:

# podman info --debug
debug:
  compiler: gc
  git commit: ""
  go version: go1.13.1
  podman version: 1.6.2
host:
  BuildahVersion: 1.11.3
  CgroupVersion: v2
  Conmon:
    package: conmon-2.0.2-1.fc31.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.2, commit: 186a550ba0866ce799d74006dab97969a2107979'
  Distribution:
    distribution: fedora
    version: "31"
  MemFree: 2897375232
  MemTotal: 7912034304
  OCIRuntime:
    name: crun
    package: crun-0.10.6-1.fc31.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 0.10.6
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +YAJL
  SwapFree: 8064069632
  SwapTotal: 8065642496
  arch: amd64
  cpus: 4
  eventlogger: journald
  hostname: groot.lan.bfallik.net
  kernel: 5.3.15-300.fc31.x86_64
  os: linux
  rootless: false
  uptime: 60h 34m 16.59s (Approximately 2.50 days)
registries:
  blocked: null
  insecure: null
  search:
  - docker.io
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - registry.centos.org
  - quay.io
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 1
  GraphDriverName: overlay
  GraphOptions:
    overlay.mountopt: nodev,metacopy=on
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "false"
    Supports d_type: "true"
    Using metacopy: "true"
  ImageStore:
    number: 3
  RunRoot: /var/run/containers/storage
  VolumePath: /var/lib/containers/storage/volumes

Package info (e.g. output of rpm -q podman or apt list podman):

podman-1.6.2-2.fc31.x86_64

Additional environment details (AWS, VirtualBox, physical, etc.):

running on physical hardware

bfallik commented 4 years ago

After reading the escape-from-the-dock blog post I was inspired to try dropping the --detach podman flag and changing the unit file to type=simple and, voila, logging is back.

I can understand why type=forking may be incorrect for coredns since it probably doesn't fork but I'm confused how the top example from the blog post behaves differently. According to strace top doesn't fork, at least as tested on my Fedora host.

mheon commented 4 years ago

I don't actually know how type=forking works with a pidfile specified - but I know that the reason we moved away was because tracking podman run is not correct behavior, and can lead to VERY bad results.

Our containers are not direct children of Podman. During container startup, we launch an instance of Conmon (an extremely lightweight monitor process) which double-forks to daemonize and then launches the container as its own child. As a result of this, once the container is running, you can hit podman run with SIGKILL, and it will go away - but the container will still be running fine, in the background. Systemd doesn't know that, and also can't see it with cgroups (the container makes its own cgroups), so it interpreted the podman run process dying as the container dying, when it was still running - the states of Podman and systemd desynced, and nothing worked properly afterwards without manual intervention. The use of a pidfile might be able resolve this, but I'm not sure it will - we've never tested it that way.

bfallik commented 4 years ago

Interesting... thanks for sharing that.

github-actions[bot] commented 4 years ago

A friendly reminder that this issue had no activity for 30 days.

rhatdan commented 4 years ago

I think @mheon answered this one, Closing. Reopen if I am mistaken.

scolby33 commented 2 years ago

What is the current best practice for getting container output in the journal when running rootless containers as user units? With podman 3.0.1 on Debian 11, I see

In the latter case, I tried killing the podman run process and everything appeared to shut down cleanly (exit 0/SUCCESS in systemd and no pod sticking around either). The only problem was that the unit didn't restart because it's set to on-failure, but that's easily fixable.

mheon commented 2 years ago

@vrothberg PTAL

vrothberg commented 2 years ago

I do not recommend using type=simple for the reasons @mheon has mentioned above.

What is the current best practice for getting container output in the journal when running rootless containers as user units?

I do see the stdout in the journal. For instance, a container that does nothing more than echo 123:

[...] Jan 10 14:52:09 nebuchadnezzar conmon[163851]: 123 [...]

Are you not seeing that on your machine?

scolby33 commented 2 years ago

I was definitely not seeing output in the journal from configurations like those mentioned above. However, I have since changed my units to be like:

[Unit]
...

[Service]
...
ExecStart=/usr/bin/podman run --cidfile=... --cgroups=no-conmon --rm --pod-id-file ... --sdnotify=conmon -d --replace --log-driver=journald --name... ...
...
Type=notify
NotifyAccess=all

...

Things have been working as I would like. In particular, I think the --log-driver=journald was the solution here. Since I've kept -d and Type=notify I think this doesn't fall afoul of any of the problems mentioned above. Does this make sense or is there something incorrect with this configuration as well?