containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Podman fail to autostart containers through quadlet/systemd, works when launched manually, error with pasta #22197

Open Froggy232 opened 6 months ago

Froggy232 commented 6 months ago

Issue Description

Hi, since the upgrade to Fedora Silverblue 40 / Podman 5, systemd fails to launch containers at boot. If I try to launch them manually through systemctl --user start container.service, it works as expected. Thank you!

Steps to reproduce the issue

  1. Automate container management through quadlet (~/.config/containers/systemd files)
  2. Restart the server and see that the containers failed to launch

Describe the results you received

Containers don't launch at boot and need to be started manually.

Describe the results you expected

Containers should start at boot.

podman info output

host:
  arch: amd64
  buildahVersion: 1.35.1
  cgroupControllers:
  - cpu
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.8-4.fc40.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.8, commit: '
  cpuUtilization:
    idlePercent: 99.37
    systemPercent: 0.21
    userPercent: 0.42
  cpus: 32
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: silverblue
    version: "40"
  eventLogger: journald
  freeLocks: 2047
  hostname: homeserver
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1020
      size: 1
    - container_id: 1
      host_id: 1703936
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1020
      size: 1
    - container_id: 1
      host_id: 1703936
      size: 65536
  kernel: 6.8.1-300.fc40.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 64334761984
  memTotal: 67334115328
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.10.0-1.fc40.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.10.0
    package: netavark-1.10.3-3.fc40.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.10.3
  ociRuntime:
    name: crun
    package: crun-1.14.4-1.fc40.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.14.4
      commit: a220ca661ce078f2c37b38c92e66cf66c012d9c1
      rundir: /run/user/1020/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240320.g71dd405-1.fc40.x86_64
    version: |
      pasta 0^20240320.g71dd405-1.fc40.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: false
    path: /run/user/1020/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 146028879872
  swapTotal: 146028879872
  uptime: 0h 14m 2.00s
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
  - quay.io
store:
  configFile: /var/srv/media-server/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /srv/media-server/.local/share/containers/storage
  graphRootAllocated: 3999065440256
  graphRootUsed: 1034920087552
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 14
  runRoot: /run/user/1020/containers
  transientStore: false
  volumePath: /var/srv/media-server/.local/share/containers/storage/volumes
version:
  APIVersion: 5.0.0
  Built: 1710806400
  BuiltTime: Tue Mar 19 01:00:00 2024
  GitCommit: ""
  GoVersion: go1.22.0
  Os: linux
  OsArch: linux/amd64
  Version: 5.0.0

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

No

Additional environment details

Fedora Silverblue 40 up-to-date

Additional information

Logs of a container :

mars 28 12:15:09 homeserver jellyfin[7039]: Error: pasta failed with exit code 1:
mars 28 12:15:09 homeserver jellyfin[7039]: External interface not usable

Luap99 commented 6 months ago

You have to make sure your network is fully set up before the unit is started.

rhatdan commented 6 months ago

This feels like it could be related to the same question in https://github.com/containers/podman/pull/22057

flyingfishflash commented 6 months ago

I have not been able to get a rootless user quadlet to wait for my network to be ready, even after adding

[Unit]
wants=nss-online.target
after=nss-online.target

No issues on 4.9.3

Luap99 commented 6 months ago

@flyingfishflash You cannot wait for system units from user units, see https://github.com/systemd/systemd/issues/3312

I wasn't aware that the user units start before the network is fully set up and that it causes such big trouble with pasta. Note you do not need to downgrade, you can just change the default back to slirp4netns in containers.conf, see the last part in the pasta section on https://blog.podman.io/2024/03/podman-5-0-breaking-changes-in-detail/

You could also do something like this https://github.com/containers/podman/issues/22190#issuecomment-2027257771

Of course none of this is a proper solution but I am sure we will find something to address this in a better way soon.
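
For reference, the containers.conf change mentioned above could be scripted roughly as follows. This is a minimal sketch: it assumes the [network] key default_rootless_network_cmd described in the linked Podman 5.0 blog post, and write_slirp_default is a hypothetical helper name, not something Podman ships. It writes into a directory you pass in (defaulting to the per-user config directory) and refuses to touch an existing file.

```sh
# write_slirp_default [DIR]
# Hypothetical helper: writes DIR/containers.conf so that rootless Podman
# uses slirp4netns instead of pasta. DIR defaults to the per-user
# containers config directory. Existing files are never overwritten.
write_slirp_default() {
    conf_dir="${1:-${XDG_CONFIG_HOME:-$HOME/.config}/containers}"
    conf_file="$conf_dir/containers.conf"
    if [ -e "$conf_file" ]; then
        echo "refusing to overwrite $conf_file; edit its [network] section by hand" >&2
        return 1
    fi
    mkdir -p "$conf_dir" || return 1
    cat > "$conf_file" <<'EOF'
[network]
default_rootless_network_cmd = "slirp4netns"
EOF
}
```

Calling write_slirp_default with no argument would target ~/.config/containers/containers.conf. Note that the podman info output above reports an empty slirp4netns executable, so the slirp4netns package would presumably have to be installed first.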

flyingfishflash commented 6 months ago

@Luap99 - thank you for this tip re containers.conf!

gdonval commented 5 months ago

You could also do something like this #22190 (comment)

No. It's as much of a bad practice today as it was 50 years ago.

Klowner commented 5 months ago

I ran into this issue today and finally learned that systemd user level units apparently can't depend on system level units (such as network-online.target)

I've managed a workaround that satisfies my desire to avoid arbitrary timeouts by creating a user-level network-online.service and network-online.target

# ~/.config/systemd/user/network-online.service
[Unit]
Description=User-level proxy to system-level network-online.target

[Service]
# systemd directive names are case-sensitive, so this must be "Type", not "type"
Type=oneshot
ExecStart=/bin/bash -c 'until systemctl --machine=%u@.host is-active network-online.target; do sleep 1; done'

[Install]
WantedBy=default.target

# ~/.config/systemd/user/network-online.target
[Unit]
Description=User-level network-online.target
Requires=network-online.service
Wants=network-online.service
After=network-online.service

Then in your quadlet units:

[Unit]
After=network-online.target

soiamsoNG commented 5 months ago

It seems to just work once you can ping an external IP (including the gateway IP).

djarbz commented 5 months ago

I'll share my workaround, but it might be a good idea to have a podman network --health command to verify by driver and network and such.

[Unit]
Description=Wait for network to be online via NetworkManager or Systemd-Networkd

[Service]
# `nm-online -s` waits until the point when NetworkManager logs
# "startup complete". That is when startup actions are settled and
# devices and profiles reached a conclusive activated or deactivated
# state. It depends on which profiles are configured to autoconnect and
# also depends on profile settings like ipv4.may-fail/ipv6.may-fail,
# which affect when a profile is considered fully activated.
# Check NetworkManager logs to find out why wait-online takes a certain
# time.

Type=oneshot
# At least one of these should work depending if using NetworkManager or Systemd-Networkd
ExecStart=/bin/bash -c ' \
    if command -v nm-online &>/dev/null; then \
        nm-online -s -q; \
    elif command -v /usr/lib/systemd/systemd-networkd-wait-online &>/dev/null; then \
        /usr/lib/systemd/systemd-networkd-wait-online; \
    else \
        echo "Error: Neither nm-online nor systemd-networkd-wait-online found."; \
        exit 1; \
    fi'
ExecStartPost=ip -br addr
RemainAfterExit=yes

# Set $NM_ONLINE_TIMEOUT variable for timeout in seconds.
# Edit with `systemctl edit <THIS SERVICE NAME>`.
#
# Note, this timeout should commonly not be reached. If your boot
# gets delayed too long, then the solution is usually not to decrease
# the timeout, but to fix your setup so that the connected state
# gets reached earlier.
Environment=NM_ONLINE_TIMEOUT=60

[Install]
WantedBy=default.target

secext2022 commented 3 months ago

Another workaround:

We can copy network-online.target from the system units to the user units, with a small modification, like this:

$ cat /etc/systemd/user/network-online.target
[Unit]
Description=Network online for systemd --user
Documentation=man:systemd.special(7)
Documentation=https://systemd.io/NETWORK_ONLINE
#After=network.target

$ cat /etc/systemd/user/systemd-networkd-wait-online.service
[Unit]
Description=Wait network online for systemd --user
Documentation=man:systemd-networkd-wait-online.service(8)
Before=network-online.target

[Service]
Type=oneshot
ExecStart=/usr/lib/systemd/systemd-networkd-wait-online
RemainAfterExit=yes

[Install]
WantedBy=network-online.target

Or you can put these files in ~/.config/systemd/user to apply them to only one user.

Then enable the service as a user:

$ systemctl --user enable systemd-networkd-wait-online.service

Finally, we can make podman wait for the network to be online, like this:

$ cat ~/.config/containers/systemd/my-app.container
[Unit]
Wants=network-online.target
After=network-online.target

reference link: https://unix.stackexchange.com/questions/216919/how-can-i-make-my-user-services-wait-till-the-network-is-online

WildPenquin commented 2 months ago

Hi,

Any idea for a workaround when using NetworkManager?

I tried to adapt @secext2022 's workaround, but the user service still "thinks" the Network is online approx. 7 seconds too early. I tried to change the parameter for nm-online by removing the -s, but the behavior is still the same.

dog /etc/systemd/user/network-online.target:

#  SPDX-License-Identifier: LGPL-2.1-or-later
#
#  This file is part of systemd.
#
#  systemd is free software; you can redistribute it and/or modify it
#  under the terms of the GNU Lesser General Public License as published by
#  the Free Software Foundation; either version 2.1 of the License, or
#  (at your option) any later version.

[Unit]
Description=Network is Online
Documentation=man:systemd.special(7)
Documentation=https://systemd.io/NETWORK_ONLINE
# After=network.target

/etc/systemd/user/NetworkManager-wait-online.service:

[Unit]
Description=Network Manager Wait Online for Users
Documentation=man:NetworkManager-wait-online.service(8)
Requires=NetworkManager.service
After=NetworkManager.service
Before=network-online.target

[Service]
# `nm-online -s` waits until the point when NetworkManager logs
# "startup complete". That is when startup actions are settled and
# devices and profiles reached a conclusive activated or deactivated
# state. It depends on which profiles are configured to autoconnect and
# also depends on profile settings like ipv4.may-fail/ipv6.may-fail,
# which affect when a profile is considered fully activated.
# Check NetworkManager logs to find out why wait-online takes a certain
# time.

Type=oneshot
ExecStart=/usr/bin/nm-online -q
RemainAfterExit=yes

# Set $NM_ONLINE_TIMEOUT variable for timeout in seconds.
# Edit with `systemctl edit NetworkManager-wait-online`.
#
# Note, this timeout should commonly not be reached. If your boot
# gets delayed too long, then the solution is usually not to decrease
# the timeout, but to fix your setup so that the connected state
# gets reached earlier.
Environment=NM_ONLINE_TIMEOUT=60

[Install]
WantedBy=network-online.target

journalctl -b0 | grep Online:

Jul 17 12:43:09 archnuke systemd[1]: Starting Network Manager Wait Online...
Jul 17 12:43:09 archnuke systemd[706]: Reached target Network is Online.
Jul 17 12:43:16 archnuke systemd[1]: Finished Network Manager Wait Online.
Jul 17 12:43:16 archnuke systemd[1]: Reached target Network is Online.

The above is the system log, 12:43:09 is the user service. As the user running the podman container, LANG=C journalctl --user -b0 | grep Online:

Jul 17 12:43:09 archnuke systemd[706]: Reached target Network is Online.

Not sure why the NetworkManager-wait-online is not in the user log, it is enabled for the user:

systemctl --user status NetworkManager-wait-online.service 
○ NetworkManager-wait-online.service - Network Manager Wait Online for Users
     Loaded: loaded (/etc/xdg/systemd/user/NetworkManager-wait-online.service; enabled; preset: enabled)
     Active: inactive (dead)
       Docs: man:NetworkManager-wait-online.service(8)

As another workaround, I'm thinking for now adding to the Quadlet another dirty workaround: ExecStartPre=/bin/sh -c 'until ping -c1 google.com; do sleep 1; done;'

djarbz commented 2 months ago

I haven't used /etc/systemd/user, but my unit works (at least I haven't noticed an issue) when placed in ~/.config/systemd/user.

secext2022 commented 2 months ago

@WildPenquin

Please check this in the container service:

[Unit]
Wants=network-online.target
After=network-online.target

secext2022 commented 2 months ago

$ systemctl --user status my-app.service
● my-app.service - example deno/fresh app
     Loaded: loaded (/var/home/fc-test/.config/containers/systemd/my-app.container; generated)
    Drop-In: /usr/lib/systemd/user/service.d
             └─10-timeout-abort.conf
     Active: active (running) since Wed 2024-07-17 04:21:49 UTC; 20h ago
   Main PID: 2026 (conmon)
$ systemctl --user list-dependencies my-app
my-app.service
● ├─app.slice
● ├─basic.target
● │ ├─systemd-tmpfiles-setup.service
● │ ├─paths.target
● │ ├─sockets.target
● │ │ └─dbus.socket
● │ └─timers.target
● │   └─systemd-tmpfiles-clean.timer
● └─network-online.target
●   └─systemd-networkd-wait-online.service

WildPenquin commented 2 months ago

Hi @secext2022 ,

The Unit section is defined correctly.

As per my log, the problem is that the NetworkManager-wait-online user service finishes much too soon, much sooner than the system-level one. I believe (meaning I'm not sure) that nm-online does not work correctly when run as a user (not designed to be run as a user?).

As yet another workaround, I've added ExecStartPre=/bin/sh -c 'until ping -c1 192.168.66.6; do sleep 1; done;' under [Service]. On the TODO list: I'm going to test whether this works correctly if I change my interface to be managed by systemd-networkd and use the systemd-networkd-wait-online service instead.

$ systemctl --user status pande-pmc.service

● pande-pmc.service - PandESportS MC-serveri
     Loaded: loaded (/home/minecraft/.config/containers/systemd/pande-pmc.container; generated)
     Active: active (running) since Fri 2024-07-19 16:04:56 EEST; 4min 55s ago
 Invocation: 9858022ff77a4dd38327d8c513324e7d
    Process: 829 ExecStartPre=/bin/sh -c until ping -c1 192.168.66.6; do sleep 1; done; (code=exited, status=0/SUCCESS)
   Main PID: 906 (conmon)
      Tasks: 82 (limit: 28525)
     Memory: 6.2G (peak: 6.2G)
        CPU: 1min 6.141s

$ systemctl --user list-dependencies pande-pmc.service

pande-pmc.service
● ├─app.slice
● ├─basic.target
● │ ├─paths.target
● │ ├─sockets.target
● │ │ ├─dbus.socket
● │ │ ├─dirmngr.socket
● │ │ ├─drkonqi-coredump-launcher.socket
● │ │ ├─gpg-agent-browser.socket
● │ │ ├─gpg-agent-extra.socket
● │ │ ├─gpg-agent-ssh.socket
● │ │ ├─gpg-agent.socket
● │ │ ├─keyboxd.socket
● │ │ ├─p11-kit-server.socket
● │ │ ├─pipewire-pulse.socket
● │ │ └─pipewire.socket
● │ └─timers.target
○ │   ├─drkonqi-coredump-cleanup.timer
○ │   └─drkonqi-sentry-postman.timer
● └─network-online.target
○   └─NetworkManager-wait-online.service

config/containers/systemd/pande-pmc.container:

[Unit]
Description=PandESportS MC-serveri

After=network-online.target
Wants=network-online.target

[Container]
AutoUpdate=registry
ContainerName=PandEPMC
Image=docker.io/gameservermanagers/gameserver:pmc
Volume=pandepmc:/data
LogDriver=k8s-file
PublishPort=25560:25560/tcp
PublishPort=25560:25560/udp
PodmanArgs=--log-opt=path=/home/minecraft/PandEPMClog.k8s
Timezone=local

[Service]
ExecStartPre=/bin/sh -c 'until ping -c1 192.168.66.6; do sleep 1; done;'
# Restart=always
Restart=no

[Install]
WantedBy=multi-user.target default.target

WildPenquin commented 2 months ago

After reading this thread and the comments in https://github.com/systemd/systemd/issues/3312, I think that thread has much cleaner workarounds than many of the ones here. The problem with the workarounds here is that they are often long and convoluted for this relatively simple issue, and they may break if the system configuration changes, since they are not configuration-agnostic. But the systemd issue has much cleaner and simpler workarounds:

I haven't tested those, but they should work judging from the thumbs =).

I'm also starting to think maybe we should not be discussing workarounds here that much since it adds noise to actually solving the issue (which is: podman user containers should not fail at boot if networking is up). (As a general remark, no services should fail for whatever network error, but instead handle the situation, as network connections are unreliable. All these workarounds should be unnecessary!).

I'm sorry for adding noise here myself, too =).

EDIT: My chosen workaround for the issue (cleanest in my opinion, less prone to break; I chose to name it check-network-online.service but it could be whatever you want it to be):

/etc/systemd/user/check-network-online.service:

[Unit]
Description=Check for system level network-online.target (for users)

[Service]
Type=oneshot
ExecStart=bash -c 'until systemctl is-active network-online.target; do sleep 1; done'
RemainAfterExit=yes

[Install]
WantedBy=default.target

Enable this service for the user. In badly behaving user services (such as podman quadlets), add:

After=check-network-online.service

Of course, YMMV!
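
To illustrate where the After= line goes, here is a rough sketch that writes a sample quadlet file into a directory of your choice. The helper name write_example_quadlet, the my-app.container file name and the image reference are all placeholders for illustration. The Wants= line is an addition that is not part of the workaround as posted: After= only orders units, so Wants= is included on the assumption that you want the helper pulled in even when it has not been enabled.

```sh
# write_example_quadlet DIR
# Hypothetical helper: writes DIR/my-app.container, a sample quadlet that
# pulls in and orders itself after check-network-online.service.
write_example_quadlet() {
    weq_dir="$1"
    if [ -z "$weq_dir" ]; then
        echo "usage: write_example_quadlet DIR" >&2
        return 2
    fi
    mkdir -p "$weq_dir" || return 1
    cat > "$weq_dir/my-app.container" <<'EOF'
[Unit]
Description=Example app that needs the network
# Wants= starts the helper even if it is not enabled; After= waits for it.
Wants=check-network-online.service
After=check-network-online.service

[Container]
# Placeholder image, not taken from the thread.
Image=registry.example.com/my-app:latest

[Install]
WantedBy=default.target
EOF
}
```

For a real setup the target directory would be ~/.config/containers/systemd, followed by systemctl --user daemon-reload so that the quadlet generator runs again.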

sbrivio-rh commented 2 months ago

I'm also starting to think maybe we should not be discussing workarounds here that much since it adds noise to actually solving the issue

I personally don't find it distracting.

(which is: podman user containers should not fail at boot if networking is up). (As a general remark, no services should fail for whatever network error, but instead handle the situation, as network connections are unreliable. All these workarounds should be unnecessary!).

The thing is, pasta(1) picks host addresses and routes by default. This is by design as it allows you to avoid (implicit) NAT altogether. If there's nothing there, it doesn't know what to pick, so it exits.

We're now considering implementing an optional netlink monitoring function that would dynamically create and delete routes and addresses as they come and go on the host, see also https://github.com/containers/podman/issues/22959#issuecomment-2228900989. That should be robust enough.
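
Since pasta picks host addresses and routes by default, a quick way to see whether there is anything for it to pick is to look for a default route on the host. The sketch below is only an approximation under that assumption; it is not pasta's actual selection logic, and has_default_route is a hypothetical helper name.

```sh
# has_default_route
# Succeeds if the host currently has an IPv4 or IPv6 default route.
# This only approximates the condition pasta depends on.
has_default_route() {
    [ -n "$(ip -4 route show default 2>/dev/null)" ] ||
        [ -n "$(ip -6 route show default 2>/dev/null)" ]
}
```

Running has_default_route early in a user session at boot could help confirm whether the container unit is starting before the host network is configured.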

vrothberg commented 2 weeks ago

@Luap99 @rhatdan @ygalblum shall we update the quadlet docs to point that out?

Sitting in a meeting where this issue was brought up.

gdonval commented 2 weeks ago

If the doc said "Quadlets are currently broken. Please see that bug report XXX we have with systemd.", at the top in red and bold, I guess the situation would be improved tremendously. Acknowledging current limits and bugs is a big part of establishing trust with users.

As it is, users stumble across this again and again. I can't speak for the general industry but here, no one wants to hear about podman again for instance.

mhrivnak commented 2 weeks ago

Creating a unit for the user that runs until systemctl is-active network-online.target; do sleep 1; done does seem like a fairly simple and robust option that will be easy to see using normal systemctl commands and won't surprise anyone.
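
As a sketch of that loop with an upper bound added, so that a unit using it cannot hang forever: the function name wait_for_unit and the 60-second default are assumptions made for illustration, and none of the units posted in this thread use a timeout.

```sh
# wait_for_unit UNIT [TIMEOUT_SECONDS]
# Polls `systemctl is-active` once per second until UNIT is active.
# Returns 0 once the unit is active, 1 if the timeout elapses first.
# Without --user, systemctl queries the system manager, which is the point.
wait_for_unit() {
    wfu_unit="$1"
    wfu_left="${2:-60}"
    until systemctl is-active --quiet "$wfu_unit"; do
        if [ "$wfu_left" -le 0 ]; then
            echo "timed out waiting for $wfu_unit" >&2
            return 1
        fi
        sleep 1
        wfu_left=$((wfu_left - 1))
    done
}
```

A oneshot user service could call wait_for_unit network-online.target from a small script in ExecStart=; a non-zero exit would then surface as a failed unit in systemctl --user status instead of as an obscure pasta error.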

Could quadlet create such a unit automatically? That would be a big improvement to usability. It would also enable quadlet to adjust the implementation over time if systemd makes things easier. For example, maybe systemd will implement systemctl is-active --wait, which would be nice. Or maybe systemd will add a more direct way to solve this problem.

Options:

Automatic and Opt-Out: most containers that run as a service need a network, right? If quadlet generated a unit like this by default for every user container unit and added the After= line to the generated container unit, that would solve this problem for everyone. Perhaps some edge cases would need a way to opt-out?

Opt-In: If the UX was better as opt-in for some reason, I'd suggest a new setting in the [Unit] section such as AfterNetwork=true or similar.

I think it would not be a good idea to automatically look for the user to write After=network-online.target and translate that to something compatible with user units. It's a bad UX that systemd essentially ignores that rather than complain loudly to the user that they've declared something invalid or are depending on a unit that doesn't exist. Quadlet should not "magically" fix an incorrect unit.

BUT I think it is worth considering that quadlet could help the user by noticing that they wrote After=network-online.target in a user unit and failing with a helpful error message that shows them how to do it correctly. That would be incredibly helpful compared to the error message people see from pasta today.

vrothberg commented 2 weeks ago

Automatic and Opt-Out: most containers that run as a service need a network, right? If quadlet generated a unit like this by default for every user container unit and added the After= line to the generated container unit, that would solve this problem for everyone. Perhaps some edge cases would need a way to opt-out?

I like that proposal. Thanks for sharing, @mhrivnak !

Luap99 commented 2 weeks ago

Automatic and Opt-Out: most containers that run as a service need a network, right? If quadlet generated a unit like this by default for every user container unit and added the After= line to the generated container unit, that would solve this problem for everyone. Perhaps some edge cases would need a way to opt-out?

Quadlet already automatically adds After=network-online.target today.

The opt out is solved by using After= in the file which causes systemd to ignore all prior After= lines. You find that syntax described in the systemd docs.

Now of course network-online.target doesn't work rootless with current systemd versions so this is a NOP thus I see no problem changing that to our own functional network-online unit by default when running as user. Letting quadlet generate one doesn't seem too useful to me. We can ship the static unit file in the rpm as I don't think there is anything dynamic needed for that. All we need to do in quadlet is change the name in the user case.

BUT I think it is worth considering that quadlet could help the user by noticing that they wrote After=network-online.target in a user unit and failing with a helpful error message that shows them how to do it correctly. That would be incredibly helpful compared to the error message people see from pasta today.

This is not really true and helpful either. I have never experienced this on my systems because network setup is much faster I guess. So if we now decide to error out we just break users that do not hit this race because their network config was fast enough.

But yes in general this problem should be documented.

ygalblum commented 2 weeks ago

Letting quadlet generate one doesn't seem too useful to me. We can ship the static unit file in the rpm as I don't think there is anything dynamic needed for that. All we need to do in quadlet is change the name in the user case

I'm not sure about that. Maybe I'm wrong here, but, don't you need a separate unit per user? Or is there a place to put units that all user units can point to?

mhrivnak commented 2 weeks ago

Automatic and Opt-Out: most containers that run as a service need a network, right? If quadlet generated a unit like this by default for every user container unit and added the After= line to the generated container unit, that would solve this problem for everyone. Perhaps some edge cases would need a way to opt-out?

Quadlet already automatically adds After=network-online.target today.

The opt out is solved by using After= in the file which causes systemd to ignore all prior After= lines. You find that syntax described in the systemd docs.

Now of course network-online.target doesn't work rootless with current systemd versions so this is a NOP thus I see no problem changing that to our own functional network-online unit by default when running as user. Letting quadlet generate one doesn't seem too useful to me. We can ship the static unit file in the rpm as I don't think there is anything dynamic needed for that. All we need to do in quadlet is change the name in the user case.

It needs to be installed for the specific user account, right? When would that happen if not at the time units are being generated and placed into the correct location for each user?

Maybe "generate" is the wrong word here, and it would just be a copy operation or even a symlink to some known location.

BUT I think it is worth considering that quadlet could help the user by noticing that they wrote After=network-online.target in a user unit and failing with a helpful error message that shows them how to do it correctly. That would be incredibly helpful compared to the error message people see from pasta today.

This is not really true and helpful either. I have never experienced this on my systems because network setup is much faster I guess. So if we now decide to error out we just break users that do not hit this race because their network config was fast enough.

I think that most system admins, not to mention software engineers, would prefer to remove a race condition rather than depend on the presumption that they are likely to win the race most of the time.

That said, I see your point that if someone is currently winning the race every time, blissfully unaware that they're even competing in a race, it would not be a good experience to make their setup start failing. How about a loud log message at least so that if they ever do lose the race, or happen to look at the logs, they'll have an easier time understanding what happened, rather than have to google a weird error message from pasta?

But yes in general this problem should be documented.

mhrivnak commented 2 weeks ago

How about a loud log message at least so that if they ever do lose the race, or happen to look at the logs, they'll have an easier time understanding what happened, rather than have to google a weird error message from pasta?

Or maybe quadlet could add a comment in the generated unit.

[Unit]
Description=some network service
# The below statement has no effect since this is a user unit. But quadlet has preserved it for reference.
# Please see https://github.com/containers/podman/issues/22197 for details and solutions
# for how to properly depend on network startup
After=network-online.target

Luap99 commented 2 weeks ago

Letting quadlet generate one doesn't seem too useful to me. We can ship the static unit file in the rpm as I don't think there is anything dynamic needed for that. All we need to do in quadlet is change the name in the user case

I'm not sure about that. Maybe I'm wrong here, but, don't you need a separate unit per user? Or is there a place to put units that all user units can point to?

/usr/lib/systemd/user/ just like we already ship podman.service podman-auto-update.{timer,service},etc... This is a solved problem.

It needs to be installed for the specific user account, right? When would that happen if not at the time units are being generated and placed into the correct location for each user?

No, see above. The unit doesn't have to be enabled: as long as quadlet adds Wants=our-new-unit, it will be triggered when your main unit is started, and it does not need to run when there are no quadlets at all.

I think that most system admins, not to mention software engineers, would prefer to remove a race condition rather than depend on the presumption that they are likely to win the race most of the time.

yes

That said, I see your point that if someone is currently winning the race every time, blissfully unaware that they're even competing in a race, it would not be a good experience to make their setup start failing. How about a loud log message at least so that if they ever do lose the race, or happen to look at the logs, they'll have an easier time understanding what happened, rather than have to google a weird error message from pasta?

Yes, this gets tricky. We could have a log message for sure, but this is also kind of log spam: quadlet, as a generator, is run on every daemon reload, so once a user has acknowledged the warning and worked around the problem we would still warn all the time, which gets annoying quickly.

If the pasta error message doesn't make sense, we should aim to fix that message so that it does. Either pasta itself should print something better, or podman can catch it and print something better instead. Throwing warnings when we do not know whether it will even fail is just not nice. But once we know pasta failed, we can print whatever you think is reasonable.

ygalblum commented 2 weeks ago

@Luap99 great, thanks for the clarification.

So, it seems that in terms of functionality the way to go is to add this service to the installation and add a dependency on it when generating rootless units. The dependency can be removed using a certain key. This will be added for all Quadlet types.

I think the only question left open is regarding the logging.

Right?

vrothberg commented 1 week ago

Shipping a unit directly with Podman sounds good to me, too 👍

vrothberg commented 1 week ago

As it is, users stumble across this again and again. I can't speak for the general industry but here, no one wants to hear about podman again for instance.

I hope you'll reconsider once this issue is fixed. Feel free to reach out if you want to chat.

vrothberg commented 1 week ago

@Luap99 @ygalblum can you confirm the proposed solution of shipping a systemd unit and adding that as a dependency for rootless Quadlets? I'd like to get this fixed soon to make sure it doesn't hit image mode.

Cc: @cgwalters @mrguitar @rhatdan

Luap99 commented 1 week ago

Yes, sure, that would be the idea. The more difficult question is what the unit would look like, as it would need to work for all users, so I guess we need to use some form of systemctl is-active network-online.target loop. And then how do we name this? Should we use target + unit so users can hook other units into the target as well if needed?
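
For reference, the kind of systemctl is-active network-online.target loop mentioned above could be sketched in shell roughly as follows. This is only an illustration, not the unit that was proposed in this thread: the function name wait_for_active, the CHECK_CMD override, and the retry limit and interval are all assumptions made for the sketch.

```shell
#!/bin/sh
# Sketch: poll until a systemd unit is reported active.
# wait_for_active, CHECK_CMD, and the retry/interval arguments are hypothetical.

# CHECK_CMD is the command used to test the unit. Without --user, systemctl
# queries the system manager, which is what a user session needs here.
CHECK_CMD="${CHECK_CMD:-systemctl is-active --quiet}"

wait_for_active() {
    unit="$1"
    max_tries="${2:-60}"   # give up after this many failed checks
    interval="${3:-1}"     # seconds to sleep between checks
    tries=0
    # CHECK_CMD is intentionally unquoted so it splits into command + arguments.
    until $CHECK_CMD "$unit"; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            echo "gave up waiting for $unit" >&2
            return 1
        fi
        sleep "$interval"
    done
    return 0
}

# Example: wait_for_active network-online.target
```

Bounding the number of attempts is a design choice of this sketch; an unbounded loop would instead rely on systemd's own start timeout for the unit that runs it.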

vrothberg commented 1 week ago

Yes, sure, that would be the idea. The more difficult question is what the unit would look like, as it would need to work for all users, so I guess we need to use some form of systemctl is-active network-online.target loop.

I think so, too. This solution is reported to work well by @mhrivnak et al.

And then how do we name this?

I would use something boring such as podman-systemd-network-online.

Should we use target + unit so user can hook other units into the target as well if needed?

Do you have a use-case example in mind where a user might want to do that? I hope the unit/target won't be used outside of Quadlet.

gdonval commented 1 week ago

Never mind that. There is something afoot but I can't reproduce it.

sbrivio-rh commented 1 week ago

By the way, it would be nice if this solution could be removed painlessly once we have a robust workaround implemented by pasta (the netlink monitoring mechanism I'm currently working on).

I'm not sure if there's anything specific that should be taken care of in that regard; I guess not, but I thought I'd mention it just in case.

gdonval commented 1 week ago

So, I can sort of reproduce... There is probably a race condition involved because it's not a problem every time:

[screenshot not preserved]

This is not affecting the idea here, which is elegant and correct.

It's a problem that will occur, though, and the first time it happened my machine had been up for nearly an hour. I also wonder whether network-online.target is the right one or whether network.target would be enough: does pasta need the network, or does it need the interfaces to be initialised?

cgwalters commented 1 week ago

I was tagged so I'll comment (only skimmed the issue); broadly agree that a single podman-systemd-network-online.target makes sense. But then the messy thing is who owns setting up the dependencies for that unit?

Today we have e.g.

$ grep WantedBy /usr/lib/systemd/system/NetworkManager-wait-online.service
WantedBy=network-online.target
$

But would it be NetworkManager that would drop in /usr/lib/systemd/user/NetworkManager-wait-online.service with WantedBy=podman-systemd-network-online.target?

Luap99 commented 1 week ago

I was tagged so I'll comment (only skimmed the issue); broadly agree that a single podman-systemd-network-online.target makes sense. But then the messy thing is who owns setting up the dependencies for that unit?

Today we have e.g.

$ grep WantedBy /usr/lib/systemd/system/NetworkManager-wait-online.service
WantedBy=network-online.target
$

But would it be NetworkManager that would drop in /usr/lib/systemd/user/NetworkManager-wait-online.service with WantedBy=podman-systemd-network-online.target?

I don't see a way we can realistically get all the distros to do this. Ideally systemd would just expose the network-online.target to the user instance (https://github.com/systemd/systemd/issues/3312)

I think the easy way would be to just define a static unit (podman-systemd-network-online.service?) like the one described in https://github.com/containers/podman/issues/22197#issuecomment-2239409803 and then have quadlet add an After= and Wants= for the units instead of the non-working network-online.target. Users could still override this specific unit via the normal systemd config mechanisms (i.e. drop-ins) if they want or need to.
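
To make the proposal a bit more concrete, here is a hedged sketch that writes such a static unit, plus a user drop-in, into a scratch directory. Only the unit name comes from the suggestion above; the unit contents, the UNIT_DIR variable, the override.conf file name, and the one-second sleep are illustrative assumptions, not what Podman ships.

```shell
#!/bin/sh
# Sketch: write an illustrative static unit and a drop-in override.
# UNIT_DIR defaults to a scratch directory so nothing on the system is touched;
# a package would install the unit under /usr/lib/systemd/user/ instead.
set -eu
UNIT_DIR="${UNIT_DIR:-$(mktemp -d)}"
unit="podman-systemd-network-online.service"

# Hypothetical unit body: block until the system-level target is active.
cat > "$UNIT_DIR/$unit" <<'EOF'
[Unit]
Description=Wait for the system-level network-online.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/sh -c 'until systemctl is-active --quiet network-online.target; do sleep 1; done'
EOF

# A user could replace the command through a drop-in; the empty ExecStart=
# clears the inherited command before the new one is set.
mkdir -p "$UNIT_DIR/$unit.d"
cat > "$UNIT_DIR/$unit.d/override.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/bin/true
EOF

echo "wrote units to $UNIT_DIR"
```

With a unit like this in place, a quadlet-generated rootless unit would carry After=podman-systemd-network-online.service and Wants=podman-systemd-network-online.service in its [Unit] section, as described above. Wants= pulls the unit in and After= provides the ordering, so both are needed.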

vrothberg commented 1 week ago

I think the easy way would be to just define a static unit (podman-systemd-network-online.service?) like the one described in https://github.com/containers/podman/issues/22197#issuecomment-2239409803 and then have quadlet add an After= and Wants= for the units instead of the non-working network-online.target.

That's also my understanding of it.