voxpupuli / puppet-systemd

Puppet module to manage systemd
https://forge.puppet.com/puppet/systemd
Apache License 2.0
52 stars 141 forks

systemd::user_service broken for puppet running in background #459

Open traylenator opened 2 months ago

traylenator commented 2 months ago

Affected Puppet, Ruby, OS and module versions/distributions

How to reproduce (e.g Puppet code you use)

systemd::user_service { "gitlab-podman-auto-update.timer":
  ensure  => true,
  enable  => true,
  unit    => 'podman-auto-update.timer',
  user    => 'gitlab-runner',
}

Run the Puppet agent in the background, i.e. trigger a run with pkill -SIGUSR1 puppet rather than running puppet agent -t -v in the foreground.

What are you seeing

The service does not start. The type fails to check the status of the service, assumes it is not running, and then fails to start it. (In fact the service is running, but that is somewhat immaterial if Puppet cannot detect it.)

What behaviour did you expect instead

The service should be started and enabled correctly.

Output log

Apr 29 19:38:46 cirunner01.example.ch puppet-agent[1816569]: (/Stage[main]/Hg_punch::Cirunners/Systemd::User_service[gitlab-runner-podman-auto-update.timer]/Exec[Start user service podman-auto-update.timer for user gitlab-runner]/returns) Failed to start transient service unit: Transport endpoint is not connected
Apr 29 19:38:46 cirunner01.example.ch puppet-agent[1816569]: '["systemd-run", "--pipe", "--wait", "--user", "--machine", "gitlab-runner@.host", "systemctl", "--user", "start", "podman-auto-update.timer"]' returned 1 instead of one of [0]
Apr 29 19:38:46 cirunner01.example.ch puppet-agent[1816569]: (/Stage[main]/Hg_punch::Cirunners/Systemd::User_service[gitlab-runner-podman-auto-update.timer]/Exec[Start user service podman-auto-update.timer for user gitlab-runner]/returns) change from 'notrun' to ['0'] failed: '["systemd-run", "--pipe", "--wait", "--user", "--machine", "gitlab-runner@.host", "systemctl", "--user", "start", "podman-auto-update.timer"]' returned 1 instead of one of [0] (corrective)

Note the error:

 Failed to start transient service unit: Transport endpoint is not connected

Any additional information you'd like to impart

Expanded, the underlying exec is:

systemd-run --pipe --wait --user --machine gitlab-runner@.host systemctl --user is-active podman-auto-update.timer

This command works from the command line:

# systemd-run --pipe --wait --user --machine gitlab-runner@.host systemctl --user is-active podman-auto-update.timer
Running as unit: run-u10229.service
active
Finished with result: success
Main processes terminated with: code=exited/status=0
Service runtime: 7ms
CPU time consumed: 6ms

It just will not run from an exec during a background Puppet run.
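
One way to narrow this down from a shell is to rebuild the same command line and run it in a new session with stdin closed off. The sketch below is only a debugging aid, not part of the module: the helper names user_service_cmd and run_detached are hypothetical, and whether the missing terminal is what makes the difference is an untested assumption.

```shell
#!/bin/sh
# Hypothetical helper: print the command line that appears in the agent log
# for a given user, systemctl verb and unit.
user_service_cmd() {
  user=$1
  verb=$2
  unit=$3
  printf 'systemd-run --pipe --wait --user --machine %s@.host systemctl --user %s %s\n' \
    "$user" "$verb" "$unit"
}

# Hypothetical helper: run a command line in a new session (so without a
# controlling terminal) and with stdin redirected from /dev/null.
# Assumption: this is closer to how a daemonised agent runs an exec,
# but it is not guaranteed to reproduce the failure.
run_detached() {
  setsid --wait sh -c "$1" </dev/null
}
```

As root, something like run_detached "$(user_service_cmd gitlab-runner is-active podman-auto-update.timer)" can then be compared against the interactive result shown above.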

traylenator commented 2 months ago

The full log with SYSTEMD_LOG_LEVEL=debug enabled is:

podman auto update for test-user]/returns) Bus n/a: changing state UNSET → OPENING
podman auto update for test-user]/returns) sd-bus: starting bus with systemd-run -M.host -PGq --wait -pUser=test -pPAMName=login systemd-stdio-bridge "-punix:path=\${XDG_RUNTIME_DIR}/bus"
podman auto update for test-user]/returns) Successfully forked off '(sd-busexec)' as PID 7041.
podman auto update for test-user]/returns) Bus n/a: changing state OPENING → AUTHENTICATING
podman auto update for test-user]/returns) Bus n/a: changing state UNSET → OPENING
podman auto update for test-user]/returns) sd-bus: starting bus by connecting to /run/dbus/system_bus_socket...
podman auto update for test-user]/returns) Bus n/a: changing state OPENING → AUTHENTICATING
podman auto update for test-user]/returns) Bus n/a: changing state AUTHENTICATING → HELLO
podman auto update for test-user]/returns) Sent message type=method_call sender=n/a destination=org.freedesktop.DBus path=/org/freedesktop/DBus interface=org.freedesktop.DBus member=Hello cookie=1 reply_cookie=0 signature=n/a error-name=n/a error-message=n/a
podman auto update for test-user]/returns) Got message type=method_return sender=org.freedesktop.DBus destination=:1.113 path=n/a interface=n/a member=n/a cookie=4294967295 reply_cookie=1 signature=s error-name=n/a error-message=n/a
podman auto update for test-user]/returns) Bus n/a: changing state HELLO → RUNNING
podman auto update for test-user]/returns) Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=StartTransientUnit cookie=2 reply_cookie=0 signature=ssa(sv)a(sa(sv)) error-name=n/a error-message=n/a
podman auto update for test-user]/returns) Bus n/a: changing state RUNNING → CLOSING
podman auto update for test-user]/returns) Failed to start transient service unit: Connection reset by peer
podman auto update for test-user]/returns) Bus n/a: changing state CLOSING → CLOSED
podman auto update for test-user]/returns) Bus n/a: changing state AUTHENTICATING → CLOSING
podman auto update for test-user]/returns) Failed to start transient service unit: Transport endpoint is not connected
podman auto update for test-user]/returns) Bus n/a: changing state CLOSING → CLOSED
"--pipe", "--wait", "--user", "--machine", "test@.host", "systemctl", "--user", "start", "podman-auto-update.timer"]' returned 1 instead of one of [0]
podman auto update for test-user]/returns) change from 'notrun' to ['0'] failed: '["systemd-run", "--pipe", "--wait", "--user", "--machine", "test@.host", "systemctl", "--user", "start", "podman-auto-update.timer"]' returned 1 instead of one of [0] (corrective)
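
The debug output above appears to interleave two bus connections, which makes the order of events hard to follow. As a reading aid only, a small filter can reduce such a log to the state transitions and failures. The function name bus_events is hypothetical, and the patterns are simply taken from the messages seen in this log.

```shell
#!/bin/sh
# Hypothetical helper: read a log on stdin and keep only the sd-bus state
# transitions and the "Failed to ..." messages.
bus_events() {
  grep -E 'changing state|Failed to '
}
```

It could be fed from the journal, for example journalctl -t puppet-agent | bus_events, assuming the agent logs under the puppet-agent identifier as in the output log above.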

traylenator commented 2 months ago

Trying newer systemd and different Puppet versions:

Agent OS   Agent version  Server version  systemd version  Result
EL9        7.24           7.24            252              fails (as above)
Fedora 40  8.5.1          8.6.0           255              passes
Fedora 40  7.30           8.6.0           255              passes
EL9        8.6.0          8.6.0           252              fails

It looks like something was fixed between systemd 252 and 255.

traylenator commented 2 months ago

Unfortunately, upgrading systemd on EL9 to 256~rc1 (built from source) and rebooting does not resolve the problem.