DChalcraft opened 1 year ago
Interesting. Why are you using einhorn without sidekiqswarm?
> Why are you using einhorn without sidekiqswarm?
We only have a single sidekiq process on each server, so I'm not sure sidekiqswarm offers any obvious benefits. The documentation suggests that rolling restarts should work with either, which is why we are using einhorn with plain sidekiq.
Ok. I'm not sure what's wrong. I ask because sidekiqswarm has specific code to implement the systemd watchdog for its children and no one else has had a problem with sidekiqswarm's systemd integration recently. If possible I would set SIDEKIQ_COUNT=1, switch to sidekiqswarm and see if that works. If it doesn't, that's good info for debugging the underlying issue.
Following a bit more investigation:
I think the only way systemd's `Type=notify` is going to work with einhorn is if we've also got `NotifyAccess=all` in the service file, since otherwise systemd only accepts notifications from the main pid (einhorn).
Once `NotifyAccess=all` is added, the initial READY notification works. However, we also have `WatchdogSec=10` in the service file.
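For context, the relevant directives in the unit end up looking roughly like this (a sketch; the `ExecStart` command line and paths are illustrative, not our exact unit):

```ini
# sidekiq.service (fragment; ExecStart/paths are illustrative)
[Service]
Type=notify
# Accept sd_notify datagrams from any process in the unit's cgroup,
# not just the main pid (the einhorn master):
NotifyAccess=all
# Expect a WATCHDOG=1 ping at least every 10 seconds:
WatchdogSec=10
ExecStart=/usr/local/bin/einhorn bundle exec sidekiq -e staging
ExecReload=/usr/local/bin/einhornsh --execute upgrade
```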
sidekiqswarm actually handles this OK: it calls `Sidekiq.start_watchdog` as long as `ENV["NOTIFY_SOCKET"]` is set.
However, regular non-swarm sidekiq only calls `Sidekiq.start_watchdog` if `Sidekiq::SdNotify.watchdog?` returns true. It returns false because `ENV["WATCHDOG_PID"]` isn't the current pid (it's the einhorn pid). So no watchdog notifications are sent, and systemd kills einhorn+sidekiq after the 10-second timeout.
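The check that fails can be sketched roughly like this (simplified from the `SdNotify.watchdog?` logic bundled with sidekiq; not the verbatim implementation):

```ruby
# Simplified sketch of the SdNotify.watchdog? check (not verbatim).
# systemd sets WATCHDOG_USEC/WATCHDOG_PID for the unit's main
# process -- under einhorn that's the master, not sidekiq itself.
def watchdog_enabled?
  usec = ENV["WATCHDOG_USEC"]
  pid  = ENV["WATCHDOG_PID"]
  return false unless usec
  # If WATCHDOG_PID is set, the watchdog is only intended for that
  # exact pid; in the sidekiq child process this comparison fails.
  pid.nil? || pid.to_i == Process.pid
end
```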
For now I've added `Sidekiq.configure_server { Sidekiq.start_watchdog }` to an initializer, which works around that.
I'm still struggling to get reloads working properly (via `ExecReload=/usr/local/bin/einhornsh --execute upgrade`). It does stop the old sidekiq process and start a new one, but then after another `TimeoutStopSec` delay, systemd decides that the service didn't actually restart, prints `sidekiq.service: State 'stop-sigterm' timed out. Killing.` to journalctl, and SIGKILLs the new sidekiq process before relaunching it. I'm not entirely sure what's going on there; I'm going to put it down for now.
When einhorn performs the rolling upgrade, it shuts down the old sidekiq process, which calls `SdNotify.stopping`. As far as I can tell, that tells systemd to expect the service to exit completely, so it forgets about the reload. `systemctl status` shows `Active: deactivating (stop-sigterm) since Thu 2023-03-16 15:37:33 UTC; 25s ago`.
After `TimeoutStopSec`, systemd then kills the einhorn process and restarts it:
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: State 'stop-sigterm' timed out. Killing.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857274 (einhorn) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857275 (bundle) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857327 (agent_thread.r*) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857328 (n/a) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857331 (Timeout stdlib ) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857333 (n/a) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857340 (tracer.rb:424) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857341 (n/a) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857342 (scheduler) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857343 (n/a) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857344 (default/process) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857346 (n/a) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857347 (n/a) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857349 (n/a) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857351 (n/a) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857352 (n/a) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 860472 (default/process) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 860857 (default/process) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Main process exited, code=killed, status=9/KILL
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Killing process 857327 (agent_thread.r*) with signal SIGKILL.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Failed with result 'timeout'.
Mar 16 15:36:14 staging1-3 systemd[1]: sidekiq.service: Consumed 9.795s CPU time.
Mar 16 15:36:15 staging1-3 systemd[1]: sidekiq.service: Scheduled restart job, restart counter is at 2.
Mar 16 15:36:15 staging1-3 systemd[1]: Stopped sidekiq.
Mar 16 15:36:15 staging1-3 systemd[1]: sidekiq.service: Consumed 9.795s CPU time.
Mar 16 15:36:15 staging1-3 systemd[1]: Starting sidekiq...
If I comment out sidekiq's `SdNotify.stopping` notification, I can successfully perform a `systemctl reload sidekiq`, but that doesn't seem like a great solution. 😕
Possibly einhorn and sdnotify are a bad mix. I'm considering dropping einhorn and going back to an old-fashioned `kill -TSTP` and a regular `systemctl restart sidekiq`.
`bin/sidekiqswarm` is designed to run under Einhorn within systemd. `bin/sidekiq` assumes it has full control over the process lifecycle. I'd happily accept smarter integration as a PR.
I see the same issue when using sidekiqswarm under einhorn. Here's a reload:
Mar 17 10:38:29 staging1-3 sidekiq[1276014]: [MASTER 1276014] INFO: Starting smooth upgrade from version 0...
Mar 17 10:38:29 staging1-3 sidekiq[1276884]: Starting smooth upgrade from version 0...
Mar 17 10:38:29 staging1-3 sidekiq[1276014]: [MASTER 1276014] INFO: ===> Launched 1276887 (index: 1)
Mar 17 10:38:29 staging1-3 sidekiq[1276887]: [WORKER 1276887] INFO: About to exec ["/usr/local/bin/bundle", "exec", "sidekiqswarm", "-e", "staging", "-c", "10"]
Mar 17 10:38:29 staging1-3 sidekiq[1276884]: ===> Launched 1276887 (index: 1)
Mar 17 10:38:29 staging1-3 sidekiq[1276884]: ===> Exited state passing process 1276885
Mar 17 10:38:30 staging1-3 sidekiq[1276887]: 2023-03-17T10:38:30.854Z pid=1276887 tid=ragr INFO: Enabling systemd notification integration
Mar 17 10:38:30 staging1-3 sidekiq[1276887]: [swarm] pid=1276887 Preloading application
Mar 17 10:38:30 staging1-3 sidekiq[1276014]: [MASTER 1276014] INFO: Worker 1276887 has been up for 1s, so we are considering it alive.
Mar 17 10:38:30 staging1-3 sidekiq[1276014]: [MASTER 1276014] INFO: Up to 1 / 1 timer ACKs
Mar 17 10:38:30 staging1-3 sidekiq[1276014]: [MASTER 1276014] INFO: Upgraded successfully to version 1 (Einhorn 1.0.0).
Mar 17 10:38:30 staging1-3 sidekiq[1276014]: [MASTER 1276014] INFO: Killing off 1 old workers.
Mar 17 10:38:30 staging1-3 sidekiq[1276014]: [MASTER 1276014] INFO: Sending USR2 to [1276016]
Mar 17 10:38:30 staging1-3 sidekiq[1276884]: Upgraded successfully to version 1 (Einhorn 1.0.0).
Mar 17 10:38:30 staging1-3 sidekiq[1276884]: Upgrade done
Mar 17 10:38:30 staging1-3 sidekiq[1276096]: 2023-03-17T10:38:30.910Z pid=1276096 tid=rajg INFO: Got USR2, starting graceful shutdown
Mar 17 10:38:30 staging1-3 sidekiq[1276096]: 2023-03-17T10:38:30.911Z pid=1276096 tid=rajg INFO: Terminating quiet threads for default capsule
Mar 17 10:38:30 staging1-3 systemd[1]: Reloaded sidekiq.
Mar 17 10:38:30 staging1-3 sidekiq[1276096]: 2023-03-17T10:38:30.911Z pid=1276096 tid=p9z0 INFO: Scheduler exiting...
Mar 17 10:38:31 staging1-3 sidekiq[1276096]: 2023-03-17T10:38:31.918Z pid=1276096 tid=p9w0 INFO: Graceful shutdown complete, bye!
Mar 17 10:38:32 staging1-3 sidekiq[1276016]: [[--notify--]] STOPPING=1 unset:false /run/systemd/notify
Mar 17 10:38:32 staging1-3 sidekiq[1276887]: 2023-03-17T10:38:32.868Z pid=1276887 tid=ragr INFO: Filesystem mounts are ok!
Mar 17 10:38:32 staging1-3 sidekiq[1276014]: [MASTER 1276014] INFO: ===> Exited worker 1276016
Mar 17 10:38:37 staging1-3 sidekiq[1276887]: [[--notify--]] READY=1 unset:false /run/systemd/notify
Mar 17 10:38:37 staging1-3 sidekiq[1276887]: 2023-03-17T10:38:37.018Z pid=1276887 tid=ragr INFO: Pinging systemd watchdog every 30.0 sec
Mar 17 10:38:37 staging1-3 sidekiq[1276928]: 2023-03-17T10:38:37.017Z pid=1276928 tid=ragr INFO: Booted Rails 7.0.4.2 application in staging environment
Mar 17 10:38:37 staging1-3 sidekiq[1276928]: 2023-03-17T10:38:37.018Z pid=1276928 tid=ragr INFO: Running in ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux]
Mar 17 10:38:37 staging1-3 sidekiq[1276928]: 2023-03-17T10:38:37.018Z pid=1276928 tid=ragr INFO: Sidekiq Pro 7.0.7 / Sidekiq Enterprise 7.0.5, commercially licensed.
(I've manually added the `[[--notify--]]` debug lines into sidekiq's `SdNotify.notify`.)
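For anyone reproducing this: the notify protocol is just a datagram written to the unix socket named by `NOTIFY_SOCKET`, so the same messages can be hand-rolled. A minimal sketch, assuming a filesystem socket path (systemd's abstract-namespace sockets, which start with `@`, aren't handled here), with `notify_systemd` being a hypothetical helper name:

```ruby
require "socket"

# Minimal hand-rolled sd_notify: write one state datagram to the
# unix socket named by NOTIFY_SOCKET. Assumes a filesystem path;
# abstract-namespace sockets ("@...") are not handled here.
def notify_systemd(state, socket_path = ENV["NOTIFY_SOCKET"])
  return false if socket_path.nil? || socket_path.empty?
  Addrinfo.unix(socket_path, :DGRAM).connect do |sock|
    sock.sendmsg(state)
  end
  true
end

# e.g. notify_systemd("READY=1") or notify_systemd("WATCHDOG=1")
```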
So the reload has successfully shut down the old worker 1276016 and started a new one 1276928.
However, `systemctl status` shows:
Active: deactivating (stop-sigterm) since Fri 2023-03-17 10:38:32 UTC; 4min 44s ago
and after `TimeoutStopSec`, systemd forcibly restarts einhorn:
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: State 'stop-sigterm' timed out. Killing.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276014 (einhorn) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276887 (bundle) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276928 (bundle) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276908 (reaper.rb:40) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276910 (agent_thread.r*) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276913 (Timeout stdlib ) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276929 (tracer.rb:424) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276931 (n/a) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276932 (n/a) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276933 (tracer.rb:424) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276934 (n/a) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276935 (n/a) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276936 (n/a) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276937 (default/process) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276939 (n/a) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276940 (n/a) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276941 (n/a) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276942 (n/a) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276943 (n/a) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276944 (n/a) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276945 (n/a) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276951 (n/a) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1276955 (n/a) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Killing process 1277159 (default/process) with signal SIGKILL.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Main process exited, code=killed, status=9/KILL
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Failed with result 'timeout'.
Mar 17 10:43:32 staging1-3 systemd[1]: sidekiq.service: Consumed 17.020s CPU time.
Mar 17 10:43:33 staging1-3 systemd[1]: sidekiq.service: Scheduled restart job, restart counter is at 2.
Mar 17 10:43:33 staging1-3 systemd[1]: Stopped sidekiq.
Mar 17 10:43:33 staging1-3 systemd[1]: sidekiq.service: Consumed 17.020s CPU time.
Mar 17 10:43:33 staging1-3 systemd[1]: Starting sidekiq...
Mar 17 10:43:34 staging1-3 sidekiq[1279077]: [MASTER 1279077] INFO: Blowing away old Einhorn command socket at /tmp/einhorn.sock. This likely indicates a previous Einhorn master which exited uncleanly.
Mar 17 10:43:34 staging1-3 sidekiq[1279077]: [MASTER 1279077] INFO: Writing PID to /tmp/einhorn.pid
Mar 17 10:43:34 staging1-3 sidekiq[1279077]: [MASTER 1279077] INFO: Launching 1 new workers
Mar 17 10:43:34 staging1-3 sidekiq[1279077]: [MASTER 1279077] INFO: ===> Launched 1279079 (index: 0)
Mar 17 10:43:34 staging1-3 sidekiq[1279079]: [WORKER 1279079] INFO: About to exec ["/usr/local/bin/bundle", "exec", "sidekiqswarm", "-e", "staging", "-c", "10"]
Mar 17 10:43:34 staging1-3 sidekiq[1279079]: 2023-03-17T10:43:34.930Z pid=1279079 tid=rhmz INFO: Enabling systemd notification integration
Mar 17 10:43:34 staging1-3 sidekiq[1279079]: [swarm] pid=1279079 Preloading application
Mar 17 10:43:35 staging1-3 sidekiq[1279077]: [MASTER 1279077] INFO: Worker 1279079 has been up for 1s, so we are considering it alive.
Mar 17 10:43:35 staging1-3 sidekiq[1279077]: [MASTER 1279077] INFO: Up to 1 / 1 timer ACKs
Mar 17 10:43:36 staging1-3 sidekiq[1279079]: 2023-03-17T10:43:36.754Z pid=1279079 tid=rhmz INFO: Filesystem mounts are ok!
Mar 17 10:43:40 staging1-3 sidekiq[1279079]: [[--notify--]] READY=1 unset:false /run/systemd/notify
Mar 17 10:43:40 staging1-3 sidekiq[1279079]: 2023-03-17T10:43:40.502Z pid=1279079 tid=rhmz INFO: Pinging systemd watchdog every 30.0 sec
Mar 17 10:43:40 staging1-3 systemd[1]: Started sidekiq.
Mar 17 10:43:40 staging1-3 sidekiq[1279123]: 2023-03-17T10:43:40.502Z pid=1279123 tid=rhmz INFO: Booted Rails 7.0.4.2 application in staging environment
Mar 17 10:43:40 staging1-3 sidekiq[1279123]: 2023-03-17T10:43:40.503Z pid=1279123 tid=rhmz INFO: Running in ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux]
Mar 17 10:43:40 staging1-3 sidekiq[1279123]: 2023-03-17T10:43:40.503Z pid=1279123 tid=rhmz INFO: Sidekiq Pro 7.0.7 / Sidekiq Enterprise 7.0.5, commercially licensed.
(As a sidenote, we've been running einhorn+sidekiq with systemd's `Type=simple` for ages, but the other day we hit what appears to be a rare deadlock in rmagick/imagemagick which hung the entire sidekiq process. I started looking into `Type=notify` and the watchdog timer as a way of mitigating that. Unless I'm missing something, though, sidekiqswarm's watchdog timer is only going to report that the top-level sidekiqswarm process is still responding, since it doesn't do anything to check whether its children are locked up. I'm not saying it necessarily ought to; I'm just explaining why we're trying to use sdnotify with a regular single sidekiq instance rather than moving everything to swarm.)
Ruby version: 3.1.2p20
Rails version: 7.0.4.2
Sidekiq / Pro / Enterprise version(s): sidekiq-7.0.6 / sidekiq-pro-7.0.7 / sidekiq-ent-7.0.5
sidekiq.yml
sudo systemctl status sidekiq.service
sidekiq.service
Possibly due to systemd expecting the notify to come from the Einhorn process rather than Sidekiq. Attempted to set `NotifyAccess=all` in sidekiq.service, but this results in systemd killing it every 10 seconds.