Open osnyx opened 4 weeks ago
This issue has been mentioned on NixOS Discourse. There might be relevant details there:
https://discourse.nixos.org/t/reload-triggers-now-on-unstable-call-for-migration/17815/2
We might get around this by referencing a static symlink to the actual binary in ExecStart, but then the question remains how to change this symlink after a package update. This cannot be done in an ExecStartPre (because the unit file changes when the store path changes), but maybe in a separate oneshot service that runs before?
This can be done by referencing /run/current-system, which is already being replaced by the activation script.
Thanks for the idea with /run/current-system, that's a possibility indeed. However, this requires nginx to be in environment.systemPackages, while it is not really the idea of NixOS that you need to put service binaries into the global PATH. "Fortunately" nginx binary reloading requires stable paths to the executables anyway, so we're already creating symlinks to the binaries somewhere.
Judging based on this recommendation, the approach for tackling this restarts vs. reloads issue appears to be trying to avoid all changes to the unit file in reload cases, e.g. by introducing stable points of indirection (loosely coupled systemd units, stable symlink entry points).
Although this requires nginx to be in environment.systemPackages
It wouldn't, we don't need it to be in any global path. While the global path also behaves like that, what we actually need is any path that is only changed when the executable changes. We could just let systemd-tmpfiles symlink it to /run/nginx/executable, for instance.
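A minimal sketch of that tmpfiles idea, untested and under stated assumptions: the path /run/nginx/executable is the one suggested above, the ExecStart arguments are a placeholder rather than what the upstream module passes, and it is an assumption that the tmpfiles rules get re-applied before nginx is reloaded during a switch. Note also that if nginx.service manages /run/nginx as its RuntimeDirectory, the symlink would disappear whenever the service stops, so a different directory may be needed.

```nix
{ config, lib, ... }:
{
  # "L+" creates the symlink and replaces an existing one, so the link
  # follows the package while the path seen by systemd stays the same.
  systemd.tmpfiles.rules = [
    "L+ /run/nginx/executable - - - - ${config.services.nginx.package}/bin/nginx"
  ];

  # Hypothetical override: ExecStart no longer contains a store path, so a
  # package update alone does not change this line of the unit file.
  # The "-c" argument is an assumption for illustration.
  systemd.services.nginx.serviceConfig.ExecStart =
    lib.mkForce "/run/nginx/executable -c /etc/nginx/nginx.conf";
}
```

Other parts of the unit (for example ExecReload or ExecStartPre) may still reference the store path and would need the same indirection for the unit file to stay unchanged.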
If I manage to make nginx cleanly reload for package updates and config (as well as certificate) changes, but restart for all other cases, is there any reason to keep enableReload configurable? So are there any plausible cases where nginx is supposed to restart in these cases as well?
Wrt the dependency on the acme services, perhaps you could factor out the dependencies on the acme-finished.target units so that they are on the reload service (nginx-config-reload.service) rather than the main service? I would be curious if removing this line causes the acme test suite to fail or not. If it doesn't, that might be an easy win, but I probably had my reasons for adding it ;)
I just thought of a nasty edge case with the ACME certs which makes it necessary for nginx to restart when new SSL vhosts are added. Here's the scenario:
The reload fails because the self-signed certs are not guaranteed to exist when nginx tries to read them after a config reload. Due to systemd + config switch script limitations, we have no way of delaying the reload until self-signed certificate generation completes. We can guarantee this in the case of a restart, as we do today, using systemd unit dependencies + ordering.
Note however, this may not be as bad as it sounds for nginx. If a wildcard certificate is used, then when adding a new vhost with useACMEHost specified appropriately, the unit dependencies don't change (assuming one already existed configured as such), and thus a reload is safe.
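To illustrate the wildcard case, here is a hedged configuration fragment. The domain names and the DNS provider value are hypothetical placeholders, and the fragment leaves out the remaining ACME settings (account email, terms acceptance, credentials) that a real setup needs.

```nix
{
  # Hypothetical domain, for illustration only.
  security.acme.certs."example.org" = {
    domain = "*.example.org";
    # With Let's Encrypt, wildcard certificates require the DNS-01
    # challenge; the provider name below is a placeholder.
    dnsProvider = "exampleprovider";
  };

  services.nginx.virtualHosts = {
    # Existing vhost that already uses the shared certificate.
    "a.example.org" = {
      forceSSL = true;
      useACMEHost = "example.org";
    };
    # Newly added vhost: it points at the same certificate, so no new
    # acme-*.service unit is introduced for it.
    "b.example.org" = {
      forceSSL = true;
      useACMEHost = "example.org";
    };
  };
}
```

Because the second vhost reuses a certificate that nginx already depends on, only the nginx config changes, which is the situation described above as safe for a reload.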
nginx is often used at the front of a web application stack as the single component that handles incoming requests and passes them on for further processing. Hence, restarts of nginx can interrupt the availability of the whole web application and should be avoided wherever acceptable. Instead, reloading can be used for most config changes and even for package upgrades.
The reloadIfChanged option is deprecated and frowned upon nowadays; instead, reloadTriggers are recommended. In his Discourse post on reloadTriggers, @dasJ mentions as a motivating factor:
Unfortunately, some of the changes that can be applied to nginx with a reload also cause unit changes. So the main challenge is: for some unit changes we want the service to be restarted, for some we want a reload instead.
cases for reloading nginx (that cause restarts as of now)
config changes: When the nginx config changes, nginx supports reloading its config without interruption. This is already exposed as services.nginx.enableReload, but still causes restarts in at least one common case: adding new vhosts with acme certificates still causes nginx.service to restart. Why? Adding or removing an acme certificate adds or removes the corresponding acme-*.service units from the dependencies of the nginx.service unit. => Unit file changes => service is restarted during switch-to-configuration.
package upgrades: Nginx is able to replace its binary during runtime by replacing the master process and switching over its workers. We have implemented this for our nginx module fork, but as the implementation relies on reloadIfChanged we cannot immediately upstream this.
The challenge here, again, is that a changed ExecStart path causes the unit file to change and, consequently, the service to be restarted. We might get around this by referencing a static symlink to the actual binary in ExecStart, but then the question remains how to change this symlink after a package update. This cannot be done in an ExecStartPre (because the unit file changes when the store path changes), but maybe in a separate oneshot service that runs before?
cases where nginx restarts are still necessary
Unfortunately, the need to restart the service is again derived from a changed unit file.
implementation ideas
reloadTriggers in a way that prevents the service from restarting when properties apart from X-ReloadIfChanged have changed.
nginx.service altogether by setting systemd.services.nginx.restartIfChanged = false and trigger service restarts and reloads from an independent service that is able to do a more nuanced analysis of unit file changes. This is slightly inspired by nginx-config-reload-service; possibly we can even fold this functionality into that existing service?
If there's an obvious and proper way of doing this, I'm glad to learn about it. If not, the crude workarounds for building this might signal a need for such a mechanism in the NixOS service management logic. Or if the workaround turns out to be rather harmless, we might decide that this is a rare requirement for a service and accept case-by-case workarounds.
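The second idea above (restartIfChanged = false plus an independent service) could be sketched roughly as follows. This is an untested illustration, not an existing nixpkgs implementation: the unit name nginx-change-handler is hypothetical, the choice of trigger is an assumption, and the "nuanced analysis" of unit file changes is reduced to an unconditional reload.

```nix
{ config, pkgs, ... }:
{
  # Stop switch-to-configuration from restarting nginx on unit file changes.
  systemd.services.nginx.restartIfChanged = false;

  # Hypothetical helper unit: it is restarted whenever one of its triggers
  # changes and then decides what to do with nginx.
  systemd.services.nginx-change-handler = {
    wantedBy = [ "multi-user.target" ];
    after = [ "nginx.service" ];
    # Assumed trigger: the nginx package. A real implementation would
    # list everything that should lead to a reload or restart.
    restartTriggers = [ config.services.nginx.package ];
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
      # Skip the reload when nginx is not running at all.
      ExecCondition = "${pkgs.systemd}/bin/systemctl -q is-active nginx.service";
      # Placeholder for the analysis step: always reload.
      ExecStart = "${pkgs.systemd}/bin/systemctl reload nginx.service";
    };
  };
}
```

A real handler would have to distinguish changes that are safe to reload from those that still need a restart, which is exactly the open question of this issue.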
I'll be at NixCon 2024 (Berlin) and am happy to discuss this further. :)
Notify maintainers
@dasJ for the reloadTriggers
nginx maintainers: @fpletz @ajs124 @RaitoBezarius
acme team: @aanderse @arianvp @emilazy @floki @andrew-d @m1cr0man (because of the acme service dependencies)
Add a :+1: reaction to issues you find important.