This is related to the use of the runit cookbook as part of a Chef Server installation (embedded cookbooks in /opt/opscode/). I am trying to upgrade my Chef server, and during migration to 1.28 the upgrade process fails because private-chef-upgrade is unable to stop the nginx service:
[private-chef-upgrade] - Current Migration Version: 1.27
[private-chef-upgrade] - Starting Migration 1.28
[private-chef-upgrade] - Stopping Services ["nginx", "opscode-erchef"]
ok: down: nginx: 24s, normally up
down: nginx: 25s, normally up; run: log: (pid 17822) 28940s
[private-chef-upgrade] - Error: service nginx failed to stop, killing gracefully...
could not find nginx runit pidfile (service already stopped?), cannot attempt SIGKILL...
ok: down: nginx: 27s, normally up
down: nginx: 28s, normally up; run: log: (pid 17822) 28943s
[private-chef-upgrade] - Failure: service nginx could not be stopped or killed.
The reason is that service_manager checks the exit code of /opt/opscode/init/nginx status, and that script always exits with code 0, even if the service is stopped. Note that /opt/opscode/init/nginx is created by this cookbook, which is why I am filing this issue here.
The root cause seems to be that the init file calls the sv binary directly. However, for a stopped service, sv status only exits with a non-zero code when sv is called with a base name other than sv (more details here, and it's also documented in sv(8)).
Some possible ways to fix this are:
hard-linking the sv binary to a different name, which would then be used in the init scripts;
parsing the sv status output and using it to determine the exit code.
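As a rough illustration of the second option, here is a minimal Python sketch of the parsing logic. It is not the cookbook's code: the function name is hypothetical, and the exit codes (0 for running, 3 for stopped, 4 for unknown) are an assumption that follows the usual LSB convention for a status action. The input format is taken from the status line in the log above; a real init script would need the same logic in shell.

```python
# Hypothetical helper: derive an exit code from `sv status <service>` output.
# The exit codes follow the LSB convention for "status" (an assumption here,
# not something the cookbook defines).
LSB_RUNNING = 0
LSB_NOT_RUNNING = 3
LSB_UNKNOWN = 4


def exit_code_from_sv_status(output):
    """Return an exit code for the main service described by `output`."""
    stripped = output.strip()
    line = stripped.splitlines()[0] if stripped else ""
    # The part before ';' describes the service itself; the part after it
    # describes the attached log service, which is ignored here.
    main_part = line.split(";", 1)[0]
    # The state is the first colon-separated field, e.g. "run" or "down".
    state = main_part.split(":", 1)[0].strip()
    if state == "run":
        return LSB_RUNNING
    if state == "down":
        return LSB_NOT_RUNNING
    return LSB_UNKNOWN


if __name__ == "__main__":
    sample = "down: nginx: 25s, normally up; run: log: (pid 17822) 28940s"
    print(exit_code_from_sv_status(sample))
```

With a check like this, the stopped nginx service from the log would yield a non-zero exit code even though its log service is still running.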
I am happy to send a PR if any of the maintainers would advise on the preferred solution.
I believe we fixed this by not just linking but creating an init file on Debian. Give it a try on the latest release and feel free to open this back up if it's still an issue.
Cookbook version
1.6.0
Chef-client version
12.12.19
Platform Details
Debian 8.5
Steps to Reproduce:
Expected Result:
When the service is stopped, status should return a non-zero exit code.
Actual Result:
status returns a zero exit code irrespective of service status.