chef-cookbooks / runit

Development repository for the Chef Runit Cookbook
https://supermarket.chef.io/cookbooks/runit
Apache License 2.0

init script: 'status' always returns 0 #207

Closed (knyar closed 6 years ago)

knyar commented 7 years ago

Cookbook version

1.6.0

Chef-client version

12.12.19

Platform Details

Debian 8.5

Scenario:

This is related to the use of the runit cookbook as part of a Chef Server installation (embedded cookbooks in /opt/opscode/). I am trying to upgrade my Chef Server, and during the migration to 1.28 the upgrade process fails because private-chef-upgrade is unable to stop the nginx service:

[private-chef-upgrade] - Current Migration Version: 1.27
[private-chef-upgrade] - Starting Migration 1.28
[private-chef-upgrade] -        Stopping Services ["nginx", "opscode-erchef"]
ok: down: nginx: 24s, normally up
down: nginx: 25s, normally up; run: log: (pid 17822) 28940s
[private-chef-upgrade] - Error: service nginx failed to stop, killing gracefully...
could not find nginx runit pidfile (service already stopped?), cannot attempt SIGKILL...
ok: down: nginx: 27s, normally up
down: nginx: 28s, normally up; run: log: (pid 17822) 28943s
[private-chef-upgrade] - Failure: service nginx could not be stopped or killed.

The reason is that service_manager checks the exit code of /opt/opscode/init/nginx status, and that script always exits with code 0, even when the service is stopped. Note that /opt/opscode/init/nginx is created by this cookbook, which is why I am filing this issue here.

The root cause seems to be that the init file calls the sv binary directly. However, the sv binary only exits with a non-zero code when it is called with a base name other than sv (more details here, and it's also documented in sv(8)).
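To illustrate one possible workaround (not necessarily the approach taken in the cookbook), an init script could derive its exit code from the text that status prints instead of relying on the exit code of sv. The following is a minimal sketch. The helper name status_to_exit_code is hypothetical, it assumes the status line starts with "run:" when the service is up (as in the output quoted in this issue), and it uses 3 for "not running" following the LSB init script convention.

```shell
#!/bin/sh
# Hypothetical helper (not part of the cookbook): map a status line to an
# LSB-style exit code. 0 = running, 3 = not running.
# Assumption: a running service is reported with a leading "run:".
status_to_exit_code() {
    case "$1" in
        run:*) return 0 ;;
        *)     return 3 ;;
    esac
}

# Try it on the two status lines quoted in this issue.
for line in \
    "run: nginx: (pid 12945) 262s; run: log: (pid 17822) 28747s" \
    "down: nginx: 1s, normally up; run: log: (pid 17822) 28753s"
do
    rc=0
    status_to_exit_code "$line" || rc=$?
    echo "exit code $rc for: $line"
done
```

Any state other than "run:" is treated as not running here, which is a simplification.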

Some of the possible ways this could be fixed are:

I am happy to send a PR if any of the maintainers would advise on the preferred solution.

Steps to Reproduce:

# SVDIR=/opt/opscode/service /opt/opscode/init/nginx status; echo $?
run: nginx: (pid 12945) 262s; run: log: (pid 17822) 28747s
# SVDIR=/opt/opscode/service /opt/opscode/init/nginx stop
ok: down: nginx: 0s, normally up
# SVDIR=/opt/opscode/service /opt/opscode/init/nginx status; echo $?
down: nginx: 1s, normally up; run: log: (pid 17822) 28753s
0

Expected Result:

When the service is stopped, status should return a non-zero exit code.

Actual Result:

status returns a zero exit code irrespective of the service status.
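The mismatch between the printed state and the exit code can also be checked mechanically. The sketch below is an assumption-laden illustration: check_status_consistency is a hypothetical function name, and it relies only on the "run:" and "down:" prefixes seen in the output above.

```shell
#!/bin/sh
# Hypothetical check: run "<command> status" and report whether the exit code
# agrees with the printed state. Returns 0 when they agree, 1 otherwise.
check_status_consistency() {
    rc=0
    out=$("$@" status) || rc=$?
    case "$out" in
        run:*)  [ "$rc" -eq 0 ] ;;   # up: expect a zero exit code
        down:*) [ "$rc" -ne 0 ] ;;   # down: expect a non-zero exit code
        *)      return 1 ;;          # unrecognised output
    esac
}

# Possible use against the init script from this issue (needs runit and
# SVDIR set as in the steps above), left commented out:
# check_status_consistency /opt/opscode/init/nginx || echo "status exit code is inconsistent"
```

With the behaviour reported here, the check would fail only while the service is stopped, since that is when the output says "down:" but the exit code is 0.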

tas50 commented 6 years ago

I believe we fixed this by not just linking but creating an init file on Debian. Give it a try on the latest release and feel free to open this back up if it's still an issue.