Closed hongkongkiwi closed 3 years ago
Those are really s6 questions, since they aren't about service dependencies and ordering, and only involve longruns. They are all addressed at the s6 level, but if you need to add a file to an s6 service directory, the same file will work in an s6-rc source definition directory for the longrun, so the answers work for s6-rc as well.
1. Add a `finish` script that sleeps for 5 seconds. If you need more than 5 seconds, you will also need to extend the authorized running time for `finish`, via a `timeout-finish` file. The `finish` script and `timeout-finish` file, as well as other configuration knobs that you may want to use, are documented here.
2. Use a `finish` script. `finish` runs with two arguments, one of which is the exit code of the program. You can script the behaviour you want there. If you want s6 to fail the service permanently and stop supervising it, simply have your `finish` script exit 125.
3. Use the `finish` script again.

Thanks for the suggestions! I will explore these. That helps a lot.
I didn't realise that finish could be used in this way, and didn't see s6-permafailon before.
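
A minimal sketch of how the `finish` suggestions for the restart delay and the unrecoverable-error case might look in POSIX shell. `RESTART_DELAY`, `FATAL_FROM` and `finish_main` are names made up for this example. The threshold of 100 comes from the question, and treating 256 as "killed by a signal" is an assumption based on the s6-supervise documentation, so check it against your s6 version. The logic is wrapped in a function so it can be tried outside of s6.

```shell
#!/bin/sh
# Sketch of an s6 ./finish script: delay restarts, and give up on exit
# codes that signal an unrecoverable error.
# RESTART_DELAY, FATAL_FROM and finish_main are hypothetical names.

RESTART_DELAY=${RESTART_DELAY:-4}   # seconds; kept under the 5 s limit on finish
FATAL_FROM=${FATAL_FROM:-100}       # exit codes >= this count as unrecoverable

# $1 is the exit code of the supervised program, as passed to ./finish.
finish_main() {
  code=$1
  # Assumption: 256 means the program was killed by a signal.
  # This sketch treats that case as retryable.
  if [ "$code" -ne 256 ] && [ "$code" -ge "$FATAL_FROM" ]; then
    return 125   # exiting 125 asks s6 to fail the service permanently
  fi
  # Recoverable failure: wait before s6 restarts the service.
  sleep "$RESTART_DELAY"
  return 0
}

# A real ./finish would end with:
#   finish_main "$@"
#   exit $?
```

The default delay here is 4 rather than 5 seconds so the script stays under the 5-second limit mentioned above. For a longer delay, the service directory would also need a `timeout-finish` file; as far as I know it holds the limit in milliseconds (for example `10000`).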
I have a couple of questions regarding services that I couldn't figure out.
1. Is there a way to make s6-rc wait a specific time between service restarts? I'm using an embedded processor, and right now if a service fails it tries to restart immediately. I would like to set a delay appropriate to the service before it retries. This way, if lots of services fail at once for some reason, they don't hammer the system. For example, I have an internet connection monitoring daemon; if it fails, I'm OK with waiting 5s between retries.
2. Is there a way to differentiate between an error that is unrecoverable and one that is recoverable? Taking the example above, I have an internet connection daemon which might fail because a config file does not exist. I'd prefer to have it restart only depending on whether the exit code is below or above a certain number, e.g. if the exit code is < 100 then keep supervising and restart; if it's above 100 then don't supervise.
3. Is there some kind of failure count mechanism? There are some services I would like to retry 10 times and then give up on. Once the service is running, I would like to supervise it. So this is a kind of semi-supervision state.
I'm sure there are some clever ways to address my issues above. Any ideas on how I could handle these cases?
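
For the failure-count case, the thread points to `s6-permafailon` and to the `finish` script. The following is a rough, hand-rolled sketch of the second option, not a replacement for `s6-permafailon`. It counts consecutive failures in a file and returns 125 once a limit is reached. `COUNT_FILE`, `MAX_FAILURES` and the function names are invented for this example, and it assumes the service directory is writable and that a clean exit (code 0) is a reasonable signal to reset the count.

```shell
#!/bin/sh
# Sketch: give up after MAX_FAILURES consecutive failures, counted in a file.
# COUNT_FILE and MAX_FAILURES are hypothetical names, not s6 settings.

COUNT_FILE=${COUNT_FILE:-./failure-count}
MAX_FAILURES=${MAX_FAILURES:-10}

# Record one more failure; return 125 once the limit is reached, else 0.
record_failure() {
  count=0
  if [ -f "$COUNT_FILE" ]; then
    count=$(cat "$COUNT_FILE")
  fi
  count=$((count + 1))
  echo "$count" > "$COUNT_FILE"
  if [ "$count" -ge "$MAX_FAILURES" ]; then
    rm -f "$COUNT_FILE"   # start from zero if the service is brought up again
    return 125            # exiting 125 asks s6 to fail the service permanently
  fi
  return 0
}

# Forget past failures, e.g. after a clean exit.
reset_failures() {
  rm -f "$COUNT_FILE"
}

# $1 is the exit code of the supervised program, as passed to ./finish.
finish_main() {
  if [ "$1" -eq 0 ]; then
    reset_failures
    return 0
  fi
  record_failure
}

# A real ./finish would end with:
#   finish_main "$@"
#   exit $?
```

This only resets the count on a clean exit; resetting it once the service has been up for a while would need extra logic, for instance in the `run` script.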