Closed: Roguelazer closed this 13 years ago
Probably need to break this out and have a larger "how should services work" wiki page.
What does lock_host mean?
I would vote for somehow combining the 'respawn' options into just one. Perhaps by default it always respawns immediately. You can then add respawn_limit to limit the number of respawns before giving up, or respawn_backoff to control how many times to respawn before slowing down how fast we try to respawn.
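A minimal sketch of how those two options could combine into a single respawn policy. The names respawn_limit and respawn_backoff come from the comment above; base_delay, max_delay, and the doubling schedule are assumptions made for illustration and are not an existing tron API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RespawnPolicy:
    # Option names follow the comment above; this is not an existing tron class.
    respawn_limit: Optional[int] = None    # None: never give up
    respawn_backoff: Optional[int] = None  # immediate retries before slowing down
    base_delay: float = 1.0                # assumed first slowed-down delay (seconds)
    max_delay: float = 60.0                # assumed upper bound on the delay

    def next_delay(self, failures: int) -> Optional[float]:
        """Return seconds to wait before the next respawn, or None to give up.

        `failures` is the number of consecutive failed respawns so far.
        """
        if self.respawn_limit is not None and failures >= self.respawn_limit:
            return None
        if self.respawn_backoff is None or failures < self.respawn_backoff:
            return 0.0  # default behavior: respawn immediately
        # Assumed schedule: double the delay for each failure past the threshold.
        exponent = failures - self.respawn_backoff
        return min(self.base_delay * (2 ** exponent), self.max_delay)
```

With no options set, next_delay always returns 0.0, which matches the proposed default of always respawning immediately.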
lock_host is a boolean indicating whether all copies should be on the same host or whether tron should be allowed to spread them out over multiple hosts.
The respawn changes seem reasonable.
For splitting instances up over nodes... your options would be:
1. count instances split evenly
2. count on each node
3. count on whichever node was chosen first?

If the default behavior was #1 (split evenly), then the other 2 could be configured by adjusting other parameters with no other loss, right? You could just change the value for count and/or configure it on fewer nodes.
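The three placement options can be sketched as small functions, which also makes the argument above easy to check. The function names and the node-list representation are hypothetical, and giving any leftover instances to the first nodes is an assumption.

```python
from typing import Dict, List


def split_evenly(count: int, nodes: List[str]) -> Dict[str, int]:
    """Option 1: spread `count` instances as evenly as possible over `nodes`."""
    if not nodes:
        raise ValueError("at least one node is required")
    base, extra = divmod(count, len(nodes))
    # The first `extra` nodes take one additional instance each.
    return {node: base + (1 if i < extra else 0) for i, node in enumerate(nodes)}


def count_on_each(count: int, nodes: List[str]) -> Dict[str, int]:
    """Option 2: run `count` instances on every node."""
    return {node: count for node in nodes}


def count_on_first(count: int, nodes: List[str]) -> Dict[str, int]:
    """Option 3: run all `count` instances on whichever node was chosen first."""
    return {node: (count if i == 0 else 0) for i, node in enumerate(nodes)}


nodes = ["node1", "node2"]
# Option 2 is option 1 with a larger count.
assert count_on_each(5, nodes) == split_evenly(5 * len(nodes), nodes)
# Option 3 is option 1 configured on a single node.
assert split_evenly(5, nodes[:1]) == {"node1": 5}
```

The two assertions show that options 2 and 3 are recoverable from option 1 by changing count or the node list.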
Closing this for now. Services support mostly works.
It would be most excellent if tron had better support for managing services. Rather than just having to start them as non-daemonizing jobs that last forever, or writing custom client-side code to handle checking daemon status and running that regularly, tron should be able to handle services directly. Here's the configuration format I had in mind:
Translation of this: a service named worker_daemon should be instantiated 5 times, not necessarily on the same host. The PID will be put in /var/run/worker_daemon_000001.pid, /var/run/worker_daemon_000002.pid, et cetera. The command will actually be run as

Every minute, tron will connect to the relevant host and check for the running process (by running kill -0 on the contents of /var/run/worker_daemon_000001.pid, checking /proc/$PID/cmdline, or somesuch). If it isn't running, tron will attempt to respawn it. After failing to respawn it 3 times in a row, tron will mark it as disabled (and possibly send some sort of notification).
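A rough sketch of the monitoring pass described above, assuming the check runs on the host where the service lives (tron itself would run it on the remote host). It uses signal 0, which only tests whether the process exists and delivers nothing. The helper names, the InstanceMonitor class, and the respawn callable are hypothetical and not part of tron.

```python
import os
from typing import Callable


def pid_file_path(service: str, instance: int, run_dir: str = "/var/run") -> str:
    """Build a path such as /var/run/worker_daemon_000001.pid."""
    return os.path.join(run_dir, f"{service}_{instance:06d}.pid")


def is_running(pid_file: str) -> bool:
    """Return True if the PID stored in `pid_file` refers to a live process."""
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False  # missing, unreadable, or malformed PID file
    if pid <= 0:
        return False
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is delivered
    except (ProcessLookupError, OverflowError):
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


class InstanceMonitor:
    """Tracks consecutive respawn failures for one service instance."""

    def __init__(self, pid_file: str, respawn: Callable[[], bool],
                 max_failures: int = 3) -> None:
        self.pid_file = pid_file
        self.respawn = respawn            # hypothetical: returns True on success
        self.max_failures = max_failures  # 3 in the example above
        self.failures = 0
        self.disabled = False

    def check(self) -> None:
        """Run one monitoring pass (the proposal is one pass per minute)."""
        if self.disabled:
            return
        if is_running(self.pid_file) or self.respawn():
            self.failures = 0
            return
        self.failures += 1
        if self.failures >= self.max_failures:
            self.disabled = True  # a notification could be sent here
```

A scheduler would call check() once a minute for each instance. A disabled instance stays disabled until something outside this sketch re-enables it.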