Open AntonMalyshev opened 9 years ago
trigger :flapping, :times => 10, :within => 1.minute, :retry_in => 1.hour
```ruby
Eye.application "sample/applicationname" do
  stop_on_delete true
  group 'groupname' do
    process :process1 do
      self.daemonize true
      self.pid_file "/tmp/process1.pid"
      self.start_command "process1"
      trigger :flapping, :times => 10, :within => 1.minute, :retry_in => 1.hour
    end
    process :process2 do
      self.daemonize true
      self.pid_file "/tmp/process2.pid"
      self.start_command "process2"
      depend_on :process1
      trigger :flapping, :times => 10, :within => 1.minute, :retry_in => 1.hour
    end
  end
end
```
process1 crashes constantly, and after 10 crashes flapping is triggered. `eye info` returns:
```
sample/applicationname
  groupname
    process1 ...................... unmonitored (flapping at 07 Sep 16:37)
    process2 ...................... unmonitored (monitor by user at 07 Sep 16:37)
```
But about a minute later the dependency of process2 is triggered, and eye tries to start process1 again:
```
sample/applicationname
  groupname
    process1 ...................... starting
    process2 ...................... starting
```
Log:
```
07.09.2015 16:37:54 ERROR -- [sample/applicationname:groupname:process1] NOTIFY: flapping!
07.09.2015 16:37:54 INFO -- [sample/applicationname:groupname:process1] schedule :unmonitor (reason: flapping)
07.09.2015 16:37:54 INFO -- [sample/applicationname:groupname:process1] schedule :check_crash (reason: crashed)
07.09.2015 16:37:54 INFO -- [sample/applicationname:groupname:process1] <= restore
07.09.2015 16:37:54 INFO -- [sample/applicationname:groupname:process1] => check_crash (reason: crashed)
07.09.2015 16:37:54 WARN -- [sample/applicationname:groupname:process1] check crashed: process is down
07.09.2015 16:37:54 INFO -- [sample/applicationname:groupname:process1] schedule :restore (reason: crashed)
07.09.2015 16:37:54 INFO -- [sample/applicationname:groupname:process1] <= check_crash
07.09.2015 16:37:54 INFO -- [sample/applicationname:groupname:process1] => unmonitor (reason: flapping)
07.09.2015 16:37:54 INFO -- [sample/applicationname:groupname:process1] switch :unmonitoring [:down => :unmonitored] (reason: flapping)
07.09.2015 16:37:54 INFO -- [sample/applicationname:groupname:process1] <= unmonitor
07.09.2015 16:37:54 INFO -- [sample/applicationname:groupname:process1] => check_crash (reason: crashed)
07.09.2015 16:37:54 INFO -- [sample/applicationname:groupname:process1] <= check_crash
07.09.2015 16:37:54 INFO -- [sample/applicationname:groupname:process1] => restore (reason: crashed)
07.09.2015 16:37:54 INFO -- [sample/applicationname:groupname:process1] <= restore
07.09.2015 16:38:39 INFO -- [sample/applicationname:groupname:process2] schedule :start (reason: wait_dependency)
07.09.2015 16:38:39 INFO -- [sample/applicationname:groupname:process2] => start (reason: wait_dependency)
07.09.2015 16:38:39 INFO -- [sample/applicationname:groupname:process2] pid_file not found, starting...
07.09.2015 16:38:39 INFO -- [sample/applicationname:groupname:process2] switch :starting [:unmonitored => :starting] (reason: wait_dependency)
07.09.2015 16:38:39 INFO -- [sample/applicationname:groupname:process1] schedule :start (reason: start_dependency)
07.09.2015 16:38:39 INFO -- [sample/applicationname:groupname:process1] => start (reason: start_dependency)
07.09.2015 16:38:39 INFO -- [sample/applicationname:groupname:process1] pid_file not found, starting...
07.09.2015 16:38:39 INFO -- [sample/applicationname:groupname:process1] switch :starting [:unmonitored => :starting] (reason: start_dependency)
```
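To make the sequence easier to follow, here is a small sketch that pulls only the `schedule` lines out of a log in the format shown above. It is a standalone helper, not part of eye; the `LOG_LINE` pattern and the `scheduled_commands` name are my own, and it assumes Ruby 2.7+ for `filter_map`.

```ruby
# Hypothetical helper, not part of eye: extracts "schedule :cmd (reason: ...)"
# entries from a log in the format shown in this report.
LOG_LINE = /\A(?<time>\d{2}\.\d{2}\.\d{4} \d{2}:\d{2}:\d{2}) (?<level>\w+) -- \[(?<process>[^\]]+)\] (?<message>.*)\z/

def scheduled_commands(log_text)
  log_text.each_line.filter_map do |line|
    m = LOG_LINE.match(line.strip)
    next unless m # skip lines that are not in the expected format

    cmd = m[:message].match(/\Aschedule :(?<command>\w+) \(reason: (?<reason>\w+)\)/)
    next unless cmd # skip everything except schedule entries

    {
      time: m[:time],
      process: m[:process].split(':').last, # "app:group:process1" -> "process1"
      command: cmd[:command].to_sym,
      reason: cmd[:reason]
    }
  end
end

# Three lines copied from the log above.
sample = <<~LOG
  07.09.2015 16:37:54 INFO -- [sample/applicationname:groupname:process1] schedule :unmonitor (reason: flapping)
  07.09.2015 16:38:39 INFO -- [sample/applicationname:groupname:process2] schedule :start (reason: wait_dependency)
  07.09.2015 16:38:39 INFO -- [sample/applicationname:groupname:process1] schedule :start (reason: start_dependency)
LOG

events = scheduled_commands(sample)
events.each { |e| puts "#{e[:time]} #{e[:process]} #{e[:command]} (#{e[:reason]})" }
```

Filtered this way, the log shows process1 being unmonitored for flapping at 16:37:54 and then scheduled to start again at 16:38:39 with reason `start_dependency`.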
Hello! Could you advise how to set up flapping for a process so that its startup is delayed for 1 hour if it has crashed 10+ times within 1 minute? Also, it seems that the flapping settings are not applied when the process is started as a dependency of another process. How can we fix this?
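To make the behaviour I expect concrete, here is a minimal plain-Ruby model of it. This is only an illustration of the semantics I have in mind for `:times`, `:within` and `:retry_in`, not eye's actual implementation; the `FlappingGuard` class and its methods are hypothetical, and times are plain seconds.

```ruby
# Illustrative model only; NOT eye's implementation. All names are hypothetical.
class FlappingGuard
  def initialize(times:, within:, retry_in:)
    @times = times          # crashes needed to trip the guard
    @within = within        # sliding window, in seconds
    @retry_in = retry_in    # back-off after tripping, in seconds
    @crashes = []
    @blocked_until = nil
  end

  # Record a crash at `now` (seconds). Returns true if this crash trips flapping.
  def crash!(now)
    @crashes << now
    @crashes.reject! { |t| t <= now - @within } # keep only crashes inside the window
    return false if @crashes.size < @times

    @blocked_until = now + @retry_in
    @crashes.clear
    true
  end

  # The expectation: every start path, including a dependency start,
  # would consult this before starting the process.
  def start_allowed?(now)
    @blocked_until.nil? || now >= @blocked_until
  end
end

guard = FlappingGuard.new(times: 10, within: 60, retry_in: 3600)
# Ten crashes, five seconds apart (t = 0, 5, ..., 45).
tripped = (0...10).map { |i| guard.crash!(i * 5) }.last

puts tripped                     # flapping tripped on the tenth crash
puts guard.start_allowed?(90)    # 45 seconds later, as in the log above
puts guard.start_allowed?(3645)  # one hour after the tenth crash
```

In this model a start requested 45 seconds after the flapping event would be refused, whereas in the log above process1 is started again with reason `start_dependency`.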