Open Saurabh2004in opened 8 years ago
I'm seeing this as well.
I get this issue too, and none of the Chronos instances can restart.
The reason is that Chronos allows a job with a run interval of 0 to be created, e.g.
"schedule":"R0/2015-08-28T14:04:54.000+0800/PT0M"
The exception is then triggered whenever the jobs are reloaded from ZooKeeper, for example on restart.
I deleted the jobs with this kind of config and Chronos restarted successfully.
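To illustrate the point above, here is a minimal sketch of a check that would reject a zero-interval schedule before it is stored. It is not Chronos code: `ScheduleCheck`, `intervalOf` and `hasPositiveInterval` are hypothetical names, and it uses `java.time.Duration` instead of the Joda-Time types Chronos uses, so it only understands day- and time-based intervals such as `PT0M` or `PT60S` (a month-based interval like `P1M` would be rejected here and would need different handling).

```scala
import java.time.Duration
import scala.util.Try

// Hypothetical helper, not part of Chronos.
object ScheduleCheck {
  // Returns the interval part of an "R<n>/<start>/<interval>" schedule, if present.
  def intervalOf(schedule: String): Option[String] =
    schedule.split("/") match {
      case Array(_, _, interval) if interval.nonEmpty => Some(interval)
      case _ => None
    }

  // True only when the interval parses as a Duration and is strictly positive.
  def hasPositiveInterval(schedule: String): Boolean =
    intervalOf(schedule)
      .flatMap(i => Try(Duration.parse(i)).toOption)
      .exists(d => !d.isZero && !d.isNegative)
}
```

With this check, the schedule quoted above (`R0/2015-08-28T14:04:54.000+0800/PT0M`) would be refused at creation time instead of failing later on reload.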
@gongaiguo how did you delete these jobs without Chronos running?
I changed the else branch to return zero when the interval is zero, since there is no need to skip time in that case.
The same patch is applied in #692
@xtazz I deleted them from zookeeper.
Hi,
I ran into this problem while doing HA tests. When Chronos restarts, it reloads the jobs stored in ZooKeeper (the job was { "schedule": "R//P", "name": "create-volume-flocker-demo", "command"...}) and fails.
I applied the fix proposed at https://github.com/mesos/chronos/pull/692 and now Chronos loops infinitely:
[2016-07-20 09:21:49,968] INFO Calling next for stream: R/2016-07-18T09:38:44.236Z/PT0S, jobname: create-volume-flocker-demo (org.apache.mesos.chronos.scheduler.jobs.JobScheduler:509)
[2016-07-20 09:21:49,968] INFO JobNotificationObserver does not handle JobSkipped(ScheduleBasedJob(R/2016-07-18T09:38:44.236Z/PT0S,create-volume-flocker-demo,docker volume create -d flocker --name apache_vol_2_staging -o size=45GB,PT60S,0,0,,,,2,,,,,,false,0.1,256.0,128.0,false,0,ListBuffer(),ListBuffer(),false,root,null,,ListBuffer(),true,ListBuffer(),false,false,ListBuffer()),2016-07-18T09:38:44.236Z) (org.apache.mesos.chronos.scheduler.jobs.JobsObserver$:27)
[2016-07-20 09:21:49,968] INFO JobStats does not handle JobSkipped(ScheduleBasedJob(R/2016-07-18T09:38:44.236Z/PT0S,create-volume-flocker-demo,docker volume create -d flocker --name apache_vol_2_staging -o size=45GB,PT60S,0,0,,,,2,,,,,,false,0.1,256.0,128.0,false,0,ListBuffer(),ListBuffer(),false,root,null,,ListBuffer(),true,ListBuffer(),false,false,ListBuffer()),2016-07-18T09:38:44.236Z) (org.apache.mesos.chronos.scheduler.jobs.JobsObserver$:27)
[2016-07-20 09:21:49,968] INFO tail: R/2016-07-18T09:38:44.236Z/PT0S now: 2016-07-20T09:21:48.145Z (org.apache.mesos.chronos.scheduler.jobs.JobScheduler:563)
and then it starts over for the same job:
[2016-07-20 09:21:49,968] INFO Calling next for stream: R/2016-07-18T09:38:44.236Z/PT0S, jobname: create-volume-flocker-demo (org.apache.mesos.chronos.scheduler.jobs.JobScheduler:509)
Do I need another fix? Does the fix proposed at https://github.com/mesos/chronos/pull/692 prevent corrupted data from being stored in ZooKeeper? Is my data in ZooKeeper corrupted, and should I erase it?
Hi,
I am getting the exception below and am curious to know what is causing it.
[2016-01-07 13:48:02,153] INFO Loading jobs (org.apache.mesos.chronos.scheduler.jobs.JobScheduler:601)
[2016-01-07 13:48:02,240] INFO Registering jobs:55 (org.apache.mesos.chronos.scheduler.jobs.JobUtils$:74)
[2016-01-07 13:48:02,259] ERROR Loading tasks or jobs failed. Exiting. (org.apache.mesos.chronos.scheduler.jobs.JobScheduler:605)
java.lang.ArithmeticException: / by zero
It looks like calculateSkips in JobUtils.scala is throwing the exception. I just want to check whether it's a Chronos bug or something related to the cron expression.
/**
 * Calculates the number of skips needed to bring the job start into the future
 */
protected def calculateSkips(dateTime: DateTime, jobStart: DateTime, period: Period): Int = {
  // If the period is at least a month, we have to actually add the period to the date
  // until it's in the future because a month-long period might have different seconds
  if (period.getMonths >= 1) {
  } else {
  }
}
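For reference, a rough sketch of how a zero-length period could be guarded against in a function like this one. It rests on an assumption: that the else branch divides the elapsed seconds by the period length in seconds, which would explain the "/ by zero" when the period is PT0M or PT0S. The sketch is not the Chronos implementation; `SkipSketch` is a hypothetical name, and it uses `java.time` instead of Joda-Time so that it is self-contained.

```scala
import java.time.{Duration, Instant}

// Hypothetical stand-in for the sub-month branch of calculateSkips.
object SkipSketch {
  // Number of whole periods between jobStart and now.
  // Returns 0 when the period is shorter than one second (including zero),
  // which avoids dividing by zero.
  def calculateSkips(now: Instant, jobStart: Instant, period: Duration): Long = {
    val periodSeconds = period.getSeconds
    if (periodSeconds <= 0) 0L
    else Duration.between(jobStart, now).getSeconds / periodSeconds
  }
}
```

Note that returning zero skips only avoids the exception. It does not move the job start into the future, which may be related to the looping that another comment in this thread reports after applying a similar patch.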