vsivsi / meteor-job-collection

A persistent and reactive job queue for Meteor, supporting distributed workers that can run anywhere.
https://atmospherejs.com/vsivsi/job-collection
Other
385 stars 68 forks source link

Why does cancelRepeats only works with repeats forever? #190

Closed luixal closed 8 years ago

luixal commented 8 years ago

Hi,

I'm having some doubts about the cancelRepeats options. Is there any specific "why" about only using it in "repeats forever" jobs?

job.save(
  {
    // Cancel any jobs of the same type,
    // but only if this job repeats forever.
    // Default: false.
    cancelRepeats: true
  }
);

In my case, we have some jobs that are scheduled for an specific time, runs once and when finishing, it schedules the next execution (timing can change heavily at any moment). So, the best approach would be to use the cancelRepeats option when the server (re)starts.

Right now we got it covered (we're removing those previous schedules) so, no hurry, we're just looking around for a better approach.

Thanks!

vsivsi commented 8 years ago

The cancelRepeats functionality was added as a convenience to easily handle a very common case: Infinitely repeating housekeeping jobs. You want to easily schedule such jobs on each server restart (to ensure that there is always one scheduled) but most of the time, there is already one scheduled from the last running server instance.

As for the question:

Why does cancelRepeats only works with repeats forever?

The answer is pretty much: because I didn't anticipate your use case. What you describe is essentially an infinitely repeating job, but with variable repeat timing. It sounds like the addition of a delayReps option to job.done() (analogous to the existing delayDeps option) would be a good solution for you. I'll think about it. https://github.com/vsivsi/meteor-job-collection#jobdoneresult-options-callback---anywhere

luixal commented 8 years ago

Actually, what we are doing is scheduling the next job (same type of job, different job itself) at the end of the execution of the current one.

I think is pretty much the same approach implemented on a different way.

Thanks!

vsivsi commented 8 years ago

Actually, what we are doing is scheduling the next job (same type of job, different job itself) at the end of the execution of the current one.

This is precisely how repeating jobs are handled internally. Except they are restricted to a fixed delay between runs. A delayReps option to job.done()would make that variable, and then the cancelRepeats option would work as expected for that case, because it would truly be cancelling jobs of the same type as a repeating job.

luixal commented 8 years ago

Hi,

When working with a timing so much variable like the one we have, it seems more complicated to work with the delayReps than simply just schedule another different job.

Eventually, we'll try your approach, we have some code which is job-related and some future refactoring.

Thanks!

vsivsi commented 8 years ago

To be clear, delayReps doesn't exist yet. I was just floating it as an idea. Effectively it is just a shortcut to scheduling a different job, with the added assumption that there will always be another job (perhaps that doesn't hold in your case...)

luixal commented 8 years ago

Yep, I was thinking it was something already implemented that I had missed!

In my case, there will always be another job (but it's true, maybe someone would be in that case). I'll be a little more specific (in case it helps you to implement some future feature):

I have an Employees collection. I also have a Shifts collection. Both collections are related, I have a bunch of employeesand each one of them has a shift. This is: "employee 1 works from 08:00 to 14:00 and employee 2 works from 15:00 to 20:00". I also have a control pressence system which gives me the exact time employee 1 signed in.

Now, the job related part: I want to be notified when an employee arrives late to work. I have a configurable delay, this is, "I want to be notified when it's been X minutes and employee 1 hasn't signed in". The thing is that I have many employees and MANY different shifts.

How do I handle it now? When server starts up, I schedule a job for the very next shift assigned to an employee, plus that configurable delay. The job runs (selecting every employee in a similar shift, they can be shared) and sends the required notifications. After that, before finishing, it schedules a different job (same job type) for the very next shift plus the configurable delay.

Maybe it's a bit complicated to explain it this way, but if you want to know more, no problem on my side to get along with this matter and making some beta testing if it comes to it :)

Thanks!

vsivsi commented 8 years ago

Good info, thanks. So if I understand, it sounds like delayReps would work for your situation...

Today:


// In worker function...

  var delay = calcDelay(data.employee, data.shift);
  job.done(result, function (err, res) {
      // handle errors
      var next = new Job('jobType', data).delay(delay).save(function (err, res) {
         // handle errors
         cb();
      });
  });
  return;
}

With delayReps:

// In worker function...

  var delay = calcDelay(data.employee, data.shift);
  job.done(result, { delayReps: delay }, function (err, res) {
     // handle errors
     cb();
  });
  return;
}

This assumes the original jobs are scheduled at server start like:

  var delay = calcDelay(data.employee, data.shift);
  var job = new Job('jobType', data)
      .delay(delay)
      .repeat({ repeats: jc.forever })
      .save({ cancelRepeats: true }, function (err, res) {
         // handle errors
      });

This would save having to explicitly schedule your own "next" jobs, as well as enabling the use of cancelRepeats, so you don't need to do your own server startup housekeeping.

Does that seem about right?

luixal commented 8 years ago

Yes, that would be a great solution :)

I have to say: we don't use delays. We schedule jobs for an specific date and time, would be nice to have both options (passing an specific datetime or a delay), but If I'd had to choose one, I will implement it using a delay (seems to cover more cases).

Re-thinking about all of this... this job runs for about 30 times a day right now and that number is gonna rise. What would you think about accepting an array of dates? This is, instead of scheduling a job one date, passing an array of dates for scheduling the job to run on several dates. A callback that gets run when last execution is done would be mandatory here.

This would bring up other issues (i.e: a new shift gets added, but that can be handled with a collection hook) but would be quite rare in my scenario.