lholden / job_scheduler

A simple cron-like job scheduling library for Rust.
Apache License 2.0
196 stars 34 forks source link

Ability to remove job from schedule #4

Closed Boscop closed 4 years ago

Boscop commented 6 years ago

Please add the ability to remove a job from the schedule. JobScheduler::add() could return a JobId and then a job can be removed via its JobId.

lholden commented 6 years ago

I like the idea and it shouldn't be at all hard to implement. I will happily take a pull request implementing a JobId :). Otherwise, I will likely look into doing so in the near future.

ghost commented 6 years ago

I'd love to have a go at implementing this if possible? Do you see the structure internally changing to a hash or just searching the vec for the job with the id?

lholden commented 6 years ago

Please feel free to give it a go @temius :).

I am indifferent as to hash/vec, please feel free to experiment.

Boscop commented 6 years ago

I'd keep the Vec because job removal will occur rarely and even then a Vec has faster lookup than a HashMap for even not so small N because of cache efficiency. And for small N, (the number of jobs is usually such a small N) linear search is often faster than binary search, also because of cache prefetching.

ghost commented 6 years ago

I get what you mean about the lookup @Boscop, but we won't be able to use the index at which the job is stored in the vector because removing elements will invalidate the id, since removing on a vector shifts the values down.

There's two potential solutions to this:

Or are you envisioning this another way? Would be great to hear your thoughts, but I feel these are unnecessarily complicated when we could be using a hashmap for readability?

lholden commented 6 years ago

@temius Some things to consider.

  1. The standard use case for the Job collection is to be iterated over.
  2. Index and identity do not need to be the same thing.
  3. The size of the collection is likely to remain reasonably small. (Iterating to remove a Job shouldn't be a huge problem)

With these things in mind, think about how Vec and HashMap fit into this picture and the strengths of each collection. For example, while HashMap is certainly very useful for situations where lookup is expensive or a primary use case, Vec is optimized for iteration.

Also consider that it's perfectly reasonable to provide the Job an identity not based on its index within the collection it's stored in.

Two examples of such.

1) JobScheduler could assign the identity. In this case, identity could be assigned on add. I can think of a couple ways to do this off the top of my head:

pub fn add(&mut self, job: Job<'a>) {
    self.jobs.push((self.next_identity(), job))
}
pub fn add(&mut self, mut job: Job<'a>) {
    job.id = Some(self.next_identity());
    self.jobs.push(job)
}

2) The Job could assign it's own identity by generating a UUID or by using a global unique identity generator. You would want to be able to do this in a manner that doesn't change the existing interface Job provides.


Please feel free to experiment and come up with ideas! I obviously have an idea in mind for how I would implement this, but I'd like to encourage you to come up with your own solution. (Because that's the fun part!)

ghost commented 6 years ago

@lholden Sure I get that, was thinking more in line with an indexmap initially.

Although sticking with vectors I think the second code example you provided was more in line with what I was trying to get at.

I briefly considered UUID but thought it might be a bit much for this.

I'll push something up for you to look at.

lholden commented 6 years ago

IndexMap is also perfectly reasonable. it's just an abstraction over the first example I provided. :)

lholden commented 6 years ago

@temius Did you make any progress on this? Otherwise I am likely going to work on it :)

ghost commented 6 years ago

Unfortunately work took over! So go for it :)

Boscop commented 6 years ago

We could just store a Vec<(JobId, JobHandle)> (with JobId = usize) and iterate for lookup.