seomoz / qless

Queue / Pipeline Management
MIT License

Show when scheduled jobs are scheduled to be run. #60

Open proby opened 11 years ago

proby commented 11 years ago

It'd be super helpful and handy to show when a job is scheduled to be run whenever looking at a scheduled job.

wr0ngway commented 10 years ago

+1. This seems like a hole in the API; having it would make it easier to assert in a test case that something got scheduled at a specific time.

dlecocq commented 10 years ago

It is a hole, but one we can fix. As a note to myself:

StephenOTT commented 10 years ago

:+1:

StephenOTT commented 10 years ago

@dlecocq I just threw this together quickly and have not tested it yet, but I'm curious about your thoughts on the style and placement.


-- Get all the attributes of this particular job
function QlessRecurringJob:data()
  local job = redis.call(
    'hmget', 'ql:r:' .. self.jid, 'jid', 'klass', 'state', 'queue',
    'priority', 'interval', 'retries', 'count', 'data', 'tags', 'backlog')

  local jobScheduleDate = nil
  if job[3] == "scheduled" then
    jobScheduleDate = redis.call(
      'get', 'ql:q:' .. job[4] .. '-scheduled', self.jid)
  end

  if not job[1] then
    return nil
  end

  return {
    jid          = job[1],
    klass        = job[2],
    state        = job[3],
    queue        = job[4],
    priority     = tonumber(job[5]),
    interval     = tonumber(job[6]),
    retries      = tonumber(job[7]),
    count        = tonumber(job[8]),
    data         = job[9],
    tags         = cjson.decode(job[10]),
    backlog      = tonumber(job[11] or 0),
    scheduledfor = tonumber(jobScheduleDate)
  }
end

OR

just:

.......
  return {
    jid          = job[1],
    klass        = job[2],
    state        = job[3],
    queue        = job[4],
    priority     = tonumber(job[5]),
    interval     = tonumber(job[6]),
    retries      = tonumber(job[7]),
    count        = tonumber(job[8]),
    data         = job[9],
    tags         = cjson.decode(job[10]),
    backlog      = tonumber(job[11] or 0),
    scheduledfor = tonumber(redis.call(
                        'get', 'ql:q:' .. job[4] .. '-scheduled', self.jid ))
  }

The original code was: https://github.com/seomoz/qless-core/blob/521adbe59a6649e01f3349297cfa69e3af4d6f6e/recurring.lua#L1-L24

StephenOTT commented 10 years ago

Never mind, that's clearly going to need more work now that I look at it again.

StephenOTT commented 10 years ago

@dlecocq I have been looking at the data model that is created in Redis for scheduled data.

What was your thinking behind placing all scheduled jobs for a queue in a single key? As I think about the query needed to return the ZSET score for a specific job ID in the scheduled queue, I keep coming back to performance limits as the number of scheduled jobs grows.

The current layout makes me think the only way to get the score for a specific member (jid) in the ZSET is to return all ZSET items and search them in memory. That works for small sets, but how many "scheduled" jobs did you imagine being stored at one time in a single queue? 100s, 1,000s, 10,000s, 100,000s, 1,000,000s?

I see the reasoning for using a ZSET for its automatic sorting rather than a regular key-value pair, so I'm just curious about your thinking on performance.

Thanks!

StephenOTT commented 10 years ago

My other thought is you could use the fairly new ZSCAN feature:

redis.zscan("ql:q:testing-scheduled", 0, {match: "7f81bbe64bcd4599b565c95c817cf363"})

This works with the current structure. It would get slow with large counts, but it is better than bringing back everything.

It returns:

7f81bbe64bcd4599b565c95c817cf363
1400328474.6609
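
For anyone reading the returned pair: the second value looks like a Unix timestamp in seconds (that is an assumption based on its magnitude, not something confirmed in this thread). Under that assumption, a minimal Ruby sketch for turning the score into a readable time could look like this:

```ruby
# Assumption: the ZSET score is a Unix epoch timestamp in seconds,
# with the fractional part carrying sub-second precision.
score = 1400328474.6609

# Time.at accepts a Float; .utc avoids depending on the local time zone.
run_at = Time.at(score).utc

puts run_at.strftime('%Y-%m-%d %H:%M:%S UTC')
```

If the score turns out to use a different unit, only the `Time.at` call would need to change.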

dlecocq commented 10 years ago

Redis sorted sets are fast enough, and conservative benchmarks indicate a possible throughput of about 10-100M put-pops per day on a single Redis instance. Of course, that benchmark was based on a workflow where jobs don't accumulate in huge quantities in a queue. That said, we've also had legitimate instances of 100k-1M jobs in a queue in production relatively comfortably, though that's not our typical use. We thought at length about what the right data structure was for job scheduling, and when balancing job priority, scheduling, etc., a ZSET made the task very easy while still being performant.

As far as determining a jid's score goes, ZSCORE is O(1) and is already used in queue.lua to determine a job's score.
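
To make the ZSCORE suggestion concrete, here is a rough Ruby sketch of a direct lookup. `scheduled_score` and `FakeRedis` are hypothetical names invented for this illustration and are not part of qless. The sketch assumes the `ql:q:<queue>-scheduled` key layout quoted earlier in this thread, and a client that responds to `zscore(key, member)` the way redis-rb does. The in-memory stand-in only exists so the example runs without a Redis server.

```ruby
# Hypothetical helper (not part of qless): fetch the score stored for a jid
# in a queue's scheduled ZSET, or nil if the jid is not scheduled there.
# Assumes the 'ql:q:<queue>-scheduled' key layout shown in this thread.
def scheduled_score(redis, queue, jid)
  redis.zscore("ql:q:#{queue}-scheduled", jid)
end

# Minimal in-memory stand-in for a Redis client, for illustration only.
# Each "sorted set" is a Hash of member => score, so a lookup by member
# is a single hash access rather than a walk over every member.
class FakeRedis
  def initialize
    @zsets = {}
  end

  def zadd(key, score, member)
    (@zsets[key] ||= {})[member] = score.to_f
  end

  def zscore(key, member)
    @zsets.fetch(key, {})[member]
  end
end

redis = FakeRedis.new
redis.zadd('ql:q:testing-scheduled', 1400328474.6609,
           '7f81bbe64bcd4599b565c95c817cf363')

found   = scheduled_score(redis, 'testing', '7f81bbe64bcd4599b565c95c817cf363')
missing = scheduled_score(redis, 'testing', 'no-such-jid')
```

With a real client in place of `FakeRedis`, the helper body would stay the same; no ZSCAN pass or full read of the set is needed to find one member's score.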

StephenOTT commented 10 years ago

Okay I have created a PR for review. See: https://github.com/seomoz/qless/issues/187 https://github.com/seomoz/qless-core/issues/48

stephenreay commented 4 years ago

It seems this hasn't progressed in 5 years, despite needing only minor changes? If I create an updated PR based on this work + discussed changes, can someone get this merged?