tobi / delayed_job

Database backed asynchronous priority queue -- Extracted from Shopify
http://tobi.github.com/delayed_job
MIT License
2.15k stars 1.25k forks source link

h1. Delayed::Job

Delayed_job (or DJ) encapsulates the common pattern of asynchronously executing longer tasks in the background.

It is a direct extraction from Shopify where the job table is responsible for a multitude of core tasks. Amongst those tasks are:

h2. Setup

The library evolves around a delayed_jobs table which can be created by using:


  script/generate delayed_job

The created table looks as follows:


  create_table :delayed_jobs, :force => true do |table|
    table.integer  :priority, :default => 0      # Allows some jobs to jump to the front of the queue
    table.integer  :attempts, :default => 0      # Provides for retries, but still fail eventually.
    table.text     :handler                      # YAML-encoded string of the object that will do work
    table.string   :last_error                   # reason for last failure (See Note below)
    table.datetime :run_at                       # When to run. Could be Time.now for immediately, or sometime in the future.
    table.datetime :locked_at                    # Set when a client is working on this object
    table.datetime :failed_at                    # Set when all retries have failed (actually, by default, the record is deleted instead)
    table.string   :locked_by                    # Who is working on this object (if locked)
    table.timestamps
  end

On failure, the job is scheduled again in 5 seconds + N ** 4, where N is the number of retries.

The default @MAX_ATTEMPTS@ is @25@. After this, the job either deleted (default), or left in the database with "failed_at" set. With the default of 25 attempts, the last retry will be 20 days later, with the last interval being almost 100 hours.

The default @MAX_RUN_TIME@ is @4.hours@. If your job takes longer than that, another computer could pick it up. It's up to you to make sure your job doesn't exceed this time. You should set this to the longest time you think the job could take.

By default, it will delete failed jobs (and it always deletes successful jobs). If you want to keep failed jobs, set @Delayed::Job.destroy_failed_jobs = false@. The failed jobs will be marked with non-null failed_at.

Here is an example of changing job parameters in Rails:


  # config/initializers/delayed_job_config.rb
  Delayed::Job.destroy_failed_jobs = false
  silence_warnings do
    Delayed::Job.const_set("MAX_ATTEMPTS", 3)
    Delayed::Job.const_set("MAX_RUN_TIME", 5.minutes)
  end

Note: If your error messages are long, consider changing last_error field to a :text instead of a :string (255 character limit).

h2. Usage

Jobs are simple ruby objects with a method called perform. Any object which responds to perform can be stuffed into the jobs table. Job objects are serialized to yaml so that they can later be resurrected by the job runner.


  class NewsletterJob < Struct.new(:text, :emails)
    def perform
      emails.each { |e| NewsletterMailer.deliver_text_to_email(text, e) }
    end    
  end  

  Delayed::Job.enqueue NewsletterJob.new('lorem ipsum...', Customers.find(:all).collect(&:email))

There is also a second way to get jobs in the queue: send_later.


  BatchImporter.new(Shop.find(1)).send_later(:import_massive_csv, massive_csv)

This will simply create a @Delayed::PerformableMethod@ job in the jobs table which serializes all the parameters you pass to it. There are some special smarts for active record objects which are stored as their text representation and loaded from the database fresh when the job is actually run later.

h2. Running the jobs

You can invoke @rake jobs:work@ which will start working off jobs. You can cancel the rake task with @CTRL-C@.

You can also run by writing a simple @script/job_runner@, and invoking it externally:


  #!/usr/bin/env ruby
  require File.dirname(__FILE__) + '/../config/environment'

  Delayed::Worker.new.start  

Workers can be running on any computer, as long as they have access to the database and their clock is in sync. You can even run multiple workers on per computer, but you must give each one a unique name:


  3.times do |n|
    worker = Delayed::Worker.new
    worker.name = 'worker-' + n.to_s
    worker.start
  end   

Keep in mind that each worker will check the database at least every 5 seconds.

Note: The rake task will exit if the database has any network connectivity problems.

h3. Cleaning up

You can invoke @rake jobs:clear@ to delete all jobs in the queue.

h3. Changes