= Active Job Style Guide
:idprefix:
:idseparator: -
:sectanchors:
:sectlinks:
:toc: preamble
:toclevels: 1
This style guide is a list of best practices for working with Ruby background jobs using Active Job with a Sidekiq backend.
Contrary to common belief, they work quite well together if you follow the guidelines.
Sidekiq may be used without Active Job, but the latter adds transparency and a useful serialization layer.
This style guide didn't appear out of thin air - it is based on the professional experience of the editors, official documentation, and suggestions from members of the Ruby community.
These guidelines help avoid numerous pitfalls. Depending on how background jobs are used, some guidelines may apply and others may not.
ifdef::env-github[]
====
You can generate a PDF copy of this guide using https://asciidoctor.org/docs/asciidoctor-pdf/[AsciiDoctor PDF], and an HTML copy https://asciidoctor.org/docs/convert-documents/#converting-a-document-to-html[with] https://asciidoctor.org/#installation[AsciiDoctor] using the following commands:

[source,shell]
----
asciidoctor-pdf -a allow-uri-read README.adoc
----

Install the `rouge` gem to get nice syntax highlighting in the generated document.
====
endif::[]
[#general]
== General Recommendations

[#active-record-models-as-arguments]
=== Active Record Models as Arguments
Pass Active Record models as arguments; do not pass by id. Active Job automatically serializes and deserializes Active Record models using https://edgeguides.rubyonrails.org/active_job_basics.html#globalid[GlobalID], and manual deserialization of the models is not necessary.
GlobalID handles model class mismatches properly.
Deserialization errors are reported to error tracking.
[source,ruby]
----
# bad - passing by id
class SomeJob < ApplicationJob
  def perform(model_id)
    model = Model.find(model_id)
    do_something_with(model)
  end
end

# bad - model class mismatch
class SomeJob < ApplicationJob
  def perform(model_id)
    Model.find(model_id)
    # ...
  end
end

# Tries to find a Model record using the id of a User record.
SomeJob.perform_later(user.id)

# passing by id, with a missing record reported manually
class SomeJob < ApplicationJob
  def perform(model_id)
    model = Model.find(model_id)
    do_something_with(model)
  rescue ActiveRecord::RecordNotFound
    Rollbar.warning('Not found')
  end
end
----
WARNING: Do not switch from one style to the other abruptly. Use a transitional period to let all jobs scheduled with ids be processed, and use a helper to temporarily support both numeric and GlobalID arguments.
[source,ruby]
----
class SomeJob < ApplicationJob
  include TransitionHelper

  def perform(model)
    model = fetch(model, Model)
    do_something_with(model)
  end
end
----
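The `TransitionHelper` included above is not defined in this guide. A minimal sketch of what such a helper might look like, assuming that jobs enqueued before the switch carry an integer id. `FakeModel` and `Resolver` are hypothetical stand-ins so that the snippet runs without Rails:

```ruby
# Hypothetical helper: accepts either a legacy numeric id or an already
# deserialized record, and always returns a record.
module TransitionHelper
  def fetch(id_or_record, model_class)
    case id_or_record
    when Integer
      model_class.find(id_or_record)
    when model_class
      id_or_record
    else
      raise ArgumentError,
            "Expected #{model_class} or an id, got #{id_or_record.inspect}"
    end
  end
end

# Stand-in for an Active Record model (assumption, for illustration only).
class FakeModel
  attr_reader :id

  def initialize(id)
    @id = id
  end

  def self.find(id)
    new(id)
  end
end

# Stand-in for a job class that includes the helper.
class Resolver
  include TransitionHelper
end
```

Once all jobs scheduled with numeric ids have been processed, the helper and the `fetch` call can be removed.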
[#queue-assignments]
=== Queue Assignments
Explicitly specify a queue to be used in job classes. Make sure the queue is on the https://github.com/mperham/sidekiq/wiki/Advanced-Options#queues[list of processed queues].
Putting all jobs into one basket comes with a risk of more urgent jobs being executed with a significant delay. Do not put slow and fast jobs together in one queue. Do not put urgent and non-urgent jobs together in one queue.
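[source,ruby]
----
# bad - no queue specified
class SomeJob < ApplicationJob
  def perform
    # ...
  end
end

# bad - the specified queue is not processed
class SomeJob < ApplicationJob
  queue_as :hgh_prioriti # nonexistent queue specified

  def perform
    # ...
  end
end

# good - an existing queue is specified explicitly
class SomeJob < ApplicationJob
  queue_as :high_priority

  def perform
    # ...
  end
end
----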
[#idempotency]
=== Idempotency
Ideally, jobs should be idempotent, meaning there should be no bad side effects of them running more than once. Sidekiq only guarantees that the jobs will run https://github.com/mperham/sidekiq/wiki/Best-Practices#2-make-your-job-idempotent-and-transactional[at least once], but not necessarily exactly once.
Even jobs that do not fail due to errors https://github.com/mperham/sidekiq/wiki/FAQ#what-happens-to-long-running-jobs-when-sidekiq-restarts[might be interrupted] during https://github.com/mperham/sidekiq/wiki/Deployment#overview[non-rolling-release deployments].
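One common way to make a job idempotent is to record that the work has been done and to return early on a repeated run. A minimal sketch in plain Ruby: `SendReceipt` is a hypothetical job-like class, and the in-memory set stands in for a durable marker such as a database column.

```ruby
require 'set'

# Hypothetical job-like object. In a real job, the marker would have to
# survive process restarts; a Set is used here only to keep the sketch runnable.
class SendReceipt
  attr_reader :deliveries

  def initialize
    @processed = Set.new
    @deliveries = []
  end

  def perform(order_id)
    # Set#add? returns nil when the id is already present, so a repeated
    # run exits before producing the side effect again.
    return unless @processed.add?(order_id)

    @deliveries << "receipt for order #{order_id}"
  end
end

job = SendReceipt.new
2.times { job.perform(42) } # simulates a re-run after an interruption
```

After both runs, `job.deliveries` contains a single receipt.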
[#atomicity]
=== Atomicity
During deployment, a job is given 25 seconds to complete by default. After that, the worker is terminated and the job is sent back to the queue. This might result in part of the work being executed twice.
Make the jobs atomic, i.e., all or nothing.
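The all-or-nothing idea can be sketched in plain Ruby as follows. `apply_transfers` is a hypothetical function that works on a copy and hands back the result only if every step succeeds; in a Rails application, a database transaction would typically play this role.

```ruby
# Hypothetical all-or-nothing update: every change is applied to a copy,
# and the copy is returned only if all changes succeed.
def apply_transfers(balances, transfers)
  updated = balances.dup
  transfers.each do |from, to, amount|
    raise ArgumentError, "insufficient funds for #{from}" if updated.fetch(from) < amount

    updated[from] -= amount
    updated[to] += amount
  end
  updated
end

balances = { 'a' => 100, 'b' => 0 }

# The second transfer fails, so none of the changes become visible.
begin
  balances = apply_transfers(balances, [['a', 'b', 60], ['a', 'b', 60]])
rescue ArgumentError
  # balances still holds the original values
end
```

Because the original hash is never mutated, running the same call again after an interruption starts from a consistent state.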
[#threads]
=== Threads

Do not use threads in your jobs. Spawn jobs instead. Spinning up a thread in a job opens a new database connection, and connections are easily exhausted, to the point where the web server goes down.

[source,ruby]
----
# bad - a thread is spawned for each user
class SomeJob < ApplicationJob
  def perform
    User.find_each do |user|
      Thread.new do
        ExternalService.update(user)
      end
    end
  end
end

# good - one job per user
class SomeJob < ApplicationJob
  def perform(user)
    ExternalService.update(user)
  end
end
----
[#retries]
=== Retries

Avoid using https://edgeguides.rubyonrails.org/active_job_basics.html#exceptions[Active Job's built-in `retry_on`] or `ActiveJob::Retry` (the `activejob-retry` gem).
Use Sidekiq retries, which are also available from within Active Job with Sidekiq 6+.

Do not hide or extract job retry mechanisms. Keep retry directives visible in the jobs.
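[source,ruby]
----
# bad - Active Job retries are used
class SomeJob < ApplicationJob
  retry_on ThirdParty::Api::Errors::SomeError, wait: 1.minute, attempts: 3

  def perform(user)
    # ...
  end
end

# bad - retry directives are hidden in a mixin
class SomeJob < ApplicationJob
  include ReliableJob

  def perform(user)
    # ...
  end
end

# good - Sidekiq retries are declared in the job
class SomeJob < ApplicationJob
  sidekiq_options retry: 3

  def perform(user)
    # ...
  end
end
----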
==== Batches
Always use retries for jobs that are executed in batches; otherwise, a batch with a failed job will never succeed.
[#use-retries]
=== Use Retries
Use the retry mechanism. Do not let jobs end up in Dead Jobs. Let Sidekiq retry the jobs, and don't spend time re-running the jobs manually.
[#mind-transactions]
=== Mind Transactions
Background processing of a scheduled job may happen sooner than you expect. Make sure to https://github.com/mperham/sidekiq/wiki/Problems-and-Troubleshooting#cannot-find-modelname-with-id12345[only schedule jobs when the transaction has been committed].
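[source,ruby]
----
# bad - the job is scheduled inside the transaction and may run
# before the transaction is committed
User.transaction do
  users_params.each do |user_params|
    user = User.create!(user_params)
    NotifyUserJob.perform_later(user)
  end
end
----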
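One way to avoid the race in the example above is to create the records inside the transaction and to enqueue jobs only after it has finished. A minimal sketch of that ordering; `FakeStore` is a hypothetical stand-in for the database, and the `enqueued` array stands in for the job queue, so the snippet runs without Rails or Sidekiq.

```ruby
# Stand-in for a database (assumption, for illustration only).
class FakeStore
  attr_reader :committed

  def initialize
    @committed = []
  end

  # Rows are staged while the block runs and committed only when it returns
  # without raising. Returns the rows that were committed.
  def transaction
    staged = []
    yield staged
    @committed.concat(staged)
    staged
  end
end

store = FakeStore.new
enqueued = []

# Create every record first...
users = store.transaction do |staged|
  %w[alice bob].each { |name| staged << name }
end

# ...and enqueue only after the transaction is over, when a worker
# is able to find the records.
users.each { |user| enqueued << user }
```

With Active Record, the same shape would be a `User.transaction` block that returns the created users, followed by a loop calling `perform_later`.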
[#local-performance-testing]
=== Local Performance Testing

Due to Rails auto-reloading, Sidekiq jobs are executed one-by-one, with no parallelism. That may be confusing.

Run Sidekiq in an environment that has `eager_load` set to `true`, or with the following flags to circumvent this behavior:
[#critical-jobs]
=== Critical Jobs
Background job processing may be down for a prolonged period (minutes), e.g. during a failed deployment or a burst of other jobs.
Consider running time-critical and mission-critical jobs in-process.
[#business-logic-in-jobs]
=== Business Logic in Jobs

Do not put business logic in jobs; extract it.
[source,ruby]
----
# bad - the job decides whether it applies
class SendUserAgreementJob < ApplicationJob
  def self.perform_later_if_applies(user)
    job = new(user)
    return unless job.satisfy_preconditions?

    job.enqueue
  end

  def perform(user)
    @user = user
    return unless satisfy_preconditions?

    agreement = agreement_for(user: user)
    AgreementMailer.deliver_now(agreement)
  end

  def satisfy_preconditions?
    legal_agreement_signed? &&
      !user.removed? &&
      !user.referral? &&
      !(user.active? || user.pending?) &&
      !user.has_flag?(:on_hold)
  end

  private

  attr_reader :user
end

# good - the job only does the work
class SendUserAgreementJob < ApplicationJob
  def perform(user)
    agreement = agreement_for(user: user)
    AgreementMailer.deliver_now(agreement)
  end
end
----
[#scheduling-a-job-from-a-job]
=== Scheduling a Job from a Job
Weigh the pros and cons in each case, whether to schedule jobs from jobs or to execute them in-process. Factors to consider: Is it a retriable job? Can inner jobs fail? Are they idempotent? Is there anything in the host job that may fail?
[source,ruby]
----
class SomeJob < ApplicationJob
  def perform
    SomeMailer.some_notification.deliver_later
    OtherJob.perform_later
  end
end
----

If `OtherJob` fails, `SomeMailer` will be re-executed on retry as well.

==== Numerous Jobs
When a lot of jobs should be performed, it's acceptable to schedule them.
Consider using batches for improved traceability.
Also, specify the same queue for the host job and sub-jobs.
[#job-renaming]
=== Job Renaming

Rename job classes carefully to avoid situations where jobs are scheduled but there is no class to process them.

NOTE: This also relates to mailers used with `deliver_later`.
[#sleep]
=== sleep
Do not use `Kernel.sleep` in jobs.
`sleep` blocks the worker thread, and it's not able to process other jobs.
Re-schedule the job for a later time, or use limiters with a custom exception.
[source,ruby]
----
# bad - sleep blocks the worker thread
class SomeJob < ApplicationJob
  def perform(user)
    # The counter lives outside the begin block so that `retry` does not reset it.
    attempts_number = 3
    begin
      ThirdParty::Api::User.renew(user.external_id)
    rescue ThirdParty::Api::Errors::TooManyRequestsError => error
      sleep(error.retry_after)
      attempts_number -= 1
      retry unless attempts_number.zero?
      raise
    end
  end
end

# good - the job is retried by Sidekiq at a later time
class SomeJob < ApplicationJob
  sidekiq_options retry: 3
  sidekiq_retry_in do |count, exception|
    case exception
    when ThirdParty::Api::Errors::TooManyRequestsError
      count + 1 # i.e. 1s, 2s, 3s
    end
  end

  def perform(user)
    ThirdParty::Api::User.renew(user.external_id)
  end
end

# good - a limiter is used
class SomeJob < ApplicationJob
  def perform(user)
    LIMITER.within_limit do
      ThirdParty::Api::User.renew(user.external_id)
    end
  end
end
----
[#infrastructure]
== Infrastructure

[#one-process-per-core]
=== One Process per Core

On multi-core machines, run as many Sidekiq processes as needed to fully utilize the cores. A single Sidekiq process uses only one CPU core. A rule of thumb is to run as many processes as there are cores available.
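The rule of thumb can be turned into a number with Ruby's standard library. A minimal sketch, assuming the machine is dedicated to Sidekiq; the variable names are illustrative only.

```ruby
require 'etc'

# Number of logical CPU cores visible to this Ruby process.
cores = Etc.nprocessors

# Rule of thumb from the text: one Sidekiq process per core.
sidekiq_processes = cores

puts "Run #{sidekiq_processes} Sidekiq process(es) on this machine"
```

If the machine also hosts other services, a lower number may be more appropriate.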
[#redis-memory-constraints]
=== Redis Memory Constraints

Redis's database size is limited by server memory.
Some prefer to explicitly set `maxmemory`, and in combination with a `noeviction` policy, this may result in errors on job scheduling.
==== Dead Jobs
Do not keep jobs in Dead Jobs. With extended backtrace enabled for Dead Jobs, a single dead job can occupy as much as 20KB in the database.
Re-run the jobs once the root cause is fixed, or delete them.
==== Excessive Arguments
Do not pass an excessive number of arguments to a job.
[source,ruby]
----
# bad - too many arguments, including a huge payload
SomeJob.perform_later(user_name, user_status, user_url, user_info: huge_json)
----
==== Hordes
Do not schedule hundreds of thousands of jobs at once. A single job with no parameters takes 0.5KB. Measure the exact footprint for each job with its arguments.
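At 0.5KB per job, 500,000 jobs scheduled at once take roughly 250MB, and arguments add to that. A rough way to estimate the footprint is to serialize a payload and count its bytes. The sketch below is an approximation: `payload_bytes` is a hypothetical helper, and the hash is only loosely shaped like a serialized job, since the exact fields stored by Sidekiq and Active Job differ.

```ruby
require 'json'

# Hypothetical helper: size in bytes of a JSON payload that is loosely
# shaped like a serialized job. Treat the result as an estimate.
def payload_bytes(job_class, arguments)
  JSON.generate('class' => job_class, 'args' => arguments).bytesize
end

small = payload_bytes('SomeJob', [42])
large = payload_bytes('SomeJob', ['x' * 10_000])

# Rough Redis footprint of scheduling the large job many times at once.
jobs_count = 500_000
estimated_megabytes = (large * jobs_count) / 1_000_000.0
```

For exact numbers, measure the memory usage in Redis before and after enqueuing a known number of jobs.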
[#monitoring]
=== Monitoring
Monitor the server and store historical metrics. Properly configured metrics will provide answers to improve the throughput of job processing.
[#commercial-features]
== Commercial Features

At some scale, https://github.com/mperham/sidekiq/wiki/Build-vs-Buy[it pays off to use commercial features].

Some commercial features are available as third-party add-ons. However, their reliability is in most cases questionable.
[#use-batches]
=== Use Batches

Group jobs related to one task using https://github.com/mperham/sidekiq/wiki/Batches[Sidekiq Batches].
The batch's `jobs` method is atomic, i.e., all the jobs are scheduled together, in an all-or-nothing fashion.
[source,ruby]
----
# without a batch - jobs are scheduled one by one
class BackfillMissingDataJob < ApplicationJob
  def self.run_batch
    Model.where(attribute: nil).find_each do |model|
      perform_later(model)
    end
  end

  def perform(model)
    # ...
  end
end

# good - jobs are grouped in a batch
class BackfillMissingDataJob < ApplicationJob
  def self.run_batch
    batch = Sidekiq::Batch.new
    batch.description = 'Backfill missing data'
    batch.on(:success, BackfillComplete, to: SysAdmin.email)
    batch.jobs do
      Model.where(attribute: nil).find_each do |model|
        perform_later(model)
      end
    end
  end

  def perform(model)
    # ...
  end
end
----
[#self-scheduling-jobs]
=== Self-scheduling Jobs
Avoid using self-scheduling jobs for long-running jobs. Prefer using Sidekiq Batches to split the workload.
[source,ruby]
----
# bad - the job re-schedules itself
class BackfillMissingDataJob < ApplicationJob
  SIZE = 20

  def perform(offset = 0)
    models = Model.where(attribute: nil)
      .order(:id).offset(offset).limit(SIZE)
    return if models.empty?

    models.each do |model|
      # `value_for` is a placeholder name for the actual computation
      model.update!(attribute: value_for(model))
    end

    self.class.perform_later(offset + SIZE)
  end
end

# good - the workload is split with a batch
class BackfillMissingDataJob < ApplicationJob
  def self.run_batch
    Sidekiq::Batch.new.jobs do
      Model.where(attribute: nil)
        .find_in_batches(batch_size: 20) do |models|
          BackfillMissingDataJob.perform_later(models)
        end
    end
  end

  def perform(models)
    # ...
  end
end
----
[#api-rate-limited-operations]
=== API Rate-limited Operations
Most third-party APIs have usage limits and will fail if there are too many calls in a period. Use rate limiting in jobs that make such external calls.
Never rely on the number of jobs to be executed. Even if you schedule jobs to be executed at a specific moment, they might be executed all at once, due to, e.g., a traffic jam in job processing. Use https://github.com/mperham/sidekiq/wiki/Ent-Rate-Limiting[Enterprise Rate Limiting]. Use the strategy (Concurrent, Bucket, Window) that is most suitable to the specific API rate limiting.
[source,ruby]
----
# bad - relies on the schedule to spread the calls
class UpdateExternalDataJob < ApplicationJob
  def perform(user)
    new_attribute = ThirdParty::Api.get_attribute(user.external_id)
    user.update!(attribute: new_attribute)
  end
end

User.where.not(external_id: nil)
  .find_in_batches.with_index do |users, group_number|
    users.each do |user|
      UpdateExternalDataJob
        .set(wait: group_number.minutes)
        .perform_later(user)
    end
  end

# good - the calls are rate limited
class UpdateExternalDataJob < ApplicationJob
  LIMITER = Sidekiq::Limiter.window('third-party-attribute-update', 20, :minute, wait_timeout: 0)

  def perform(user)
    LIMITER.within_limit do
      new_attribute = ThirdParty::Api.get_attribute(user.external_id)
      user.update!(attribute: new_attribute)
    end
  end
end

User.where.not(external_id: nil).find_each do |user|
  UpdateExternalDataJob.perform_later(user)
end
----
[#default-limiter-backoff]
=== Default Limiter Backoff

Do not rely on Sidekiq's default limiter backoff. It reschedules the job five minutes into the future.
It doesn't fit cases where limits are released quickly or are held for hours. Configure the backoff per limiter.

Keep in mind how limiter comparison works: compare limiters by name, not by object.
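A per-limiter backoff that dispatches on the limiter name could be sketched as a plain callable. Everything below is an assumption made for illustration: the `Limiter` struct is a stand-in for a real limiter object, the names and delays are made up, and the two-argument signature may differ from what Sidekiq Enterprise expects. How the callable is registered with Sidekiq Enterprise is not shown here and should be taken from its documentation.

```ruby
# Hypothetical stand-in for a limiter; only the name matters here.
Limiter = Struct.new(:name)

# Seconds to wait before the job is retried, chosen by limiter name.
BACKOFF_BY_NAME = {
  'third-party-attribute-update' => 60, # limit is released within a minute
  'erp' => 4 * 60 * 60                  # limit is held for hours
}.freeze
DEFAULT_BACKOFF = 5 * 60

backoff = lambda do |limiter, _job|
  # Looked up by name, so two limiter objects with the same name
  # get the same backoff.
  BACKOFF_BY_NAME.fetch(limiter.name, DEFAULT_BACKOFF)
end

erp_delay = backoff.call(Limiter.new('erp'), {})
```

Dispatching on the name keeps the configuration working even when the limiter object passed in is not the same instance as the one defined in the job.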
[#reuse-limiters]
=== Reuse Limiters
Create https://github.com/mperham/sidekiq/wiki/Ent-Rate-Limiting[limiters] once during startup and reuse them. Limiters are thread-safe and designed to be shared.
Each limiter occupies 114 bytes in Redis, and the default TTL is 3 months. At 1 million jobs a month, non-shared limiters constantly consume more than 300MB of Redis memory.
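The estimate follows from the figures quoted above, assuming one limiter is created per job and none expire before the TTL:

```ruby
BYTES_PER_LIMITER = 114 # figure quoted in the text
TTL_MONTHS = 3          # default TTL quoted in the text
JOBS_PER_MONTH = 1_000_000

# With one limiter per job, every limiter created during the TTL window
# is still stored in Redis.
live_limiters = JOBS_PER_MONTH * TTL_MONTHS
total_bytes = live_limiters * BYTES_PER_LIMITER
total_megabytes = total_bytes / 1_000_000.0
```

That is 3 million live limiters and 342,000,000 bytes, i.e. roughly 342MB.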
[source,ruby]
----
# bad - a new limiter is created on each job run
class SomeJob < ApplicationJob
  def perform(...)
    limiter = Sidekiq::Limiter.concurrent('erp', 50, wait_timeout: 0, lock_timeout: 30)
    limiter.within_limit do
      # ...
    end
  end
end

# good - the limiter is created once and shared
class SomeJob < ApplicationJob
  ERP_LIMIT = Sidekiq::Limiter.concurrent('erp', 50, wait_timeout: 0, lock_timeout: 30)

  def perform(...)
    ERP_LIMIT.within_limit do
      # ...
    end
  end
end

# a limiter named after the user is created inside the job
class SomeJob < ApplicationJob
  def perform(user)
    user_throttle = Sidekiq::Limiter.bucket("stripe-#{user.id}", 30, :second, wait_timeout: 0)
    user_throttle.within_limit do
      # call stripe with user's account creds
    end
  end
end
----
[#limiter-options]
=== Limiter Options

Incorrect limiter options may break the limiter's behavior.

==== wait_timeout

Set `wait_timeout` to zero or some reasonably low value.
Doing otherwise will result in idle workers, while there might be jobs waiting in the queue.

Keep in mind the backoff configuration, and carefully pick the timing when the job is retried.

==== lock_timeout for Concurrent Limiter

Set `lock_timeout` to a value longer than the job takes to execute.
Otherwise, the lock will be released too early and more concurrent jobs will be executed than expected.
[#global-limiting-middleware]
=== Global Limiting Middleware

The `Sidekiq::Limiter::OverLimit` exception might be rescued by jobs to discard themselves from locally defined limiters.
To avoid interference between global throttle limiter middleware and local job limiters, wrap the `Sidekiq::Limiter::OverLimit` exception in middleware.
[source,ruby]
----
class SaturationLimiter
  SaturationOverLimit = Class.new(StandardError)

  def self.wrapper(job, block)
    LIMITER.within_limit { block.call }
  rescue Sidekiq::Limiter::OverLimit => e
    limiter_name = e.limiter.name
    # Re-raise the exception if it comes from a limiter
    # defined on the job level.
    raise unless limiter_name == LIMITER.name

    # Use a custom exception that Sidekiq::Limiter is using to re-schedule
    # the job to a later time, but in a way that doesn't overlap with the
    # limiters defined on the job level.
    raise SaturationOverLimit, limiter_name
  end
end
----
[#ignore-overlimit]
=== Ignore OverLimit Exceptions on Third-party Services

`Sidekiq::Limiter::OverLimit` is an internal mechanism, and it doesn't make sense to report when it triggers.
[#rolling-restarts]
=== Rolling Restarts
Use https://github.com/mperham/sidekiq/wiki/Ent-Rolling-Restarts[Enterprise Rolling Restarts]. With Rolling Restarts, deployments do not suffer from downtime. Also, it prevents non-atomic and non-idempotent jobs from being interrupted and executed more than once on deployments.
WARNING: For Capistrano-style deployments, make sure to use the https://github.com/stripe/einhorn#re-exec[`--reexec-as`] and https://github.com/stripe/einhorn#options[`--drop-env-var BUNDLE_GEMFILE`] einhorn options to avoid stale code and dependencies.
[#testing]
== Testing

[#perform]
=== perform

Don't use `job.perform` or `job_class.new.perform`; it bypasses the Active Job serialization/deserialization stage.
Use `job_class.perform_now`.
This also applies to RSpec's implicitly defined subject, which is a job instance: `perform` is only available on an instance, not on the class, and calling it directly is discouraged.

[source,ruby]
----
# bad - `perform` method is called directly on an implicitly defined subject
RSpec.describe SomeJob do
  # `subject` is `SomeJob.new`
  it 'updates user status' do
    expect { subject.perform(user) }.to change { user.status }.to(:updated)
  end
end

# bad - `perform` method is called directly on a job instance
RSpec.describe SomeJob do
  it 'updates user status' do
    expect { SomeJob.new.perform(user) }.to change { user.status }.to(:updated)
  end
end
----
[#perform_later]
=== perform_later

Prefer `perform_now` to `perform_later` when testing jobs.
It doesn't involve Redis.

[source,ruby]
----
# bad - the job is enqueued first and performed afterwards
RSpec.describe SomeJob do
  it 'updates user status' do
    expect do
      SomeJob.perform_later(user)
      perform_scheduled_jobs
    end.to change { user.status }.to(:updated)
  end
end
----
== History
This guide came to life as an internal company list of the best practices of working with ActiveJob and Sidekiq. It is compiled from remarks collected from numerous code reviews, and during the migration from another background job processing tool to Sidekiq. Initially created by https://github.com/pirj[Phil Pirozhkov] with the help of colleagues, and sponsored by https://www.toptal.com[Toptal].
== Contributing
The guide is a work in progress. Improving such guidelines is a great (and simple) way to help the Ruby community!
Nothing written in this guide is set in stone. We desire to work together with everyone interested in gathering the best practices of working with background jobs. The goal is to create a resource that will be beneficial to the entire Ruby community.
Feel free to open tickets or send pull requests with improvements. Thanks in advance for your help!
=== How to Contribute
It's easy, just follow the contribution guidelines below:
== License
image:https://i.creativecommons.org/l/by/3.0/88x31.png[Creative Commons License] This work is licensed under a http://creativecommons.org/licenses/by/3.0/deed.en_US[Creative Commons Attribution 3.0 Unported License]
== Spread the Word
A community-driven style guide is of little use to a community that doesn't know about its existence. Tweet about the guide and share it with your friends and colleagues. Every comment, suggestion, or opinion we get makes the guide just a little bit better. And we want to have the best possible guide, don't we?