This PR introduces a change to avoid calling ActiveJob::Base#deserialize when enqueueing a job, whilst maintaining existing functionality.
The change is in JobWrapper, which previously only took a hash of job_data and knew how to deserialize it into a job object so we could interrogate it for queue_name, max_attempts, etc. To avoid the roundtrip through serialization, we change JobWrapper#initialize to take either the job object directly or a hash, as it did previously. When it's a hash (needed when running the job in the worker), it still deserializes it into a job object; when it's a job object, we just assign it to @job directly and continue.
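Roughly, the new shape looks like this (a minimal sketch of the idea, not the gem's actual code; the attribute readers are illustrative):

```ruby
class JobWrapper
  attr_reader :job

  # Accepts either the ActiveJob instance itself (the enqueue path) or the
  # serialized job_data hash (the worker path, where only the hash exists).
  def initialize(job_or_data)
    @job =
      if job_or_data.is_a?(ActiveJob::Base)
        job_or_data
      else
        ActiveJob::Base.deserialize(job_or_data)
      end
  end

  # Enqueue-time attributes can now be read straight off the job object,
  # with no serialize/deserialize roundtrip.
  def queue_name
    job.queue_name
  end
end
```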
We discovered this via a second-order effect: we override #deserialize to trigger some logic as the job comes off the queue, but we were seeing that logic run multiple times for a single job execution. We tracked it back to JobClass.perform_later calling #deserialize, and were majorly confused about why it was deserialising in the web process.
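For context, the kind of hook we had in place looked roughly like this (illustrative only; TenantContext and the tenant_id key are made-up stand-ins for our actual logic):

```ruby
class ApplicationJob < ActiveJob::Base
  # Runs when job data is turned back into a job instance. We expected this
  # to happen once, in the worker, as the job came off the queue -- not in
  # the web process at perform_later time.
  def deserialize(job_data)
    super
    TenantContext.restore(job_data["tenant_id"]) # hypothetical side effect
  end
end
```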
This will also make enqueueing jobs faster, because we don't need a roundtrip through #serialize and #deserialize, and we allocate fewer objects. In practice this is likely a small effect, but we get it for free by avoiding the behaviour above.
Looking at all the built-in adapters in ActiveJob, as well as good_job and solid_queue, they all avoid calling ActiveJob::Base#deserialize during enqueueing, even those adapters that, like delayed, support invoking methods on the job instance.
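For comparison, most adapters' enqueue paths have roughly this shape (a generic sketch, not any particular adapter's real code; SomeBackend is made up): attributes are read straight off the job instance and the serialized hash is persisted, without ever deserializing.

```ruby
class SomeQueueAdapter
  def enqueue(job)
    # Attributes come straight off the job instance; the payload is the
    # serialized hash. Nothing here calls ActiveJob::Base#deserialize.
    SomeBackend.push(
      queue: job.queue_name,
      priority: job.priority,
      payload: job.serialize
    )
  end

  def enqueue_at(job, timestamp)
    SomeBackend.push_at(
      timestamp,
      queue: job.queue_name,
      payload: job.serialize
    )
  end
end
```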
From one angle it's an optimization, from another angle it's a bugfix (: