bensheldon / good_job

Multithreaded, Postgres-based, Active Job backend for Ruby on Rails.
https://goodjob-demo.herokuapp.com/
MIT License
2.53k stars 190 forks source link

"Couldn't find GoodJob::BatchRecord" error #1387

Closed jesseduffield closed 1 week ago

jesseduffield commented 1 week ago

I'm getting this error on some of my batch callback jobs:

{"_aj_globalid"=>"gid://subble-api/GoodJob::Batch/e4b54d15-a2a4-4778-bf3c-5ce1de3f366f"}, {"event"=>{"value"=>"finish", "_aj_serialized"=>"ActiveJob::Serializers::SymbolSerializer"}, "_aj_symbol_keys"=>["event"]}

Error: ActiveJob::DeserializationError: Error while trying to deserialize arguments: Couldn't find GoodJob::BatchRecord with 'id'=e4b54d15-a2a4-4778-bf3c-5ce1de3f366f

/usr/local/bundle/ruby/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/core.rb:253:in `find'   /usr/local/bundle/ruby/3.3.0/gems/good_job-3.28.0/app/models/good_job/batch.rb:67:in `find'   /usr/local/bundle/ruby/3.3.0/gems/globalid-1.2.1/lib/global_id/locator.rb:162:in `locate'   /usr/local/bundle/ruby/3.3.0/gems/globalid-1.2.1/lib/global_id/locator.rb:210:in `block in locate'   /usr/local/bundle/ruby/3.3.0/gems/globalid-1.2.1/lib/global_id/locator.rb:222:in `unscoped'   /usr/local/bundle/ruby/3.3.0/gems/globalid-1.2.1/lib/global_id/locator.rb:210:in `locate'   /usr/local/bundle/ruby/3.3.0/gems/globalid-1.2.1/lib/global_id/locator.rb:31:in `locate'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/arguments.rb:145:in `deserialize_global_id'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/arguments.rb:129:in `deserialize_argument'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/arguments.rb:43:in `block in deserialize'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/arguments.rb:43:in `map'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/arguments.rb:43:in `deserialize'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/core.rb:210:in `deserialize_arguments'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/core.rb:200:in `deserialize_arguments_if_needed'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/execution.rb:50:in `perform_now'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/instrumentation.rb:26:in `block in perform_now'   /usr/local/bundle/ruby/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/railties/job_runtime.rb:13:in `block in instrument'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/instrumentation.rb:40:in `block in instrument'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/notifications.rb:206:in `block in instrument'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/notifications/instrumenter.rb:58:in `instrument'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/notifications.rb:206:in `instrument'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/instrumentation.rb:39:in `instrument'   /usr/local/bundle/ruby/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/railties/job_runtime.rb:11:in `instrument'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/instrumentation.rb:26:in `perform_now'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/logging.rb:18:in `block in perform_now'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/tagged_logging.rb:135:in `block in tagged'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/tagged_logging.rb:39:in `tagged'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/tagged_logging.rb:135:in `tagged'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/broadcast_logger.rb:240:in `method_missing'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/logging.rb:25:in `tag_logger'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/logging.rb:18:in `perform_now'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/execution.rb:30:in `block in execute'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/callbacks.rb:121:in `block in run_callbacks'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/railtie.rb:67:in `block (4 levels) in <class:Railtie>'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/reloader.rb:77:in `block in wrap'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/execution_wrapper.rb:88:in `wrap'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/reloader.rb:74:in `wrap'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/railtie.rb:66:in `block (3 levels) in <class:Railtie>'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/callbacks.rb:130:in `instance_exec'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/callbacks.rb:130:in `block in run_callbacks'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/callbacks.rb:141:in `run_callbacks'   /usr/local/bundle/ruby/3.3.0/gems/activejob-7.1.3.4/lib/active_job/execution.rb:28:in `execute'   /usr/local/bundle/ruby/3.3.0/gems/good_job-3.28.0/app/models/good_job/execution.rb:414:in `block (3 levels) in perform'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/notifications.rb:206:in `block in instrument'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/notifications/instrumenter.rb:58:in `instrument'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/notifications.rb:206:in `instrument'   /usr/local/bundle/ruby/3.3.0/gems/good_job-3.28.0/app/models/good_job/execution.rb:413:in `block (2 levels) in perform'   /usr/local/bundle/ruby/3.3.0/gems/good_job-3.28.0/lib/good_job/current_thread.rb:113:in `within'   /usr/local/bundle/ruby/3.3.0/gems/good_job-3.28.0/app/models/good_job/execution.rb:376:in `block in perform'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/callbacks.rb:121:in `block in run_callbacks'   /usr/local/bundle/ruby/3.3.0/gems/good_job-3.28.0/app/models/good_job/batch.rb:78:in `within_thread'   /usr/local/bundle/ruby/3.3.0/gems/good_job-3.28.0/app/models/good_job/execution.rb:576:in `reset_batch_values'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/callbacks.rb:130:in `block in run_callbacks'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/callbacks.rb:141:in `run_callbacks'   /usr/local/bundle/ruby/3.3.0/gems/good_job-3.28.0/app/models/good_job/execution.rb:371:in `perform'   /usr/local/bundle/ruby/3.3.0/gems/good_job-3.28.0/app/models/good_job/execution.rb:273:in `block in perform_with_advisory_lock'   /usr/local/bundle/ruby/3.3.0/gems/good_job-3.28.0/app/models/concerns/good_job/advisory_lockable.rb:172:in `block in with_advisory_lock'   /usr/local/bundle/ruby/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/relation.rb:929:in `_scoping'   /usr/local/bundle/ruby/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/relation.rb:467:in `scoping'   /usr/local/bundle/ruby/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/scoping/default.rb:51:in `unscoped'   /usr/local/bundle/ruby/3.3.0/gems/good_job-3.28.0/app/models/concerns/good_job/advisory_lockable.rb:172:in `with_advisory_lock'   /usr/local/bundle/ruby/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/relation/delegation.rb:79:in `block in with_advisory_lock'   /usr/local/bundle/ruby/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/relation.rb:929:in `_scoping'   /usr/local/bundle/ruby/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/relation.rb:467:in `scoping'   /usr/local/bundle/ruby/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/relation/delegation.rb:79:in `with_advisory_lock'   /usr/local/bundle/ruby/3.3.0/gems/good_job-3.28.0/app/models/good_job/execution.rb:269:in `perform_with_advisory_lock'   /usr/local/bundle/ruby/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/relation/delegation.rb:79:in `block in perform_with_advisory_lock'   /usr/local/bundle/ruby/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/relation.rb:929:in `_scoping'   /usr/local/bundle/ruby/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/relation.rb:467:in `scoping'   /usr/local/bundle/ruby/3.3.0/gems/activerecord-7.1.3.4/lib/active_record/relation/delegation.rb:79:in `perform_with_advisory_lock'   /usr/local/bundle/ruby/3.3.0/gems/good_job-3.28.0/lib/good_job/job_performer.rb:33:in `next'   /usr/local/bundle/ruby/3.3.0/gems/good_job-3.28.0/lib/good_job/scheduler.rb:276:in `block (2 levels) in create_task'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/reloader.rb:77:in `block in wrap'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/execution_wrapper.rb:92:in `wrap'   /usr/local/bundle/ruby/3.3.0/gems/activesupport-7.1.3.4/lib/active_support/reloader.rb:74:in `wrap'   /usr/local/bundle/ruby/3.3.0/gems/good_job-3.28.0/lib/good_job/scheduler.rb:275:in `block in create_task'   /usr/local/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.1/lib/concurrent-ruby/concurrent/executor/safe_task_executor.rb:24:in `block in execute'   /usr/local/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.1/lib/concurrent-ruby/concurrent/synchronization/mutex_lockable_object.rb:48:in `block in synchronize'   /usr/local/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.1/lib/concurrent-ruby/concurrent/synchronization/mutex_lockable_object.rb:48:in `synchronize'   /usr/local/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.1/lib/concurrent-ruby/concurrent/synchronization/mutex_lockable_object.rb:48:in `synchronize'   /usr/local/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.1/lib/concurrent-ruby/concurrent/executor/safe_task_executor.rb:22:in `execute'   /usr/local/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.1/lib/concurrent-ruby/concurrent/ivar.rb:170:in `safe_execute'   /usr/local/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.1/lib/concurrent-ruby/concurrent/scheduled_task.rb:298:in `process_task'   /usr/local/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.1/lib/concurrent-ruby/concurrent/executor/timer_set.rb:166:in `block (2 levels) in process_tasks'   /usr/local/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.1/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:359:in `run_task'   /usr/local/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.1/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:350:in `block (3 levels) in create_worker'   <internal:kernel>:187:in `loop'   /usr/local/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.1/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:341:in `block (2 levels) in create_worker'   /usr/local/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.1/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:340:in `catch'   /usr/local/bundle/ruby/3.3.0/gems/concurrent-ruby-1.3.1/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:340:in `block in create_worker'``

This happened when I was having some DB outage issues, not sure if it's related.

bensheldon commented 1 week ago

That's strange! That looks like a Batch callback job tried to run, and it was unable to load the associated Batch record out of the database.

Are you still experiencing the error / did that job ever successfully complete?

It's possible this was related to your DB outage. Normally though the cause of that exception would look something like (example):


ActiveJob::DeserializationError: Error while trying to deserialize arguments: PG::ConnectionBad: PQsocket() can't get socket descriptor`
jesseduffield commented 1 week ago

I'm no longer experiencing the error, so the issue may indeed have been specific to the outage. I'm happy to close this and report back if it happens again.

jesseduffield commented 5 days ago

@bensheldon I've just had this error happen again: this time without any outage or DB issues. I'm wondering if the cause could be due to a job in the batch retrying for a while before finally succeeding? Perhaps the batch is cleaned up in the meantime, so that when the retrying job eventually succeeds, the batch is gone?

bensheldon commented 4 days ago

Hmm. That's definitely possible if the batch callback job doesn't successfully complete before the batch is deleted. The default time to preserve jobs (the same setting is used for batches) is 2 weeks.