Closed jdudley1123 closed 1 year ago
Just a thought before I dig in deeper: can you double check that the queue adapter in your test case is GoodJob's and not the Active Job Test Adapter?
config/environments/test.rb
has config.active_job.queue_adapter = :good_job
, and checking the value of ActiveJob::Base.queue_adapter
during the test returns an instance of ActiveJob::QueueAdapters::GoodJobAdapter
Thanks! Can you doublecheck the value of ActiveDirectoryIdChangeRequestJob.queue_adapter
? I want to specifically rule out:
ActiveDirectoryIdChangeRequestJob.queue_adapter
is also returning an instance of ActiveJob::QueueAdapters::GoodJobAdapter
I'm afraid
Ok, could you try adding your stub assertion directly onto GoodJob._on_thread_error
? (with the leading underscore) That will help me see if the problem here is the configuration initializing at the wrong time.
allow(GoodJob).to receive(:_on_thread_error)
allow(Person).to receive(:find).and_raise(RuntimeError, error_message)
job.perform_later
expect(GoodJob).to have_received(:_on_thread_error).with(error_message)
Updating the test to include those lines:
describe 'when an error occurs during a job' do
it 'logs an error' do
allow(ErrorLogger).to receive(:log)
allow(GoodJob).to receive(:_on_thread_error)
allow(Person).to receive(:find).and_raise(RuntimeError, error_message)
job.perform_later
expect(GoodJob).to have_received(:_on_thread_error).with(error_message)
expect(ErrorLogger).to have_received(:log).with(error_message)
end
end
I get:
Failures:
1) GoodJob when an error occurs during a job logs an error
Failure/Error: expect(GoodJob).to have_received(:_on_thread_error).with(error_message)
(GoodJob)._on_thread_error("An error has occurred!")
expected: 1 time with arguments: ("An error has occurred!")
received: 0 times
Thanks so much for taking the time to investigate this š
Hmm, this is really strange. I'm not quite sure why that isn't being triggered. I just added a test to simulate this on my side: https://github.com/bensheldon/good_job/pull/1104
Would you be able to inspect the enqueued GoodJob Job? You could do GoodJob::Job.last
to give you the record. Does it have the error info?
Inspecting GoodJob::Job.last
after job.perform_later
is called gets me the following, with no evidence of the error:
#<GoodJob::Job:0x00007fad9661deb0
id: "9adb9586-35f9-42e9-94cb-1d180af7b895",
queue_name: "default",
priority: 0,
serialized_params:
{"job_id"=>"9adb9586-35f9-42e9-94cb-1d180af7b895",
"locale"=>"en",
"priority"=>nil,
"timezone"=>"UTC",
"arguments"=>[],
"job_class"=>"ActiveDirectoryIdChangeRequestJob",
"executions"=>0,
"queue_name"=>"default",
"enqueued_at"=>"2023-10-12T09:00:09Z",
"provider_job_id"=>nil,
"exception_executions"=>{}},
scheduled_at: Thu, 12 Oct 2023 09:00:09.654835000 UTC +00:00,
performed_at: Thu, 12 Oct 2023 09:00:09.670314000 UTC +00:00,
finished_at: Thu, 12 Oct 2023 09:00:12.078229000 UTC +00:00,
error: nil,
created_at: Thu, 12 Oct 2023 09:00:09.654835000 UTC +00:00,
updated_at: Thu, 12 Oct 2023 09:00:12.080652000 UTC +00:00,
active_job_id: "9adb9586-35f9-42e9-94cb-1d180af7b895",
concurrency_key: nil,
cron_key: nil,
retried_good_job_id: nil,
cron_at: nil,
batch_id: nil,
batch_callback_id: nil,
is_discrete: true,
executions_count: 1,
job_class: "ActiveDirectoryIdChangeRequestJob",
error_event: nil>
@jdudley1123 huh! That would imply to me that there is a rescue_from
or some other around_perform
block that is intercepting the exception. My reading of that job data is that the job completed successfully (meaning no exception was raised all the way up to the adapter).
Apologies, I think I muddled some things up while trying various different configurations.
Job:
# app/jobs/active_directory_id_change_request_job.rb
class ActiveDirectoryIdChangeRequestJob < ApplicationJob
def perform()
Person.find(1)
end
end
GoodJob config:
# config/initalizers/hood_job.rb
Rails.application.configure do
config.good_job = {
on_thread_error: ->(e) { ErrorLogger.log(e.message) },
}
end
Test:
# spec/jobs/job_error_logging_spec.rb
require 'rails_helper'
RSpec.describe 'GoodJob' do
let(:job) { ActiveDirectoryIdChangeRequestJob } # This could be any job
let(:error_message) { 'An error has occurred!' }
describe 'when an error occurs during a job' do
it 'logs an error' do
allow(ErrorLogger).to receive(:log)
allow(GoodJob).to receive(:_on_thread_error)
allow(Person).to receive(:find).and_raise(RuntimeError, error_message)
begin
job.perform_later
rescue
end
# Checking GoodJob::Job.last here
expect(GoodJob).to have_received(:_on_thread_error).with(error_message)
expect(ErrorLogger).to have_received(:log).with(error_message)
end
end
end
GoodJob::Job.last
output
#<GoodJob::Job:0x00007f8c7d5e4780
id: "1ad5541f-a63b-45d2-85d0-8dcac6da1af7",
queue_name: "default",
priority: 0,
serialized_params:
{"job_id"=>"1ad5541f-a63b-45d2-85d0-8dcac6da1af7",
"locale"=>"en",
"priority"=>nil,
"timezone"=>"UTC",
"arguments"=>[],
"job_class"=>"ActiveDirectoryIdChangeRequestJob",
"executions"=>0,
"queue_name"=>"default",
"enqueued_at"=>"2023-10-19T16:30:32Z",
"provider_job_id"=>nil,
"exception_executions"=>{}},
scheduled_at: Thu, 19 Oct 2023 16:30:32.727841000 UTC +00:00,
performed_at: Thu, 19 Oct 2023 16:30:32.778566000 UTC +00:00,
finished_at: Thu, 19 Oct 2023 16:30:35.056818000 UTC +00:00,
error: "RuntimeError: An error has occurred!",
created_at: Thu, 19 Oct 2023 16:30:32.727841000 UTC +00:00,
updated_at: Thu, 19 Oct 2023 16:30:35.062804000 UTC +00:00,
active_job_id: "1ad5541f-a63b-45d2-85d0-8dcac6da1af7",
concurrency_key: nil,
cron_key: nil,
retried_good_job_id: nil,
cron_at: nil,
batch_id: nil,
batch_callback_id: nil,
is_discrete: true,
executions_count: 1,
job_class: "ActiveDirectoryIdChangeRequestJob",
error_event: 1>
Test outcome:
Failures:
1) GoodJob when an error occurs during a job logs an error
Failure/Error: expect(GoodJob).to have_received(:_on_thread_error).with(error_message)
(GoodJob)._on_thread_error("An error has occurred!")
expected: 1 time with arguments: ("An error has occurred!")
received: 0 times
Sorry, I think I may have also sent you on a goose chase too.
Seeing your begin / rescue
block in your test made me realize: oh, this is running in :inline
mode, which means that GoodJob raises exceptions; GoodJob doesn't report exceptions in :inline
mode to on_thread_error
because it's not running in a background thread.
If you wanted to test that (and I don't believe it's necessary) you could do this as a test:
RSpec.describe do
let(:job) { ActiveDirectoryIdChangeRequestJob } # This could be any job
let(:error_message) { 'An error has occurred!' }
describe 'when an error occurs during a job' do
it 'logs an error' do
capsule = GoodJob::Capsule.new(configuration: GoodJob::Configuration.new({ max_threads: 5, queue_string: '*' }))
adapter = GoodJob::Adapter.new(execution_mode: :async_all, _capsule: capsule)
allow(ActiveDirectoryIdChangeRequestJob).to receive(:queue_adapter).and_return adapter
allow(ErrorLogger).to receive(:log)
allow(Person).to receive(:find).and_raise(RuntimeError, error_message)
job.perform_later
expect(GoodJob::Job.count).to eq 1 # double-check it got enqueued
sleep 0.1 until GoodJob::Job.finished.count == 0
capsule.shutdown
expect(ErrorLogger).to have_received(:log).with(error_message)
end
end
end
Hmm, that test still doesn't pass for me ā the failure is the final assertion about the error logger receiving the error message, which suggests GoodJob still isn't calling on_thread_error
even in :async_all
mode.
Here's my Calendly if we can debug it together: https://calendly.com/bensheldon/office-hours
@jdudley1123 that was a fun and fruitful debugging session!
To summarize, and hopefully come up in search for anyone in the future, we discovered:
The Application and Rspec tests used DatabaseCleaner to create transactional fixtures (DatabaseCleaner.strategy = :transaction
), and disabled Rails' built-in transactional fixtures.
Unfortunately DatabaseCleaner's transactional fixtures do not have the incredibly helpful feature of Rails 5.1+ sharing a database connection across all threads.
The core problem: Within the test, the job was enqueued and the job record inserted in the test, in an open transaction within one database connection. But then a background thread used a different database connection to try to query the job record, and never found it, because it was in the transaction opened by the other database connection.
My suggestion was to remove DatabaseCleaner because I don't think it's necessary anymore because Rails test fixtures are so good and work with System/Integration tests, and if you do truly need to truncate, it's rather trivial to do directly with Rails as GoodJob does itself.
Thanks so much for all your help @bensheldon! The test passes nicely now, and removing Database Cleaner had no effect on the rest of the suite.
I have GoodJob (v3.19.4) on Rails (v6.1.7.6) configured to log errors:
That all seems to work fine, but when I try to write a test for it, the test always fails.
However, if I do the error logging in
ApplicationJob
, the test passes:That suggests to me that
GoodJob.on_thread_error
isn't being called, but I can't figure out why not. The documentation says that it won't be called if there are an infinite number of retries, but in my set up there are no retries. What am I missing?