Open carstengehling opened 10 years ago
First try upgrading. If the problem persists, reopen
Sorry - I've tried updating to 4.0.1 and the problem still persist.
2014-05-12T16:57:24+0200: [Worker(delayed_job.1 host:carsten-ThinkPad-X1-Carbon pid:24814)] Job Class#perform (id=19361) RUNNING
2014-05-12T16:57:24+0200: [Worker(delayed_job.0 host:carsten-ThinkPad-X1-Carbon pid:24807)] Job Class#perform (id=19361) RUNNING
2014-05-12T16:57:27+0200: [Worker(delayed_job.0 host:carsten-ThinkPad-X1-Carbon pid:24807)] Job Class#perform (id=19361) COMPLETED after 3.6808
2014-05-12T16:57:28+0200: [Worker(delayed_job.1 host:carsten-ThinkPad-X1-Carbon pid:24814)] Job Class#perform (id=19361) COMPLETED after 3.7396
How do I reopen?
What DJ configuration options do you have set? What is the exact command you are using to run DJ?
From config/initializers/delayed_job.rb:
Delayed::Worker.max_run_time = 24.hours
Delayed::Worker.max_attempts = 1
Delayed::Worker.destroy_failed_jobs = false
The workers are started with:
script/delayed_job start -n 2
I just tried with the latest commits on master branch - same result:
2014-05-12T18:44:24+0200: [Worker(delayed_job.0 host:carsten-ThinkPad-X1-Carbon pid:27915)] Starting job worker
2014-05-12T18:44:24+0200: [Worker(delayed_job.1 host:carsten-ThinkPad-X1-Carbon pid:27922)] Starting job worker
2014-05-12T18:44:26+0200: [Worker(delayed_job.0 host:carsten-ThinkPad-X1-Carbon pid:27915)] Job Class#perform (id=19788) RUNNING
2014-05-12T18:44:26+0200: [Worker(delayed_job.1 host:carsten-ThinkPad-X1-Carbon pid:27922)] Job Class#perform (id=19788) RUNNING
2014-05-12T18:44:28+0200: [Worker(delayed_job.0 host:carsten-ThinkPad-X1-Carbon pid:27915)] Job Class#perform (id=19788) COMPLETED after 1.9184
2014-05-12T18:44:28+0200: [Worker(delayed_job.1 host:carsten-ThinkPad-X1-Carbon pid:27922)] Job Class#perform (id=19788) COMPLETED after 1.9570
2014-05-12T18:44:30+0200: [Worker(delayed_job.0 host:carsten-ThinkPad-X1-Carbon pid:27915)] Job Class#perform (id=19795) RUNNING
2014-05-12T18:44:30+0200: [Worker(delayed_job.1 host:carsten-ThinkPad-X1-Carbon pid:27922)] Job Class#perform (id=19795) RUNNING
2014-05-12T18:44:31+0200: [Worker(delayed_job.0 host:carsten-ThinkPad-X1-Carbon pid:27915)] Job Class#perform (id=19795) COMPLETED after 0.6636
2014-05-12T18:44:31+0200: [Worker(delayed_job.1 host:carsten-ThinkPad-X1-Carbon pid:27922)] Job Class#perform (id=19795) COMPLETED after 0.7026
The job is defined like this:
class AnalysisJob
def self.perform(analysis_id)
analysis = Analysis.find(analysis_id)
if analysis.transport_form == 'private'
PrivateTransportAnalysisProcess.new(analysis).run
else
PublicTransportAnalysisProcess.new(analysis).run
end
end
end
The jobs are created like this:
AnalysisJob.delay.perform(id)
@carstengehling can you create a small sample app that reproduces the issue?
similar thing happening to me :/
I have a lead...
Mysql2::Error: Deadlock found when trying to get lock; try restarting transaction: DELETE FROM `delayed_jobs` WHERE `delayed_jobs`.`id` = 12345
Job runs and fails during row delete, another worker eventually picks it up and runs the job again. Noticed this about 10 times while our queue was processing approximately 4000 jobs. We have 6 workers.
delayed_job (4.0.6) btw
I've added the following index to match the WHERE clause that DJ is using for the row locking and unlocking. This should help with the speed of locking and unlocking rows.
add_index :delayed_jobs, [:run_at, :locked_at, :locked_by, :failed_at], :name => 'index_delayed_jobs_for_better_row_locking'
@jnimety
Running multiple workers, we're running into this Deadlock error frequently as well when there are a ton of jobs in the queue.
mysql2
delayed_job 4.0.6
daemons 1.2.2
Worker(host:app01 pid:4191)] Job Searchable::SearchDestroyJob (id=15992398) FAILED (0 prior attempts) with ActiveRecord::StatementInvalid: Mysql2::Error: Deadlock found when trying to get lock; try restarting transaction: DELETE FROM `delayed_jobs` WHERE `delayed_jobs`.`id` = 15992398
rake aborted!
ActiveRecord::StatementInvalid: Mysql2::Error: Deadlock found when trying to get lock; try restarting transaction: UPDATE `delayed_jobs` SET `last_error` = 'Mysql2::Error: Deadlock found when trying to get lock; try restarting transaction: DELETE FROM `delayed_jobs` WHERE `delayed_jobs`.`id` = 15992398\n
which is causing the workers to crash and require a restart. (we're using the --monitor
option so this happens automatically after some amount of time)
Looks related to https://github.com/collectiveidea/delayed_job_active_record/issues/63
I observed the same symptoms.
Take a look at: https://github.com/collectiveidea/delayed_job_active_record/blob/master/lib/delayed/backend/active_record.rb#L90
I am using the activerecord-postgis-adapter. So for me: ActiveRecord::Base.connection.adapter_name => "PostGIS"
I monkey patched this and have not seen any 2 workers picking up the same job since then.
hth
@sventantau We just ran into this same thing. Thank you for posting this, I would not have found this so soon. I will put a pull request for this.
_Updated: tested with delayedjob 4.0.1 - problem still persists
With the gems:
and PostgreSQL v. 9.1.11
Running 2 workers, they pick up and process the same job:
Is there a way around this?