mhenrixon / sidekiq-unique-jobs

Prevents duplicate Sidekiq jobs
MIT License
1.46k stars 277 forks source link

Unexpected behavior with until_and_while_executing #711

Open baka-san opened 2 years ago

baka-san commented 2 years ago

Describe the bug I'm not sure if this is a bug or if I'm misunderstanding the documentation/usage of :until_and_while_executing. While :until_and_while_executing properly locks jobs which are queued but haven't started executing (e.g. perform_in(1.hour, ...)), it does not seem to correctly lock through execution of the job.

Expected behavior From reading the documentation, I'd expect :until_and_while_executing to lock until execution has started (see until_executed) AND through execution of the job (see while_executing).

Current behavior The first subsequent job will be cancelled correctly, while the next job will not be cancelled, and so on alternating in that manner, i.e.

TestJob.peform_async # I succeed
TestJob.peform_async # I am locked out
TestJob.peform_async # I succeed
TestJob.peform_async # I am locked out
TestJob.peform_async # I succeed
TestJob.peform_async # I am locked out
...

Rails console

The IDs correspond with the Sidekiq output provided below.

Screen Shot 2022-05-09 at 16 58 25

Sidekiq output

I, [2022-05-09T16:57:06.011966 #91225]  INFO -- : Booted Rails 6.1.4.1 application in development environment
I, [2022-05-09T16:57:06.012311 #91225]  INFO -- : Running in ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-darwin20]
I, [2022-05-09T16:57:06.012377 #91225]  INFO -- : See LICENSE and the LGPL-3.0 for licensing details.
I, [2022-05-09T16:57:06.012436 #91225]  INFO -- : Upgrade to Sidekiq Pro for more features and support: https://sidekiq.org
2022-05-09T07:57:06.028Z pid=91225 tid=2151 DEBUG: Executed update_version.lua in 8ms
I, [2022-05-09T16:57:06.029364 #91225]  INFO -- : Start - Upgrading Locks
I, [2022-05-09T16:57:06.029415 #91225]  INFO -- : Start - Converting v6 locks to v7
I, [2022-05-09T16:57:06.029859 #91225]  INFO -- : Done - Converting v6 locks to v7
I, [2022-05-09T16:57:06.029926 #91225]  INFO -- : Start - Deleting v6 keys
I, [2022-05-09T16:57:06.031435 #91225]  INFO -- : Done - Deleting v6 keys
I, [2022-05-09T16:57:06.032151 #91225]  INFO -- : Done - Upgrading Locks
I, [2022-05-09T16:57:06.033966 #91225]  INFO -- : Starting Reaper
I, [2022-05-09T16:57:06.034454 #91225]  INFO -- : Starting processing, hit Ctrl-C to stop
I, [2022-05-09T16:57:06.043081 #91225]  INFO -- : Nothing to delete; exiting.
I, [2022-05-09T16:57:25.801900 #91225]  INFO -- : start
2022-05-09T07:57:25.950Z pid=91225 tid=2a2d class=TestJob jid=5689a6afb32bb04ee6ac7029 uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970 DEBUG: Executed unlock.lua in 4ms
2022-05-09T07:57:25.955Z pid=91225 tid=2a2d class=TestJob jid=5689a6afb32bb04ee6ac7029 uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN DEBUG: Executed queue.lua in 4ms
2022-05-09T07:57:25.957Z pid=91225 tid=2a2d class=TestJob jid=5689a6afb32bb04ee6ac7029 uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN DEBUG: Executed lock.lua in 2ms
I, [2022-05-09T16:57:25.958052 #91225]  INFO -- : === WENT TO SLEEP AT 2022-05-09 16:57:25 +0900 ===
I, [2022-05-09T16:57:29.200526 #91225]  INFO -- : start
2022-05-09T07:57:29.204Z pid=91225 tid=2a05 class=TestJob jid=51d7d5cff51302aabc4ff927 uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970 DEBUG: Executed unlock.lua in 0ms
2022-05-09T07:57:29.208Z pid=91225 tid=2a05 class=TestJob jid=51d7d5cff51302aabc4ff927 uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN DEBUG: Executed queue.lua in 4ms
I, [2022-05-09T16:57:29.208632 #91225]  INFO -- : Adding dead TestJob job 51d7d5cff51302aabc4ff927
2022-05-09T07:57:29.218Z pid=91225 tid=2a05 class=TestJob jid=51d7d5cff51302aabc4ff927 uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN DEBUG: Executed delete_by_digest.lua in 9ms
I, [2022-05-09T16:57:29.219059 #91225]  INFO -- : delete_by_digest(uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN) completed in 9ms
I, [2022-05-09T16:57:29.219984 #91225]  INFO -- : done
I, [2022-05-09T16:57:32.783395 #91225]  INFO -- : start
2022-05-09T07:57:32.788Z pid=91225 tid=2a05 class=TestJob jid=d0f08f9b041fa2af4b1480e2 uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970 DEBUG: Executed unlock.lua in 1ms
2022-05-09T07:57:32.790Z pid=91225 tid=2a05 class=TestJob jid=d0f08f9b041fa2af4b1480e2 uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN DEBUG: Executed queue.lua in 2ms
2022-05-09T07:57:32.794Z pid=91225 tid=2a05 class=TestJob jid=d0f08f9b041fa2af4b1480e2 uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN DEBUG: Executed lock.lua in 1ms
I, [2022-05-09T16:57:32.795051 #91225]  INFO -- : === WENT TO SLEEP AT 2022-05-09 16:57:32 +0900 ===
I, [2022-05-09T16:57:37.147711 #91225]  INFO -- : start
2022-05-09T07:57:37.151Z pid=91225 tid=2a65 class=TestJob jid=b1943fd1926e317418bffc9c uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970 DEBUG: Executed unlock.lua in 1ms
2022-05-09T07:57:37.153Z pid=91225 tid=2a65 class=TestJob jid=b1943fd1926e317418bffc9c uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN DEBUG: Executed queue.lua in 2ms
I, [2022-05-09T16:57:37.153571 #91225]  INFO -- : Adding dead TestJob job b1943fd1926e317418bffc9c
2022-05-09T07:57:37.154Z pid=91225 tid=2a65 class=TestJob jid=b1943fd1926e317418bffc9c uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN DEBUG: Executed delete_by_digest.lua in 1ms
I, [2022-05-09T16:57:37.155008 #91225]  INFO -- : delete_by_digest(uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN) completed in 1ms
I, [2022-05-09T16:57:37.162971 #91225]  INFO -- : done
I, [2022-05-09T16:57:42.646882 #91225]  INFO -- : start
2022-05-09T07:57:42.656Z pid=91225 tid=2a5d class=TestJob jid=b178f6061059795239c1c92a uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970 DEBUG: Executed unlock.lua in 1ms
2022-05-09T07:57:42.657Z pid=91225 tid=2a5d class=TestJob jid=b178f6061059795239c1c92a uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN DEBUG: Executed queue.lua in 0ms
2022-05-09T07:57:42.659Z pid=91225 tid=2a5d class=TestJob jid=b178f6061059795239c1c92a uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN DEBUG: Executed lock.lua in 1ms
I, [2022-05-09T16:57:42.659622 #91225]  INFO -- : === WENT TO SLEEP AT 2022-05-09 16:57:42 +0900 ===
I, [2022-05-09T16:57:49.352436 #91225]  INFO -- : start
2022-05-09T07:57:49.355Z pid=91225 tid=2a65 class=TestJob jid=d6ed1322b22058b14a9d0b39 uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970 DEBUG: Executed unlock.lua in 1ms
2022-05-09T07:57:49.357Z pid=91225 tid=2a65 class=TestJob jid=d6ed1322b22058b14a9d0b39 uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN DEBUG: Executed queue.lua in 1ms
I, [2022-05-09T16:57:49.357213 #91225]  INFO -- : Adding dead TestJob job d6ed1322b22058b14a9d0b39
2022-05-09T07:57:49.358Z pid=91225 tid=2a65 class=TestJob jid=d6ed1322b22058b14a9d0b39 uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN DEBUG: Executed delete_by_digest.lua in 0ms
I, [2022-05-09T16:57:49.358624 #91225]  INFO -- : delete_by_digest(uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN) completed in 0ms
I, [2022-05-09T16:57:49.360029 #91225]  INFO -- : done
I, [2022-05-09T16:58:25.963455 #91225]  INFO -- : === WOKE UP AT 2022-05-09 16:58:25 +0900 ===
I, [2022-05-09T16:58:25.964842 #91225]  INFO -- : done
I, [2022-05-09T16:58:32.801979 #91225]  INFO -- : === WOKE UP AT 2022-05-09 16:58:32 +0900 ===
I, [2022-05-09T16:58:32.817964 #91225]  INFO -- : done
I, [2022-05-09T16:58:42.665078 #91225]  INFO -- : === WOKE UP AT 2022-05-09 16:58:42 +0900 ===
I, [2022-05-09T16:58:42.677064 #91225]  INFO -- : done
I, [2022-05-09T17:02:59.172800 #91225]  INFO -- : start
2022-05-09T08:02:59.177Z pid=91225 tid=2a05 class=TestJob jid=593b5e6f5be2576a2af3d8bf uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970 DEBUG: Executed unlock.lua in 0ms
2022-05-09T08:02:59.179Z pid=91225 tid=2a05 class=TestJob jid=593b5e6f5be2576a2af3d8bf uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN DEBUG: Executed queue.lua in 1ms
2022-05-09T08:02:59.181Z pid=91225 tid=2a05 class=TestJob jid=593b5e6f5be2576a2af3d8bf uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN DEBUG: Executed lock.lua in 0ms
I, [2022-05-09T17:02:59.181403 #91225]  INFO -- : === WENT TO SLEEP AT 2022-05-09 17:02:59 +0900 ===
I, [2022-05-09T17:03:59.187243 #91225]  INFO -- : === WOKE UP AT 2022-05-09 17:03:59 +0900 ===
2022-05-09T08:03:59.188Z pid=91225 tid=2a05 class=TestJob jid=593b5e6f5be2576a2af3d8bf uniquejobs=server until_and_while_executing=uniquejobs:fd25cf3971a244b8d59c48317e631970:RUN DEBUG: Executed unlock.lua in 0ms
I, [2022-05-09T17:03:59.189842 #91225]  INFO -- : done
^CI, [2022-05-09T17:06:23.505453 #91225]  INFO -- : Shutting down
I, [2022-05-09T17:06:23.506863 #91225]  INFO -- : Scheduler exiting...
I, [2022-05-09T17:06:23.506958 #91225]  INFO -- : Terminating quiet threads
I, [2022-05-09T17:06:23.507054 #91225]  INFO -- : Scheduler exiting...
I, [2022-05-09T17:06:23.507809 #91225]  INFO -- : Stopping Reaper
I, [2022-05-09T17:06:23.611472 #91225]  INFO -- : Pausing to allow jobs to finish...
I, [2022-05-09T17:06:25.382892 #91225]  INFO -- : Bye!

Sidekiq panel

Note that complete here is indicated the job is dead, as handled by sidekiq-status.

Screen Shot 2022-05-09 at 16 58 02

config/initializers/sidekiq.rb

require "sidekiq"
require "sidekiq-status"
require "sidekiq-unique-jobs"

Sidekiq.configure_server do |config|
  config.redis = { url: Rails.application.credentials.redis[:full_url] }

  config.client_middleware do |chain|
    chain.add SidekiqUniqueJobs::Middleware::Client
    chain.add Sidekiq::Status::ClientMiddleware, expiration: 90.days
  end

  config.server_middleware do |chain|
    chain.add Sidekiq::Status::ServerMiddleware, expiration: 90.days
    chain.add SidekiqUniqueJobs::Middleware::Server
  end

  SidekiqUniqueJobs::Server.configure(config)
end

Sidekiq.configure_client do |config|
  config.redis = { url: Rails.application.credentials.redis[:full_url] }

  config.client_middleware do |chain|
    chain.add SidekiqUniqueJobs::Middleware::Client
    chain.add Sidekiq::Status::ClientMiddleware, expiration: 90.days
  end
end

app/sidekiq/test_job.rb

class TestJob
  include Sidekiq::Job
  include Sidekiq::Status::Worker
  include Sidekiq::Worker

  sidekiq_options(
    retry: 0,
    queue: "default",
    lock: :until_and_while_executing,
    # lock_timeout: nil/>0, # has no effect
    on_conflict: { client: :log, server: :reject }
  )

  def perform
    logger.info "=== WENT TO SLEEP AT #{Time.now} ==="
    sleep 60
    logger.info "=== WOKE UP AT #{Time.now} ==="
  end
end

Additional context Desired usage:
1. Cron schedules a job MyJob every ~10 minutes to update data and do various work.
2. The amount of data handled via MyJob can vary considerably and thus may take over 10 minutes.
3a. In the case that 10 minutes has elapsed and the previous MyJob is complete, MyJob should be run again.
3b. In the case that MyJob had a lot of work to do and still hasn't finished after 10 minutes, the new MyJob should be cancelled and not retried. The same process will repeat every 10 minutes until the original MyJob instance has finished and a new one is then executed.

mhenrixon commented 2 years ago

Please checkout v7.1.26 https://github.com/mhenrixon/sidekiq-unique-jobs/releases/tag/v7.1.26 as it has quite a few fixes for until and while executing

mhenrixon commented 2 years ago

Also, if you are still having trouble: https://github.com/mhenrixon/sidekiq-unique-jobs/releases/tag/v7.1.27 (especially if you are using redis-namespace).

mario-amazing commented 1 year ago

IMHO, the expected behavior is that a new job can only be created after the previous job is completed, @mhenrixon, am I right? Cuz I can create 2 jobs(1 runinng 1 pending), how can I fix it?

class XlsxAppointmentsUpserterJob
  include Sidekiq::Job

  sidekiq_options queue: 'exports', retry: 5, lock: :until_and_while_executing, on_conflict: { client: :log, server: :reject }

  def perform
    require 'pry'; binding.pry
  end
end

https://user-images.githubusercontent.com/9537753/217018449-f9747ebd-0c53-4c29-838e-d446877a80dd.mov

Rails 7 sidekiq (7.0.3) sidekiq-unique-jobs (8.0.0)

mhenrixon commented 1 year ago

Cuz I can create 2 jobs(1 runinng 1 pending), how can I fix it?

Use until_executed

jpriollaud commented 2 weeks ago

This also threw me for quite a while. I expected "new job can only be created after the previous job is completed". This is probably the cause of a lot of confusion and open issues around until_and_while_executing

And the readme appears incorrect: https://github.com/mhenrixon/sidekiq-unique-jobs?tab=readme-ov-file#until-executed

The lock is not removed until the perform is actually completed.

MeterSoft commented 1 week ago

have configuration like this sidekiq_options lock: :until_executed But when i create new job it is able to add it to queue and start multiple times becaue each new job has different digest. My job params is empty and i dont know why digest is different.

So after investigation i found that it use all lock args

  [{"job_class"=>"Test::PortfolioJob",
    "job_id"=>"26ee041e-d8d9-40e9-bc70-ed0ded2ed3ee",
    "provider_job_id"=>nil,
    "queue_name"=>"default",
    "priority"=>nil,
    "arguments"=>[],
    "executions"=>0,
    "exception_executions"=>{},
    "locale"=>"en",
    "timezone"=>"UTC",
    "enqueued_at"=>"2024-09-04T11:22:16.046505000Z",
    "scheduled_at"=>"2024-09-04T11:23:16.046320000Z"}]}

That is reason why digest is different and it allow multiple jobs

I can not find way how generate it only based on queue and class name