ixti / sidekiq-throttled

Concurrency and rate-limit throttling for Sidekiq
MIT License

Allow configuring whether throttled jobs are put back on the queue immediately or scheduled for the future #150

Open lavaturtle opened 1 year ago

lavaturtle commented 1 year ago

This adds support for setting the requeue_strategy for a job, to specify what we should do with throttled jobs. The default is :enqueue, which is the current behavior: re-add it to the end of the queue. The other option is :schedule, which schedules the job for a time in the future when we think we'll have capacity to process it.

It's also possible to set the default_requeue_strategy in the configuration, to set this behavior for all jobs that do not individually specify a requeue_strategy.
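
For illustration, here is a minimal sketch of how those two settings might be used together; the option placement and setter names are assumptions on my part, and the syntax the PR finally lands on is shown further down the thread:

Sidekiq::Throttled.configuration.default_requeue_strategy = :schedule

class MyThrottledJob
  include Sidekiq::Job
  include Sidekiq::Throttled::Job

  # per-job override of the global default (hypothetical option placement)
  sidekiq_throttle concurrency: { limit: 10 }, requeue_strategy: :enqueue

  def perform(*); end
end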

This may be relevant to Issue #36.

Unrelatedly, there's a commit in here that changes the format of some require_relative lines, to comply with the new Rubocop rule Style/RedundantCurrentDirectoryInPath. I don't feel strongly about this commit; I only added it so that rubocop would pass.

ixti commented 1 year ago

That is awesome! Thank you! I will take a look this weekend.

ixti commented 1 year ago

Re: Style/RedundantCurrentDirectoryInPath

I prefer consistency. It is pretty weird to me that rubocop's default for rescue is to be explicit, while for require_relative it's implicit. Let's just disable this cop. I will find time to open a rubocop PR to make that configurable. But for now, I would like it to be simply disabled.
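
For reference, disabling the cop project-wide is a two-line entry in .rubocop.yml (standard RuboCop configuration):

# .rubocop.yml
Style/RedundantCurrentDirectoryInPath:
  Enabled: false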

On a side note about this PR, I think I made a bad decision back when I started this gem by providing a custom sidekiq_throttle singleton class method; we should probably utilize the sidekiq_class_attribute helper instead:

def self.included(base)
  base.sidekiq_class_attribute :sidekiq_throttle_push_back # :enqueue | :schedule
end

In this case we will not need to bother about "keeping track" of job class inheritance.
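
For context, a quick illustration of that inheritance benefit, assuming the attribute is wired up exactly as in the snippet above (the job classes here are hypothetical):

class BaseJob
  include Sidekiq::Job
  include Sidekiq::Throttled::Job

  self.sidekiq_throttle_push_back = :schedule
end

class ChildJob < BaseJob
end

ChildJob.sidekiq_throttle_push_back # => :schedule, inherited without any extra bookkeeping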

lavaturtle commented 1 year ago

> Re: Style/RedundantCurrentDirectoryInPath
>
> I prefer consistency. It is pretty weird to me that rubocop's default for rescue is to be explicit, while for require_relative it's implicit. Let's just disable this cop. I will find time to open a rubocop PR to make that configurable. But for now, I would like it to be simply disabled.

Makes sense! I've updated the PR to disable the cop instead of applying it.

lavaturtle commented 1 year ago

> On a side note about this PR, I think I made a bad decision back when I started this gem by providing a custom sidekiq_throttle singleton class method; we should probably utilize the sidekiq_class_attribute helper instead:

Ah, interesting! I didn't know about sidekiq_class_attribute. So if we do that, then a job class can specify sidekiq_throttle_push_back in its sidekiq_options, and then we can read it later with get_sidekiq_options?

Would you prefer push_back as the setting name rather than requeue_strategy?

ixti commented 1 year ago

sidekiq_class_attribute :sidekiq_throttle_push_back will define a new singleton class method, so the usage will look like:

class MyJob
  include Sidekiq::Job
  include Sidekiq::Throttled::Job

  sidekiq_throttled_push_back :enqueue
end

Naturally, we should check if the value is a Proc, and if so, call that proc to get the value, so that it will be possible to do:

class MyJob
  include Sidekiq::Job
  include Sidekiq::Throttled::Job

  sidekiq_throttled_push_back ->(user_id, *) { user_id.odd? ? :enqueue : :schedule }

  def perform(user_id, more, stuff)
    # ...
  end
end

ixti commented 1 year ago

I have no strong feelings about push_back vs requeue_strategy. I just don't like the overuse of strategy in names (easy to confuse with Throttled::Strategy). Perhaps simply sidekiq_throttled_requeue_with?

Also, I just realized that it would be nice to allow specifying which queue the job should be requeued to.

lavaturtle commented 1 year ago

Hmm, I've been working on getting the sidekiq_class_attribute approach to work, and as far as I can tell from my testing, if we do this in Sidekiq::Throttled::Job:

      def self.included(worker)
        worker.send(:extend, ClassMethods)
        worker.sidekiq_class_attribute :sidekiq_throttled_requeue_with  # :enqueue | :schedule
      end

then the job class needs to use it like this:

      class MyThrottledJob
        include Sidekiq::Job
        include Sidekiq::Throttled::Job

        self.sidekiq_throttled_requeue_with = :schedule
        sidekiq_throttle foo: :bar

i.e. it ends up as a class-level variable that needs to be assigned, rather than being something we can use like sidekiq_throttled_requeue_with :schedule.

lavaturtle commented 1 year ago

So maybe we invoke sidekiq_class_attribute for the inheritance benefits, but instead of users using sidekiq_throttled_requeue_with in the class directly, we keep the requeue_with option for sidekiq_throttle? That also avoids any weirdness around which order the two are called in.
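
Something like this rough sketch of the idea (the names and the Registry call here are illustrative, not the PR's exact code):

module Sidekiq
  module Throttled
    module Job
      def self.included(base)
        base.extend(ClassMethods)
        # the class attribute gives us inheritance handling for free
        base.sidekiq_class_attribute :sidekiq_throttled_requeue_with
      end

      module ClassMethods
        def sidekiq_throttle(**kwargs)
          # users keep passing :requeue_with to sidekiq_throttle; we stash it
          # on the class attribute rather than exposing the attribute directly
          if kwargs.key?(:requeue_with)
            self.sidekiq_throttled_requeue_with = kwargs.delete(:requeue_with)
          end

          Registry.add(self, **kwargs)
        end
      end
    end
  end
end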

lavaturtle commented 1 year ago

Okay! I've made a number of changes. The basic usage should now look like this:

  sidekiq_throttle threshold: {limit: 123, period: 1.hour}, requeue: {to: :other_queue, with: :schedule}

with support for Procs:


class MyJob
  include Sidekiq::Job
  include Sidekiq::Throttled::Job

  sidekiq_throttle threshold: {limit: 123, period: 1.hour},
                   requeue: {to: ->(user_id, *) { user_id.odd? ? :odd_queue : :even_queue },
                             with: ->(_user_id, more, *) { more > 25 ? :schedule : :enqueue }}

  def perform(user_id, more, stuff)
    # ...
  end
end

and the default configuration looks like this:

Sidekiq::Throttled.configuration.default_requeue_options = {with: :schedule, to: :my_throttled_jobs_queue}

Both :with and :to arguments are optional. If :with is not specified, we'll use the :enqueue method. If :to is not specified, we'll use the queue the job was originally enqueued into.
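
In other words, option resolution roughly follows this order; an illustrative sketch only (requeue_options_for and job_class_requeue_options are assumed names, not the PR's literal code):

def requeue_options_for(job_class, job_args, original_queue)
  defaults = Sidekiq::Throttled.configuration.default_requeue_options || {}
  options  = defaults.merge(job_class_requeue_options(job_class)) # per-job values win

  with = options.fetch(:with, :enqueue)     # fall back to re-enqueueing
  to   = options.fetch(:to, original_queue) # fall back to the original queue

  # both values may be Procs that receive the job's arguments
  with = with.call(*job_args) if with.respond_to?(:call)
  to   = to.call(*job_args)   if to.respond_to?(:call)

  { with: with, to: to }
end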

danpolyuha commented 1 year ago

Hey @ixti, thanks for your amazing work on this gem. Sorry for asking here, I just don't want to create new threads. Do you have any estimate on the release of 1.0.0 stable?

woodhull commented 1 year ago

Excited to land this PR both for the app I help maintain with @lavaturtle and another product I'm working on. Anything else we can do to help move this and v1 across the finish line? Happy to put some more of our engineering time in on our side if there are discrete tasks you can point us at. We're keen to invest in the gem & code quality.

ixti commented 1 year ago

@lavaturtle @woodhull thanks for the amazing PR. I will try to go through it today or tomorrow. Sorry for being unresponsive (I was really overwhelmed with work lately, but I'm slowly getting back to dedicating time to sidekiq-throttled).

ixti commented 1 year ago

The only issue I'm scratching my head about right now is Sidekiq Pro support. I can't figure out how to make sure it works correctly, so I'm thinking of skipping its support for the 1.0.0 release completely and circling back to that task after the 1.0.0 release.

mnovelo commented 1 year ago

@ixti I'm happy to help with the Sidekiq Pro support, regardless of whether it's for 1.0.0 or later! Let me know if there's a branch you'd like me to test and/or review

woodhull commented 11 months ago

Hey.... Just looping back to this.

We're not using Sidekiq Pro so we aren't sure how to help move this forward on that front. That code is private for paid Sidekiq customers I think?

mnovelo commented 11 months ago

@woodhull yeah, the Sidekiq Pro code is private for paid customers.

@ixti I'm still happy to help with the Sidekiq Pro support, especially if it's blocking the 1.0.0 release. At the same time, I fully support releasing 1.0.0 without support for Sidekiq Pro's super_fetch. It's a configuration option that we're not using yet. We only use batching from Sidekiq Pro ourselves right now.

JamesWatling commented 10 months ago

Is this going to be released? I love this strategy. We have a severely throttled queue, and it is degrading our other queues and jobs, so we would love this strategy!

ixti commented 9 months ago

@JamesWatling yes. Working to merge this in the next couple of days.

ixti commented 9 months ago

This PR will help to mitigate #86 as well

ixti commented 9 months ago

I'm gonna refactor the way the scheduling strategy is stored, and will incorporate this PR as part of the v2.0.0 release, which I will work on in the coming weeks. A couple of heads-ups on the API I have in mind:

sidekiq_throttle_push_back(
  with: :enqueue, # Possible options: :enqueue, :requeue, and :schedule
  to: nil, # Can specify queue to push back to, nil, String, or proc
  in: 0 # only applicable for :schedule
)

Under the hood, :requeue will simply call UnitOfWork#requeue, while :enqueue and :schedule will call client_push. This way the implementation will be BasicFetch/SuperFetch-agnostic.
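
A rough, fetch-agnostic sketch of that dispatch (illustrative only; the proposed in: option is spelled delay below because in is a reserved word in Ruby):

def push_back(work, payload, with:, to: nil, delay: 0)
  queue = to || payload["queue"]

  case with
  when :requeue
    work.requeue # delegate to the fetcher's own UnitOfWork#requeue
  when :enqueue
    Sidekiq::Client.push(payload.merge("queue" => queue))
  when :schedule
    Sidekiq::Client.push(payload.merge("queue" => queue, "at" => Time.now.to_f + delay))
  end
end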

simonc commented 4 months ago

Hi there 👋

That's a pretty nice work going in there! 😊

I was wondering if it'd also make sense to have a :drop strategy when you want to simply prevent the same job from being pushed twice and not queue/schedule a throttled job.

Thanks! ❤️

woodhull commented 4 months ago

Just checking in on this. Has it been abandoned?

ixti commented 4 months ago

> Just checking in on this. Has it been abandoned?

Nothing is abandoned. I'm just lacking free time to work on this one :(( But it's on my radar to land first thing once I get to it.

mnovelo commented 4 months ago

@ixti, let us know if we can help to move this forward!

gstokkink commented 2 months ago

Looking forward to this PR as well; it would solve the enqueued_at not being reset issue that a lot of other people and I are running into.

gstokkink commented 2 months ago

Oh and @ixti I'm happy to help with Sidekiq Pro testing if you require it 😄