Open: lavaturtle opened this pull request 1 year ago
That is awesome! Thank you! I will take a look this weekend.
Re: `Style/RedundantCurrentDirectoryInPath`

I prefer consistency. It is pretty weird to me that rubocop's default for `rescue` is to be explicit, while for `require_relative` it's implicit. Let's just disable this cop. I will find time to open a rubocop PR to make that configurable. But for now, I would like it to be simply disabled.
On a side note about this PR, I think I made a bad decision back when I started this gem by providing a custom `sidekiq_throttle` singleton class method; we should probably use the `sidekiq_class_attribute` helper instead:

```ruby
def self.included(base)
  base.sidekiq_class_attribute :sidekiq_throttle_push_back # :enqueue | :schedule
end
```

In this case we won't need to bother with "keeping track" of job class inheritance.
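For illustration, here's a minimal sketch of that inheritance behaviour, assuming `sidekiq_class_attribute` works like Sidekiq's built-in helper (subclasses see the parent's value unless they set their own):

```ruby
class BaseJob
  include Sidekiq::Job
  include Sidekiq::Throttled::Job

  self.sidekiq_throttle_push_back = :schedule
end

class ChildJob < BaseJob
end

ChildJob.sidekiq_throttle_push_back # => :schedule, inherited with no extra bookkeeping
```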
> Re: `Style/RedundantCurrentDirectoryInPath`
>
> I prefer consistency. It is pretty weird to me that rubocop's default for `rescue` is to be explicit, while for `require_relative` it's implicit. Let's just disable this cop. I will find time to open a rubocop PR to make that configurable. But for now, I would like it to be simply disabled.
Makes sense! I've updated the PR to disable the cop instead of applying it.
> On a side note about this PR, I think I made a bad decision back when I started this gem by providing a custom `sidekiq_throttle` singleton class method; we should probably use the `sidekiq_class_attribute` helper instead:
Ah, interesting! I didn't know about `sidekiq_class_attribute`. So if we do that, then a job class can specify `sidekiq_throttle_push_back` in its `sidekiq_options`, and then we can read it later with `get_sidekiq_options`?

Would you prefer `push_back` as the setting name rather than `requeue_strategy`?
`sidekiq_class_attribute :sidekiq_throttle_push_back` will define a new singleton class method, so the usage will look like:

```ruby
class MyJob
  include Sidekiq::Job
  include Sidekiq::Throttled::Job

  sidekiq_throttled_push_back :enqueue
end
```
Naturally, we should check if the value is a Proc, and if so, call that proc (with the job arguments) to get the value, so that it will be possible to do:

```ruby
class MyJob
  include Sidekiq::Job
  include Sidekiq::Throttled::Job

  sidekiq_throttled_push_back ->(user_id, *) { user_id.odd? ? :enqueue : :schedule }

  def perform(user_id, more, stuff)
    # ...
  end
end
```
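For reference, the Proc check described above could be as small as the following helper (the name `resolve_push_back` is hypothetical, not part of the gem):

```ruby
# Call the configured value with the job arguments when it's callable,
# otherwise return it as-is.
def resolve_push_back(value, job_args)
  value.respond_to?(:call) ? value.call(*job_args) : value
end

resolve_push_back(:enqueue, [7, "more", "stuff"])                                        # => :enqueue
resolve_push_back(->(user_id, *) { user_id.odd? ? :enqueue : :schedule }, [7, "x", "y"]) # => :enqueue
```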
I have no strong feelings about `push_back` vs `requeue_strategy`. I just don't like the over-use of "strategy" in names (easy to confuse with `Throttled::Strategy`). Probably simply `sidekiq_throttled_requeue_with`?

Also, I just realized that it would be nice to also allow specifying which queue the job should be requeued to.
Hmm, I've been working on getting the `sidekiq_class_attribute` approach to work, and as far as I can tell from my testing, if we do this in `Sidekiq::Throttled::Job`:

```ruby
def self.included(worker)
  worker.send(:extend, ClassMethods)
  worker.sidekiq_class_attribute :sidekiq_throttled_requeue_with # :enqueue | :schedule
end
```

then the job class needs to use it like this:

```ruby
class MyThrottledJob
  include Sidekiq::Job
  include Sidekiq::Throttled::Job

  self.sidekiq_throttled_requeue_with = :schedule

  sidekiq_throttle foo: :bar
end
```
i.e. it ends up as a class-level variable that needs to be assigned, rather than being something we can use like `sidekiq_throttled_requeue_with :schedule`.

So maybe we invoke `sidekiq_class_attribute` for the inheritance benefits, but instead of users setting `sidekiq_throttled_requeue_with` in the class directly, we keep the `requeue_with` option for `sidekiq_throttle`? That also avoids any weirdness around which order the two are called in.
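A hypothetical sketch of that write-through (option handling and names here are assumptions for illustration, not the merged implementation):

```ruby
module Sidekiq
  module Throttled
    module Job
      module ClassMethods
        # Hypothetical: accept :requeue_with in sidekiq_throttle and store it on the
        # inherited sidekiq_class_attribute, so users never assign the attribute directly.
        def sidekiq_throttle(**kwargs)
          requeue_with = kwargs.delete(:requeue_with)
          self.sidekiq_throttled_requeue_with = requeue_with if requeue_with
          # ... register the throttling strategy with the remaining kwargs ...
        end
      end
    end
  end
end
```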
Okay! I've made a number of changes. The basic usage should now look like this:

```ruby
sidekiq_throttle threshold: { limit: 123, period: 1.hour }, requeue: { to: :other_queue, with: :schedule }
```
with support for Procs:

```ruby
class MyJob
  include Sidekiq::Job
  include Sidekiq::Throttled::Job

  sidekiq_throttle threshold: { limit: 123, period: 1.hour },
                   requeue: { to: ->(user_id, *) { user_id.odd? ? :odd_queue : :even_queue },
                              with: ->(_user_id, more, *) { more > 25 ? :schedule : :enqueue } }

  def perform(user_id, more, stuff)
    # ...
  end
end
```
and the default configuration looks like this:

```ruby
Sidekiq::Throttled.configuration.default_requeue_options = { with: :schedule, to: :my_throttled_jobs_queue }
```
Both the `:with` and `:to` arguments are optional. If `:with` is not specified, we'll use the `:enqueue` method. If `:to` is not specified, we'll use the queue the job was originally enqueued into.
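For example (the threshold values and queue name below are just placeholders):

```ruby
# :to omitted => the throttled job goes back to the queue it was originally enqueued into
sidekiq_throttle threshold: { limit: 10, period: 1.minute }, requeue: { with: :schedule }

# :with omitted => the throttled job is re-enqueued (the default), onto :low_priority
sidekiq_throttle threshold: { limit: 10, period: 1.minute }, requeue: { to: :low_priority }
```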
Hey @ixti, thanks for your amazing work on this gem. Sorry for asking here; I just don't want to create new threads. Do you have any estimate on the release of 1.0.0 stable?
Excited to land this PR both for the app I help maintain with @lavaturtle and another product I'm working on. Anything else we can do to help move this and v1 across the finish line? Happy to put some more of our engineering time in on our side if there are discrete tasks you can point us at. We're keen to invest in the gem & code quality.
@lavaturtle @woodhull thanks for the amazing PR. I will try to go through it today or tomorrow. Sorry for being unresponsive (I was really overwhelmed with work lately, but I'm slowly getting back to dedicating time to sidekiq-throttled).
The only issue I'm scratching my head about right now is Sidekiq Pro support. I can't figure out how to make sure it works correctly, so I'm thinking of skipping its support for the 1.0.0 release completely and circling back to that task afterwards.
@ixti I'm happy to help with the Sidekiq Pro support, regardless of whether it's for 1.0.0 or later! Let me know if there's a branch you'd like me to test and/or review.
Hey.... Just looping back to this.
We're not using Sidekiq Pro, so we aren't sure how to help move this forward on that front. That code is private to paid Sidekiq customers, I think?
@woodhull yeah, the Sidekiq Pro code is private for paid customers.
@ixti I'm still happy to help with the Sidekiq Pro support, especially if it's blocking the 1.0.0 release. At the same time, I fully support releasing 1.0.0 without support for Sidekiq Pro's super_fetch. It's a configuration option that we're not using yet. We only use batching from Sidekiq Pro ourselves right now.
Is this going to be released? I love this strategy! We have a severely throttled queue, and it is degrading our other queues and jobs, so we would love this strategy!
@JamesWatling yes. Working to merge this in the next couple of days.
This PR will help to mitigate #86 as well
I'm gonna refactor the way the scheduling strategy is stored, and will incorporate this PR as part of the v2.0.0 release, which I will work on in the coming weeks. A couple of heads-ups on the API I have in mind:

```ruby
sidekiq_throttle_push_back(
  with: :enqueue, # possible options: :enqueue, :requeue, and :schedule
  to:   nil,      # queue to push back to; nil, String, or Proc
  in:   0         # only applicable for :schedule
)
```
Under the hood, `:requeue` will simply call `UnitOfWork#requeue`, and `:enqueue` as well as `:schedule` will call `client_push`. This way the implementation will be BasicFetch/SuperFetch-agnostic.
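A rough sketch of that dispatch, under the assumptions that the unit of work exposes the raw job JSON via `#job` and supports `#requeue` (the helper below is illustrative, not the gem's actual code):

```ruby
require "json"

# Illustrative only: push a throttled unit of work back according to the chosen option.
# Proc resolution for :to is omitted for brevity.
def push_back(work, options)
  payload = JSON.parse(work.job) # raw job hash carried by the unit of work

  case options[:with]
  when :requeue
    work.requeue # delegates to the fetch strategy's own UnitOfWork#requeue
  when :enqueue
    payload["queue"] = options[:to] if options[:to]
    Sidekiq::Client.push(payload)
  when :schedule
    payload["queue"] = options[:to] if options[:to]
    payload["at"]    = Time.now.to_f + options.fetch(:in, 0)
    Sidekiq::Client.push(payload)
  end
end
```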
Hi there 👋
That's some pretty nice work going on in there! 😊

I was wondering if it'd also make sense to have a `:drop` strategy, for when you want to simply prevent the same job from being pushed twice and not queue/schedule the throttled job at all.
Thanks! ❤️
Just checking in on this. Has it been abandoned?
> Just checking in on this. Has it been abandoned?

Nothing is abandoned. Just lacking free time to work on this one :(( But it's on my radar to land first thing once I get to it.
@ixti, let us know if we can help to move this forward!
Looking forward to this PR as well; it would solve the `enqueued_at` not being reset issue that a lot of other people and I are running into.
Oh and @ixti I'm happy to help with Sidekiq Pro testing if you require it 😄
Hi @ixti, I work with @lavaturtle (who initially submitted this PR).

We're considering working on this PR to update the code to resolve the conflicts and work with the latest version on `main`. But given the effort of updating the code due to major changes on `main` since we first submitted the PR, we wanted to confirm that this feature is still relevant for you and that you would be interested in merging it. Otherwise we may look for other options to avoid maintaining a fork of this library with the feature we need.
@anero I will be more than happy to merge it. Due to lack of time I failed to do any refactoring, so I will be happy to merge it in whatever shape you come up with - and (probably) do follow-up refactoring after it's merged LOL
@lavaturtle @anero, happy to help test with Sidekiq Pro too
Attention: Patch coverage is 96.05263% with 3 lines in your changes missing coverage. Please review.

Project coverage is 98.47%. Comparing base (bca3912) to head (11c6d98).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| lib/sidekiq/throttled/strategy.rb | 91.89% | 3 Missing :warning: |
The Sidekiq Pro specs pass too 🎉
Great, thanks for testing @mnovelo!
I was concerned that there may be a problem with how we requeue the job here (https://github.com/ixti/sidekiq-throttled/pull/150/files#diff-c15f45918c6d7d825fbfa0856557360b907b492e44879ba9eb1e7f3bad9c42baR89), which is different from how it was previously done (https://github.com/ixti/sidekiq-throttled/pull/150/files#diff-ae016bb4af61725f74c58dac11f63bd38a08d6c5c5519e467902911a51077b0fL27), but I don't know enough about Sidekiq's API to say if it's an actual problem.
Oh, that is a problem actually, @anero, I missed that change. Sidekiq Pro's super_fetch does do important additional logic when requeuing. I can work on a PR to address this, but I would like it if we kept `Throttled.requeue_throttled(work)` or similar so that super_fetch can run its additional logic.
> Oh, that is a problem actually, @anero, I missed that change. Sidekiq Pro's super_fetch does do important additional logic when requeuing. I can work on a PR to address this, but I would like it if we kept `Throttled.requeue_throttled(work)` or similar so that super_fetch can run its additional logic.
@mnovelo I wasn't sure if there's a way to reference the correct fetcher (`BasicFetch` / `SuperFetch`) from `Sidekiq::Throttled::Strategy#requeue_throttled`.

I guess we could perform a similar verification to what's done elsewhere to decide whether Sidekiq Pro is available or not, and requeue using the correct mechanism. Do you have any thoughts on this, @ixti?
@anero oh, the class of the `work` variable will be `Sidekiq::Pro::SuperFetch::UnitOfWork`, so that's one way to distinguish whether the super_fetch strategy should be used. That was already being assumed in the previous implementation, because it just delegated the requeuing logic to `Sidekiq::Pro::SuperFetch::UnitOfWork#requeue`, so something like

```ruby
case work
when Sidekiq::Pro::SuperFetch::UnitOfWork
  # super_fetch stuff
else
  # basic_fetch stuff
end
```

could work. Though I haven't dug deep into this yet.
I think the cleanest solution would be to patch `UnitOfWork` in a similar way to how we patch `BasicFetch` and `SuperFetch`. What @mnovelo proposed will work even better, though. :D
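A rough sketch of that patching idea (module and method names are illustrative assumptions, not the merged code):

```ruby
module Sidekiq
  module Throttled
    module Patches
      # Illustrative only: give SuperFetch's unit of work a requeue hook for throttled
      # jobs, mirroring how BasicFetch/SuperFetch themselves are patched.
      module SuperFetchUnitOfWork
        def requeue_throttled
          requeue # SuperFetch's UnitOfWork#requeue already runs its extra bookkeeping
        end
      end
    end
  end
end

if defined?(Sidekiq::Pro::SuperFetch::UnitOfWork)
  Sidekiq::Pro::SuperFetch::UnitOfWork.prepend(Sidekiq::Throttled::Patches::SuperFetchUnitOfWork)
end
```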
OK, I'll work on a PR into this one then, @anero @ixti!
@anero @ixti here's my proposal: https://github.com/controlshift/sidekiq-throttled/pull/2

If this looks good, I can write specs for it too.
Thanks for the PR @mnovelo, sorry I merged before seeing your comment on the specs. Want to add those separately? (I'd do it myself but I don't have Sidekiq Pro to be able to run them)
@mnovelo I fixed the failing specs. Would you mind adding test coverage for the new code in `Sidekiq::Throttled::Strategy` to raise code coverage and get the build passing?
@anero Specs are ready for your review! https://github.com/controlshift/sidekiq-throttled/pull/3
This is looking good! However, I want to highlight something important about the new scheduling strategy for future reference. It is not fully compatible with `Sidekiq::Batch`, because when a job gets rescheduled using this strategy, it is assigned a new job ID. If a job within a batch gets throttled with the new scheduling strategy, the original job will be marked as complete, which could lead the batch to assume that the job has been completed as well. I don't think the new job retains the batch ID, and I didn't make any effort to ensure this or write any specs to test it. In my opinion, there isn't a strong use case for using the new scheduling strategy and `Sidekiq::Batch` together. However, I'm happy to help revisit that in the future should the need arise.
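To make the concern concrete: `jid` and `bid` are Sidekiq's standard payload keys, and the assumption being flagged is that the rescheduling path re-pushes a fresh payload rather than carrying them over:

```ruby
original = { "class" => "MyBatchedJob", "args" => [42], "jid" => "abc123", "bid" => "b-1" }

# Re-pushing without "jid"/"bid" yields a brand-new job that the batch no longer tracks,
# so the batch may consider the original job finished.
rescheduled = original.reject { |key, _| %w[jid bid].include?(key) }
Sidekiq::Client.push(rescheduled.merge("at" => Time.now.to_f + 60))
```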
This adds support for setting the `requeue_strategy` for a job, to specify what we should do with throttled jobs. The default is `:enqueue`, which is the current behavior: re-add it to the end of the queue. The other option is `:schedule`, which schedules the job for a time in the future when we think we'll have capacity to process it.

It's also possible to set the `default_requeue_strategy` in the configuration, to set this behavior for all jobs that do not individually specify a `requeue_strategy`.

This may be relevant to Issue #36.
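A quick sketch of how this might be used (the option placement on `sidekiq_throttle` and the configuration setter are assumptions based on this description, not necessarily the final API):

```ruby
class ReportJob
  include Sidekiq::Job
  include Sidekiq::Throttled::Job

  # Hypothetical per-job setting: schedule throttled jobs instead of re-enqueueing them.
  sidekiq_throttle threshold: { limit: 100, period: 1.hour }, requeue_strategy: :schedule
end

# Hypothetical global default for jobs that don't specify a requeue_strategy:
Sidekiq::Throttled.configuration.default_requeue_strategy = :schedule
```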
Unrelatedly, there's a commit in here that changes the format of some `require_relative` lines, to comply with the new Rubocop rule `Style/RedundantCurrentDirectoryInPath`. I don't feel strongly about this commit; I only added it so that rubocop would pass.