riverqueue / river

Fast and reliable background jobs in Go
https://riverqueue.com
Mozilla Public License 2.0
3.34k stars 89 forks

[FEATURE REQUEST] Jobs aggregation #550

Open krhubert opened 3 weeks ago

krhubert commented 3 weeks ago

I wonder if this feature is on your roadmap.

TL;DR: Job aggregation lets you enqueue multiple jobs in succession and have them passed to the Worker together rather than individually, so several successive operations are batched into one.

You can see the use case and the original PR here: https://github.com/hibiken/asynq/issues/339

Final docs: https://github.com/hibiken/asynq/wiki/Task-aggregation

nexovec commented 3 weeks ago

Might be a duplicate of #453

brandur commented 3 weeks ago

This one's WIP and coming very soon.

krhubert commented 3 weeks ago

Let me close this one since it's a duplicate.

bgentry commented 3 weeks ago

@krhubert actually I wonder if you might be able to provide some more context on how you’re looking to use job aggregation / grouping / batching? I’ve read your linked issue on Asynq, but I'm not sure whether it reflects your current use case. Mostly curious what timeframe you’re looking to batch over, whether and how you’re wanting to partition your batches, etc.

krhubert commented 3 weeks ago

@bgentry I'm unsure if I can share details in a public discussion, but I can give a more specific overview in a private conversation. I can hop on a call to talk, or join Slack/Discord if you have one. Let me know if that is something you are interested in.

> Mostly curious what timeframe you’re looking to batch over,

My use case is very different from what I shared in the issue, although the aggregation feature described in the issue solves the problem.

The timeframe I use right now:

  1. Grace Period - 5 min
  2. Max Delay - 15 min

GroupGracePeriod: The grace period is renewed whenever a task with the same group key is added to the group

GroupMaxDelay: The grace period has a configurable upper bound, user can optionally set maximum delay, after which Asynq will deliver the tasks to Handler regardless of the remaining grace period
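To make the interaction between the two settings concrete, here is a minimal sketch of the timing rule as described in the quoted definitions. It is a stand-alone model, not Asynq or River code: `firstFlush` and `minutes` are hypothetical helpers, and treating a task that arrives exactly at the deadline as part of the next batch is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// minutes is a small hypothetical helper for readable offsets.
func minutes(n int) time.Duration { return time.Duration(n) * time.Minute }

// firstFlush models the grace-period / max-delay rule for a single group.
// arrivals are enqueue offsets measured from the first task, in ascending
// order. It returns the offset at which the first batch would be delivered
// and how many tasks that batch contains.
func firstFlush(grace, maxDelay time.Duration, arrivals ...time.Duration) (time.Duration, int) {
	if len(arrivals) == 0 {
		return 0, 0
	}
	hardLimit := arrivals[0] + maxDelay // upper bound set by the max delay
	deadline := arrivals[0] + grace
	if deadline > hardLimit {
		deadline = hardLimit
	}
	count := 1
	for _, at := range arrivals[1:] {
		if at >= deadline {
			break // assumption: this task starts the next batch
		}
		count++
		deadline = at + grace // each new task renews the grace period
		if deadline > hardLimit {
			deadline = hardLimit
		}
	}
	return deadline, count
}

func main() {
	grace, maxDelay := minutes(5), minutes(15)

	// A short burst: the batch is delivered 5 minutes after the last task.
	at, n := firstFlush(grace, maxDelay, minutes(0), minutes(2), minutes(4))
	fmt.Println(at, n) // 9m0s 3

	// A steady stream: the max delay cuts the batch off at 15 minutes.
	at, n = firstFlush(grace, maxDelay, minutes(0), minutes(4), minutes(8), minutes(12), minutes(16))
	fmt.Println(at, n) // 15m0s 4
}
```

With the 5 minute grace period and 15 minute max delay mentioned above, a short burst is delivered shortly after it ends, while a stream that never pauses for 5 minutes is still delivered once 15 minutes have passed since its first task.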

> whether and how you’re wanting to partition your batches, etc.

I don't partition them (assuming we mean the same thing by partitioning), because for me partitioning can happen before a job is added and again after aggregation.

I know this context might not give you full visibility.

bgentry commented 3 weeks ago

> whenever a task with the same group key is added to the group

I think this maybe hints at what I was asking about partitioning. Essentially I'm trying to understand what criteria you want to use to group your jobs, whether it's merely about grouping any jobs of the same kind, or if you want to group on some other attrs.
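The distinction between grouping by kind and grouping on other attributes can be sketched as a choice of key function. This is only an illustration of the question being asked: `Job`, `Account`, `partition`, and both key functions are hypothetical names, not part of River's API.

```go
package main

import "fmt"

// Job is a hypothetical stand-in for an enqueued job; it is not River's type.
type Job struct {
	Kind    string
	Account string // example attribute taken from the job's arguments
}

// partition buckets jobs by whatever key the caller derives from each job.
func partition(jobs []Job, key func(Job) string) map[string][]Job {
	groups := make(map[string][]Job)
	for _, j := range jobs {
		k := key(j)
		groups[k] = append(groups[k], j)
	}
	return groups
}

// byKind groups every job of the same kind together.
func byKind(j Job) string { return j.Kind }

// byKindAndAccount additionally splits each kind by an attribute.
func byKindAndAccount(j Job) string { return j.Kind + "/" + j.Account }

func main() {
	jobs := []Job{
		{Kind: "email", Account: "a"},
		{Kind: "email", Account: "b"},
		{Kind: "email", Account: "a"},
		{Kind: "export", Account: "a"},
	}
	fmt.Println(len(partition(jobs, byKind)))           // 2
	fmt.Println(len(partition(jobs, byKindAndAccount))) // 3
}
```

Grouping by kind alone yields one batch per job type, while adding an attribute to the key yields one batch per type and attribute value.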

I also sent you an email to try to connect another way about details you don't wish to post on GitHub. Cheers :v: