nilenso / goose

A powerful background job processing library for Clojure
https://github.com/nilenso/goose/wiki
MIT License

Throttle Jobs (aka rate-limiting) #103

Open olttwa opened 1 year ago

olttwa commented 1 year ago

Difference between Throttle and Rate-Limit

Both Throttling and Rate-Limiting limit the number of processes running at a given time. However, Rate-Limiting rejects processes that exceed the limit, whereas Throttling queues or pauses them until the current ones have completed.

Rate limiting protects a system by applying a hard limit on its access. Throttling shapes a system by smoothing spikes in traffic.

A background processor shouldn't reject queued tasks that exceed a limit. Rejection is best handled at the load-balancer layer. For these reasons, in Goose, we'll use the term Throttling, and not Rate-Limiting.

Why the need to Throttle Jobs?

Third-party APIs often enforce a rate limit, so the number of Jobs executing at a given time shouldn't exceed that limit.

Patterns of Throttling

As elaborated here, Throttling can be done in 5 ways:

  1. Concurrent: Only N jobs can execute at a given time.
  2. Token Bucket: Like Concurrent, but the resource pool holds at most N tokens and is replenished at a fixed rate, which might be higher or lower than the Job completion rate.
  3. Leaky Bucket: Like Token Bucket, but allows bursts of jobs within a small time-interval. The resource pool can stay fixed like Concurrent or be replenished at a fixed rate like Token Bucket.
  4. Fixed Window: In a given time-frame, only N jobs can execute.
  5. Sliding Window: Like Fixed Window, but the window rolls forward with time, so only N jobs can execute within any trailing time-frame.
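
To make two of these patterns concrete, here is a minimal in-memory sketch of a Token Bucket and a Sliding Window in Python. It is illustrative only and is not Goose code: the class names, method names and parameters (`TokenBucket`, `SlidingWindow`, `try_acquire`, `refill_per_sec`, `window_sec`) are all hypothetical, state lives in a single process rather than in a shared store, and the current time is passed in explicitly to keep the behaviour easy to follow.

```python
from collections import deque


class TokenBucket:
    """At most `capacity` tokens; refilled at `refill_per_sec` tokens per second."""

    def __init__(self, capacity, refill_per_sec, now=0.0):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)  # assumption: the bucket starts full
        self.last_refill = now

    def try_acquire(self, now):
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = now - self.last_refill
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # a throttle would pause or re-queue the job, not drop it


class SlidingWindow:
    """At most `limit` job starts within any trailing `window_sec` seconds."""

    def __init__(self, limit, window_sec):
        self.limit = limit
        self.window_sec = window_sec
        self.starts = deque()  # timestamps of recent job starts, oldest first

    def try_acquire(self, now):
        # Forget starts that have rolled out of the window.
        while self.starts and now - self.starts[0] >= self.window_sec:
            self.starts.popleft()
        if len(self.starts) < self.limit:
            self.starts.append(now)
            return True
        return False
```

In both cases a `False` result means "not yet": in keeping with the Throttle semantics described earlier, the caller would wait or re-queue the job instead of rejecting it.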

Nuances of Throttling for a background processor

Make note of these things when implementing this feature:

  1. Since executing Jobs will acquire a lock, have a lock_timeout to ensure that crashed processes do not hold a lock forever.
  2. Have a wait_timeout to ensure that workers aren't waiting forever to acquire a lock. Upon timing out, the user can configure whether to publish a metric, raise an alert, or discard the Job altogether.
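
The interplay of the two timeouts can be sketched as follows. This is a single-process, non-thread-safe Python illustration under stated assumptions, not a proposed Goose design: `ConcurrencyThrottle`, `acquire`, `release` and `poll_interval` are hypothetical names, a dictionary stands in for the shared store, and the clock and sleep functions are injectable only so the behaviour can be exercised without real waiting.

```python
import time


class ConcurrencyThrottle:
    """Allow at most `limit` jobs to hold a lock at once (in-memory stand-in for a shared store)."""

    def __init__(self, limit, lock_timeout, clock=time.monotonic, sleep=time.sleep):
        self.limit = limit
        self.lock_timeout = lock_timeout
        self.clock = clock
        self.sleep = sleep
        self.holders = {}  # job_id -> time at which its lock expires

    def _reap_expired(self, now):
        # Locks left behind by crashed workers are dropped once lock_timeout passes.
        for job_id in [j for j, expiry in self.holders.items() if expiry <= now]:
            del self.holders[job_id]

    def acquire(self, job_id, wait_timeout, poll_interval=0.1):
        """Return True once a lock is held, or False if wait_timeout elapses first."""
        deadline = self.clock() + wait_timeout
        while True:
            now = self.clock()
            self._reap_expired(now)
            if len(self.holders) < self.limit:
                self.holders[job_id] = now + self.lock_timeout
                return True
            if now >= deadline:
                return False  # caller decides: publish a metric, alert, or discard the job
            self.sleep(poll_interval)

    def release(self, job_id):
        self.holders.pop(job_id, None)  # no-op if the lock has already expired
```

A worker would call `acquire` before executing a Job and `release` afterwards. If the worker crashes in between, the lock disappears on its own after `lock_timeout`, and a waiting worker gives up after `wait_timeout` instead of blocking indefinitely.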

Implementation Details

This is a complex feature to build. Some ideas after initial investigation:

  1. A persistent store will be required to keep the count of executing jobs. Hence, this feature can exist for message brokers like Redis and Postgres, but not for RabbitMQ.
  2. If the message broker has built-in support for expiry, that'll be helpful. Otherwise, a separate thread will have to garbage-collect stale locks.

olttwa commented 1 year ago

Until Goose supports Throttling, there are 2 hacks that can approximate it:

  1. If you want to enqueue Jobs asynchronously, Throttling can be achieved by combining the :threads worker config with the number of worker instances.
    1. For example, by setting the :threads count to 5 and running 4 worker instances, you get a Throttle of at most 20 Jobs executing concurrently.
  2. While enqueuing, you can schedule Jobs with a fixed or staggered delay.
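
The arithmetic behind both hacks is small enough to sketch. The Python helpers below are hypothetical and only compute numbers; they do not call Goose. `staggered_delays` assumes you want at most `jobs_per_window` Jobs to start in each window of `window_sec` seconds.

```python
def max_concurrent_jobs(threads_per_worker, worker_instances):
    # Each worker instance runs at most `threads_per_worker` jobs at once.
    return threads_per_worker * worker_instances


def staggered_delays(job_count, jobs_per_window, window_sec):
    """Delay in seconds for each job so that at most jobs_per_window start per window."""
    return [(i // jobs_per_window) * window_sec for i in range(job_count)]
```

For instance, `max_concurrent_jobs(5, 4)` gives the 20 from the example above, and `staggered_delays(5, 2, 60)` spreads 5 Jobs over three one-minute windows. Note that staggering only bounds when Jobs start: a Job that runs longer than its window can still overlap with Jobs from the next one.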

olttwa commented 1 year ago

cc @rickerbh