apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Impose rate limits for task starts as pool feature #15082

Open BeatlesMD opened 3 years ago

BeatlesMD commented 3 years ago

Description

As a pool feature, queue task starts so that task initiation is distributed over time according to a sliding-window rate limit (this may be more easily implemented as a task-initiation cooldown within a pool).

Use case / motivation

APIs commonly impose techniques to limit the rate of requests (sliding window, fixed window, token bucket, leaky bucket, etc.). While task retries may resolve the issue, these failures could be avoided entirely if there were a feature to match the endpoint's programmatic behavior (I suggest sliding window, as it has other benefits).

A sliding-window rate limiter could also be used to stagger task/request initiation to a legacy system. There are a number of reasons why you may want to stagger requests to a legacy system, such as if the beginning portion of a request is the most resource-intensive part within the foreign system, or if the legacy system does not provide its own rate-limiting signals.

It may be more easily implemented as a cooldown between a pooled task's initiation and the next queued task's start, but I figured I'd frame the feature request to match the rate-limiting strategies and techniques commonly seen.
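To make the requested behavior concrete, here is a minimal single-process sketch of the sliding-window strategy described above. The class name and interface are my own for illustration; this is not an existing Airflow API, and a real pool-level implementation would need to live in the scheduler.

```python
import threading
import time
from collections import deque


class SlidingWindowRateLimiter:
    """Allow at most `max_calls` task starts per `window_seconds`,
    measured over a sliding window of recent start timestamps."""

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window = window_seconds
        self._starts = deque()  # monotonic timestamps of recent starts
        self._lock = threading.Lock()

    def try_acquire(self) -> bool:
        """Record a start and return True if the window has capacity."""
        with self._lock:
            now = time.monotonic()
            # Drop timestamps that have slid out of the window.
            while self._starts and now - self._starts[0] >= self.window:
                self._starts.popleft()
            if len(self._starts) < self.max_calls:
                self._starts.append(now)
                return True
            return False
```

A pool-level cooldown is the degenerate case of this: `max_calls=1` with `window_seconds` equal to the cooldown interval.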

Related Issues

Potentially https://github.com/apache/airflow/issues/8789 ?

boring-cyborg[bot] commented 3 years ago

Thanks for opening your first issue here! Be sure to follow the issue template!

rubenbriones commented 1 year ago

That would be really nice. I have multiple ETLs that scrape data from a free API with a maximum request rate per minute, and I haven't figured out a way to implement this rate-limit logic in Airflow.

I'm using Dynamic Task Mapping to generate 1000 tasks (to scrape data for 1000 different items from the same API), but I want to schedule them in a way that the rate limit is not exceeded.

Do you think it would be possible to implement this with a custom deferred HttpOperator that, before sending the HTTP request, checks (via XCom) how many requests have been sent over the last X minutes to the requested http_conn_id? That new custom operator would then update the XCom cache after making new requests. Does that make sense?

potiuk commented 1 year ago

The problem (and difficulty) with implementing this one is that you need some central service to coordinate the rate limits across multiple parallel running tasks from, potentially, multiple nodes running such client code.

Implementing a sliding time window with a request rate per minute (or another time period) is not something that can easily be done in a "generic" way.

There are some projects that implement generic "services" providing such capabilities (global distributed client-side rate limiting) - for example https://github.com/youtube/doorman - and you can see there the complexities involved in implementing such a solution.
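For reference, the token-bucket algorithm mentioned earlier in this thread is the building block that systems like Doorman lease out to distributed clients. A minimal single-process sketch (my own illustrative code, not Doorman's API) looks like this; the hard part Doorman solves is sharing this bucket's capacity fairly across many processes:

```python
import threading
import time


class TokenBucket:
    """Refill `rate` tokens per second up to `capacity`; each request
    consumes one token. A central service effectively leases shares
    of such a bucket to many clients."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate
        self.capacity = capacity
        self._tokens = capacity
        self._last = time.monotonic()
        self._lock = threading.Lock()

    def try_acquire(self, tokens: float = 1.0) -> bool:
        with self._lock:
            now = time.monotonic()
            # Refill based on elapsed time, capped at capacity.
            self._tokens = min(
                self.capacity,
                self._tokens + (now - self._last) * self.rate,
            )
            self._last = now
            if self._tokens >= tokens:
                self._tokens -= tokens
                return True
            return False
```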

potiuk commented 1 year ago

But yeah, you can likely implement the "poor-man's version" you described, though expect some problems that you will have to deal with - for example, potential starvation of some tasks. These aren't easy things to implement.
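To illustrate the starvation caveat: a "poor-man's" limiter typically just polls and sleeps until a slot opens, with no queue among waiters, so under contention an unlucky caller can wait indefinitely. A hypothetical sketch (my own names, not an Airflow or Doorman API):

```python
import random
import time
from collections import deque


class BlockingRateLimiter:
    """Poor-man's limiter: block until a slot opens in the window.
    Waiters are not queued (no FIFO fairness), so under heavy
    contention a caller can repeatedly lose the race and starve."""

    def __init__(self, max_calls: int, window_seconds: float):
        self.max_calls = max_calls
        self.window = window_seconds
        self._starts = deque()  # monotonic timestamps of recent starts

    def acquire(self) -> None:
        while True:
            now = time.monotonic()
            # Evict starts that have left the window.
            while self._starts and now - self._starts[0] >= self.window:
                self._starts.popleft()
            if len(self._starts) < self.max_calls:
                self._starts.append(now)
                return
            # Poll with jitter; whichever waiter wakes first wins.
            time.sleep(0.05 + random.random() * 0.05)
```

In an Airflow context the shared state (here an in-memory deque) would have to live somewhere all workers can see, e.g. the metadata database or XCom as suggested above, which adds its own race conditions.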