Open MorningLight5 opened 2 years ago
I'd prefer to add a LIMITER syntax to represent the frequency limiter. The limiter is an abstract concept; it can have many types, such as query or stream load.
So the SQL to manipulate a LIMITER would look like this:
CREATE LIMITER name PROPERTIES("key1"="value1", "key2"="value2");
DROP LIMITER name;
SHOW LIMITER;
What if the request exceeds the limit? Return an error or slow down? And is there any other system we can refer to?
As far as I know, MySQL has the max_connections variable to limit the number of connections. When the connections exceed the limit, it returns an error.
I think we can put the LIMITER-related config in FE and the metric data in BE. The procedure is like below:
I see.
Doris already has a max_connection limit, which can be set for each user.
But I think what you need is not just to limit the number of connections, but to limit the rate of requests.
As far as I know, Guava's rate limiter may meet the requirement. But what is more important is how to define the rate.
Simply put, it may be a limit on QPS. But the essence is "control the consumption of cluster resources per unit time."
So I think in the first version we can implement this function through simple rules (such as QPS). But the design itself must reflect the abstraction of "system resources" so that we can add more rules later.
Looking forward to your PR!
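To make the idea of "consumption of cluster resources per unit time" concrete, here is a minimal token-bucket sketch. It is not Guava's RateLimiter and not Doris code; the class name, the cost parameter, and the injectable clock are all hypothetical, chosen only for illustration. A plain QPS rule is the case where every request costs 1, and a later rule could charge a higher cost for heavier requests.

```python
import time


class TokenBucket:
    """Hypothetical limiter: refills `rate` tokens per second, holds at most `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = float(rate)
        self.capacity = float(capacity)
        self.clock = clock              # injectable so the bucket can be tested
        self.tokens = float(capacity)   # start with a full bucket
        self.last = clock()

    def try_acquire(self, cost=1.0):
        """Deduct `cost` tokens and return True, or return False if too few remain."""
        now = self.clock()
        # Refill for the elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False
```

This sketch takes the "return error" side of the question: a caller that gets False would reject the request. A "slow down" policy would instead wait until enough tokens have been refilled.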
Where do you think the limiter should be put, BE or FE? As Guava is for Java, do you think the limiter is better placed in FE?
What if the request exceeds the limit? Return an error or slow down? And is there any other system we can refer to?
Impala's AdmissionController does a similar thing. An introduction is here: https://shimo.im/docs/6qxjctpyDHJgPwtw
The operation frequency limit was developed in #7474. Users can configure the threshold through frontend config, like below:
ADMIN SET FRONTEND CONFIG ('key' = 'value')
You can limit the number of queries (max_running_query_num) in a certain period (report_stats_period); the default period is 10 seconds (10 * 1000 milliseconds). You can also limit the number of loads through max_running_txn_num.
The design of this feature is simple: each FE keeps its query count locally and reports it to the Master every period, so every FE can get the query count of each FE through metadata synchronization. When a query arrives, if the total query count in the last period exceeds the threshold, the system rejects the query. The user can only query again in the next period.
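The flow described above can be sketched roughly as follows. This is a simplified model, not the code from #7474: the Master and Frontend classes and their method names are hypothetical, each FE reads the totals directly from the master object instead of through metadata synchronization, and it is an assumption here that rejected queries are not counted.

```python
class Master:
    """Hypothetical master: stores the query count each FE reported for the last period."""

    def __init__(self):
        self.last_period_counts = {}

    def report(self, fe_name, count):
        self.last_period_counts[fe_name] = count

    def total_last_period(self):
        return sum(self.last_period_counts.values())


class Frontend:
    """Hypothetical FE: counts queries locally and reports once per period."""

    def __init__(self, name, master, max_running_query_num):
        self.name = name
        self.master = master
        self.threshold = max_running_query_num
        self.local_count = 0

    def on_query(self):
        """Reject the query if last period's cluster-wide total exceeded the threshold."""
        if self.master.total_last_period() > self.threshold:
            return False
        self.local_count += 1  # assumption: only admitted queries are counted
        return True

    def end_period(self):
        """Report the local count to the master and start a new period."""
        self.master.report(self.name, self.local_count)
        self.local_count = 0
```

Note that the check uses the previous period's total, so a burst within one period is admitted in full and only affects the following period.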
Description
In production environments, a Doris cluster often faces pressure from many sources (mainly stream load and queries), which causes resource shortage problems such as OOM, especially in shared clusters. As the above picture shows, memory usage fluctuates widely. I think it would be better to have a way to limit the resource usage of each user. Limiting the usage frequency may be a suitable way.