apache / incubator-streampark

Make stream processing easier! Easy-to-use streaming application development framework and operation platform.
https://streampark.apache.org/
Apache License 2.0
3.84k stars, 990 forks

[Feature] Solve alarm current limiting #2142

Open xujiangfeng001 opened 1 year ago

xujiangfeng001 commented 1 year ago

Description

In production, StreamPark can run into rate limiting when it sends alarms through messaging tools such as Feishu (Lark) and DingTalk, so some alarm messages are never delivered. In my opinion, alarms should not be lost, because they are an important indicator of task status. I want to solve this by adding a blocking queue: when a task fails, the alarm is not sent immediately but is added to the blocking queue, and a separate alarm thread then sends the queued alarms. Of course, this change may affect the timeliness of alarms. In my tests, this change solved the alarm rate-limiting problem.
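To make the proposal concrete, here is a minimal sketch of the queue-plus-sender-thread idea. It is not taken from the StreamPark code base: the class name `QueuedAlarmSender`, the `submit` method, the `minIntervalMillis` pacing parameter, and the `Consumer<String>` standing in for a Feishu or DingTalk client are all assumptions made for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: none of these names come from StreamPark itself.
class QueuedAlarmSender implements AutoCloseable {
    private final BlockingQueue<String> queue;
    private final Consumer<String> channel;  // stands in for a Feishu/DingTalk client
    private final long minIntervalMillis;    // assumed pause between two sends
    private final Thread worker;
    private volatile boolean running = true;

    QueuedAlarmSender(int capacity, long minIntervalMillis, Consumer<String> channel) {
        this.queue = new LinkedBlockingQueue<>(capacity);
        this.channel = channel;
        this.minIntervalMillis = minIntervalMillis;
        this.worker = new Thread(this::drain, "alarm-sender");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    // Called when a task fails: enqueue and return immediately.
    // Returns false if the queue is full, so the caller can log the drop.
    boolean submit(String alarm) {
        return queue.offer(alarm);
    }

    // Runs on the dedicated sender thread and paces outgoing alarms.
    private void drain() {
        while (running || !queue.isEmpty()) {
            try {
                String alarm = queue.poll(100, TimeUnit.MILLISECONDS);
                if (alarm == null) {
                    continue;
                }
                try {
                    channel.accept(alarm);
                } catch (RuntimeException e) {
                    // A failed send must not kill the sender thread.
                    System.err.println("alarm send failed: " + e.getMessage());
                }
                Thread.sleep(minIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    // Stops accepting the "running" state and waits until queued alarms are sent.
    @Override
    public void close() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With this shape, the code that detects a failed task only calls `submit`, and the pause between sends is what keeps the sender under the channel's limit. The trade-off is the one noted above: an alarm may wait in the queue before it goes out.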

Usage Scenario

Task alarm

Related issues

No response

MonsterChenzhuo commented 1 year ago

+1, good suggestion. I have one suggestion of my own: standard industry practice is to expose configurable parameters, such as the maximum number of alarms sent within 30 minutes, to avoid unlimited alarms. I think these rate-limiting measures should be optional for the user rather than a system default.
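As a rough illustration of such a parameter, the sketch below caps the number of sends inside a sliding time window. The class `AlarmRateLimit`, its constructor arguments, and the convention that `maxSends <= 0` switches limiting off are assumptions for this example, not existing StreamPark settings.

```java
import java.time.Duration;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a user-configurable "at most N alarms per window" rule.
class AlarmRateLimit {
    private final int maxSends;        // assumed convention: <= 0 means "no limit"
    private final long windowMillis;
    private final Deque<Long> sendTimes = new ArrayDeque<>();

    AlarmRateLimit(int maxSends, Duration window) {
        this.maxSends = maxSends;
        this.windowMillis = window.toMillis();
    }

    // Returns true if an alarm may be sent at nowMillis, and records the send.
    synchronized boolean tryAcquire(long nowMillis) {
        if (maxSends <= 0) {
            return true; // limiting is optional and disabled here
        }
        // Forget sends that have fallen out of the window.
        while (!sendTimes.isEmpty() && nowMillis - sendTimes.peekFirst() >= windowMillis) {
            sendTimes.pollFirst();
        }
        if (sendTimes.size() >= maxSends) {
            return false;
        }
        sendTimes.addLast(nowMillis);
        return true;
    }
}
```

A caller would pass `System.currentTimeMillis()` as `nowMillis`. Taking the time as an argument is only a choice that makes the rule easy to test.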

MonsterChenzhuo commented 1 year ago
[Image] [Image]

The pictures above are from a very powerful alerting tool used in the industry; I think they can serve as a reference.

MonsterChenzhuo commented 1 year ago

I think the core point is that we should let users configure alerting rules rather than hard-coding them as defaults.

xujiangfeng001 commented 1 year ago

Thank you for your suggestion. I think adding alarm rules is a good proposal. I have also heard some requirements for alarm rules, so I will think it over and refine this idea.

ziqiang-wang commented 1 year ago

I'd like to add a small feature: when taking an alarm that is waiting to be sent from the blocking queue, also check the number of elements in the queue. If it exceeds a certain threshold, first send an alarm about the backlog itself, telling the user that alarms are backing up and how many are waiting.
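A minimal sketch of this backlog check might look like the following. The names `BacklogAwareSender` and `sendNext`, the message text, and the choice to count the alarm just taken as part of the backlog are all assumptions for illustration.

```java
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

// Hypothetical sketch: send a backlog notice before the alarm when too many are waiting.
class BacklogAwareSender {

    // Takes one alarm from the queue without blocking and sends it.
    // If the backlog has reached the threshold, a notice goes out first.
    // Returns false when the queue was empty and nothing was sent.
    static boolean sendNext(BlockingQueue<String> pending, int threshold, Consumer<String> channel) {
        String alarm = pending.poll();
        if (alarm == null) {
            return false;
        }
        // Assumed convention: the backlog includes the alarm we just took.
        int backlog = pending.size() + 1;
        if (backlog >= threshold) {
            channel.accept("Alarm backlog: " + backlog + " alarms waiting to be sent");
        }
        channel.accept(alarm);
        return true;
    }
}
```

One caveat: the notice is itself a message and counts against the channel's limit, so a real version would probably send it at most once per interval rather than on every call.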

xujiangfeng001 commented 1 year ago

Thank you for your proposal. I'll take it into consideration.