2i2c-org / features

Temporary location for feature requests sent to 2i2c
BSD 3-Clause "New" or "Revised" License

Explore using billing alerting infrastructure directly from the cloud provider #13

Open choldgraf opened 2 years ago

choldgraf commented 2 years ago

Context

In a recent incident there was some cloud infrastructure running in the background that we did not track with our Grafana dashboards (because it was on an old cluster).

We have an issue to track using Grafana for cloud provider alerting (https://github.com/2i2c-org/infrastructure/issues/1288). However, this would not have caught this problem because it was outside of Grafana's scope.

Each cloud provider also tends to provide its own billing monitoring and alerting infrastructure. For example, you can trigger emails or warnings at certain spend levels, and you can even automatically trigger some actions, such as cluster shutdown.

For example:

We may also be able to automate this process. For example:

One of the biggest concerns that researchers have with the cloud is the "hidden and ballooning costs" problem, so we need to do whatever we can to reduce this uncertainty for others.

Proposal

For each of the clusters that we deploy, we should also use the cloud provider's cost management and alerting system, in order to warn us when unexpected amounts of spending occur. We can define the specific rules in collaboration with Community Representatives, but they could be something like:
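To make the idea of spend-based rules concrete, here is a minimal sketch in Python of how threshold rules could be evaluated against a monthly budget. Everything in it is an assumption for illustration: the `BudgetRule` type, the `triggered_actions` helper, the 50% / 90% / 100% thresholds, and the action names are hypothetical and do not correspond to any cloud provider's API. In practice the rules would be configured in the provider's own budgeting tool, with thresholds agreed with Community Representatives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetRule:
    """Hypothetical rule: take `action` once spend reaches `fraction` of the budget."""
    fraction: float
    action: str


# Hypothetical thresholds and actions, for illustration only.
RULES = [
    BudgetRule(0.5, "email"),
    BudgetRule(0.9, "email"),
    BudgetRule(1.0, "page-on-call"),
]


def triggered_actions(spend: float, budget: float, rules=RULES) -> list[str]:
    """Return a description of every rule whose threshold the current spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    ratio = spend / budget
    return [
        f"{rule.action} at {rule.fraction:.0%}"
        for rule in sorted(rules, key=lambda r: r.fraction)
        if ratio >= rule.fraction
    ]
```

For example, `triggered_actions(950, 1000)` reports the two email rules but not the on-call page, because spend has passed 90% of the budget without reaching 100%.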

Updates and actions

No response