-
This is to investigate beyond what can be offered by adding a health-check endpoint (#1015) and using tools to collect the existing logs and basic box metrics (#1016).
The aims of the additional tracin…
-
**Do you want to request a *feature* or report a *bug*?**
bug
**What is the current behavior?**
When there is an error (e.g. a timeout) on firehose, graph-node waits 2 minutes and then tries agai…
-
It would be nice if the shell-operator could call certain hooks when a Prometheus alert fires.
The Prometheus Alertmanager has support for calling [webhooks](https://prometheus.io/docs/alerting/la…
-
### Proposal
I have received a lot of feedback on use cases that need a large number of rules (20K+). To support efficient display of rules in UIs like Grafana, I think we need some form of pagi…
-
**Is your feature request related to a problem? Please describe.**
One day I got an alert:
![image](https://user-images.githubusercontent.com/7119703/105685138-5e5d4c00-5ebb-11eb-8b98-c3183f8ea…
-
While the title basically says it all, I will try to back this up using a concrete example. Imagine a setup where in addition to all routers being a target for some type of `blackbox_exporter`-style m…
-
Hi,
I'm looking for a way to notify my team every time the chaos bot starts performing actions.
As Slack is widely used, that would be my preference.
I want to start and implement that ca…
-
Hi!
I'm currently working on an automated setup for ASBs, and while it's great that
it has some neat trace-based logging, to derive dashboards and set up alerts,
one needs to go around parsing th…
-
Please add Prometheus metrics so that Prometheus can check for outdated containers and Alertmanager can handle notifications.
Similar rationale as the related [watchtower request](https://github.com/co…
-
Is there a clear rationale documented somewhere for why adding new attributes to a metric is a breaking change?
[This PR comment](https://github.com/open-telemetry/opentelemetry-specification/pull…