canonical / grafana-agent-operator

This charmed operator automates the operational procedures of running Grafana Agent, an open-source telemetry collector.
https://charmhub.io/grafana-agent
Apache License 2.0

Refine HostInterfaceMTUSize alert rule #107

Open facundofc opened 1 month ago

facundofc commented 1 month ago

Enhancement Proposal

The charm ships an alert rule (HostInterfaceMTUSize) that fires when the MTU of a network interface changes. The rule does not filter out any interface, so it also triggers on changes to br-int's MTU. More details about the br-int interface can be read in this issue's comment, but the bottom line is that its MTU is irrelevant. When users create networks with different MTUs, the interface's MTU changes, and this does not indicate any issue.

I can think of only two ways of solving this, but more ideas are welcome of course.

  1. Add a configuration property to the charm to specify a set of interfaces to be ignored by the rule.
  2. Make the charm automatically detect which interfaces are relevant to monitor. This could be determined by reading the output of ip -j -d link and looking for a certain set of properties (like .linkinfo.info_kind == "openvswitch" and the like).
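
A rough sketch of the detection step in alternative 2, assuming the charm has already captured the output of ip -j -d link as a string. The function name and the IGNORED_KINDS set are hypothetical; only the "openvswitch" kind comes from the proposal above, and any further kinds would have to be decided separately.

```python
import json

# Hypothetical set of link kinds to skip; only "openvswitch" is named in the proposal.
IGNORED_KINDS = frozenset({"openvswitch"})


def interfaces_to_ignore(ip_link_json, ignored_kinds=IGNORED_KINDS):
    """Return the sorted names of interfaces whose .linkinfo.info_kind is ignored.

    ip_link_json is the raw JSON text printed by `ip -j -d link`.
    Interfaces without a linkinfo object (e.g. plain physical NICs) are kept.
    """
    names = []
    for link in json.loads(ip_link_json):
        kind = link.get("linkinfo", {}).get("info_kind")
        if kind in ignored_kinds:
            names.append(link["ifname"])
    return sorted(names)
```

The charm would still need to notice when this list changes and re-render the rule, which is the concern raised about this alternative.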

I see issues with both approaches, namely:

  1. This sets a precedent that could make the charm's list of config properties grow quite long, since this is a very general charm ("monitor a host").
  2. I'm not sure how often this could happen, but the list of interfaces to be ignored could change. This brings the need to detect that and re-render the rule.

Personally, I would go for alternative 1. Likely there are more interfaces we want to ignore, like TAP interfaces created for the VMs and so on.
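
For alternative 1, a minimal sketch of how a comma-separated config value could be turned into label matchers for the rule expression. The function name and the config format are assumptions, not part of the charm today; the device label is the one used in the workaround silence. The sketch handles exact interface names only, so matching a family of interfaces (such as all TAP devices) would need a regex matcher instead.

```python
def device_exclusion_matchers(ignored_interfaces):
    """Turn 'br-int, tap0' into 'device!="br-int",device!="tap0"'.

    ignored_interfaces is a hypothetical comma-separated config value.
    The result is a fragment meant to be placed inside the {...} selector
    of the alert rule's expression; an empty value yields an empty string.
    """
    names = [name.strip() for name in ignored_interfaces.split(",") if name.strip()]
    for name in names:
        # Reject characters that would break out of the quoted label value.
        if '"' in name or "\\" in name:
            raise ValueError(f"unsupported interface name: {name!r}")
    return ",".join(f'device!="{name}"' for name in names)
```

Using one != matcher per interface avoids regex escaping entirely, at the cost of a longer selector when many interfaces are listed.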

Our immediate workaround is to add a very long silence matching device != "br-int", but this is far from ideal.