Closed fffonion closed 3 years ago
The config_hash metric is useful for catching "DP has inconsistent configs across the cluster for x time".
But this will create a new metric every time the config is flipped, so the time series is not continuous.
We need to verify whether that will cause trouble in alerting. For example, I can imagine we have a
count(kong_dataplane_last_seen)
for the expected data plane count.
But this will create a new metric every time the config is flipped, so the time series is not continuous.
I'm not sure I understand. Why would it be so?
If we put config_hash into the metric as a label, you would see, for example,
dataplane_last_seen{node_id="UUID", config_hash="hash1"}
before the flip, and
dataplane_last_seen{node_id="UUID", config_hash="hash2"}
after it. There will be two differently colored lines in the Prometheus graph, even though both refer to the same data plane node.
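To make the label problem concrete, here is a minimal sketch in plain Python (not Kong or Prometheus client code). It assumes the usual Prometheus model in which a time series is identified by its metric name plus its full label set; the `series_key` and `record` helpers and the sample values are hypothetical and exist only for illustration.

```python
from typing import Dict, FrozenSet, Tuple

# A series is identified by its metric name plus its complete label set.
SeriesKey = Tuple[str, FrozenSet[Tuple[str, str]]]

samples: Dict[SeriesKey, float] = {}


def series_key(name: str, **labels: str) -> SeriesKey:
    """Build the identity of a series from its name and labels (hypothetical helper)."""
    return (name, frozenset(labels.items()))


def record(name: str, value: float, **labels: str) -> None:
    """Store the latest sample for the series with this name and label set."""
    samples[series_key(name, **labels)] = value


# Same data plane node, before and after a config flip.
record("dataplane_last_seen", 1000.0, node_id="UUID", config_hash="hash1")
record("dataplane_last_seen", 1060.0, node_id="UUID", config_hash="hash2")

series_count = len(samples)
node_count = len({dict(labels)["node_id"] for _, labels in samples})

print(series_count)  # 2: the flip produced a second, separate series
print(node_count)    # 1: both series describe the same node
```

Whether this affects a particular alert depends on the query and on how Prometheus treats the series that stopped being exported, which this sketch does not model.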
So @wyndigo's idea is to turn the config_hash, which is an MD5 hex string, into its numeric value and expose that as the metric value. Then it's no longer a label, and we can still compare it across DPs to spot differences.
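A rough sketch of that conversion, in Python rather than the PR's actual code: the function name `config_hash_value` and the sample config payloads are assumptions made for illustration. One caveat worth keeping in mind is that Prometheus sample values are 64-bit floats, so a 128-bit MD5 value cannot be represented exactly. The result is still usable for "same or different" comparisons, with a very small chance that two different hashes map to the same float.

```python
import hashlib


def config_hash_value(config_hash_hex: str) -> float:
    """Turn an MD5 hex string into a number usable as a metric value.

    Hypothetical helper: the 128-bit integer is rounded to the nearest
    float64, so only about the top 53 bits of the hash are preserved.
    """
    return float(int(config_hash_hex, 16))


# Made-up config payloads standing in for two config versions.
hash1 = hashlib.md5(b"config version 1").hexdigest()
hash2 = hashlib.md5(b"config version 2").hexdigest()

dp_a = config_hash_value(hash1)  # data plane A, on config version 1
dp_b = config_hash_value(hash1)  # data plane B, same config as A
dp_c = config_hash_value(hash2)  # data plane C, on a different config

print(dp_a == dp_b)  # same config gives the same value
print(dp_a == dp_c)  # different config should give a different value
```

With the hash carried as the value, the label set for a node stays the same across flips, so the series for that node stays continuous while its value changes.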
This PR adds a series of metrics that expose information about connected Data Planes on the Control Plane side.
sample output