Closed slopezz closed 2 years ago
I tried the workaround of deleting the role label directly from every metric, so that the role of each instance would be exposed through a specific metric carrying that information.
However, it did not work. Upon a failover with a role change, it only worked OK for metrics that are always available for all instances, independently of the role (so always a total of 6 timeseriesDBs because there are 6 instances), like:
While metrics:
So:
So we need to reset metrics upon a failover with a role change, so that only true metrics are reported, not obsolete ones.
While working on sentinel grafana dashboard https://github.com/3scale-ops/saas-operator/pull/197 I discovered a couple of bugs:
Bug 2: standard metrics being reported ad infinitum when a role changes
All standard metrics are retrieved from sentinel commands.
For every redis-server on a given shard with a given role, there is a given timeseries database (a tuple with 3 elements: shard--role--redis-server).
However, when the role of a given redis-server changes, a new timeseriesDB is created for the new shard--role2--redis-server tuple, but the old shard--role1--redis-server timeseriesDB keeps being reported with its latest value, although this tuple does not exist anymore.
Example
This redis server 10.65.6.5 from shard01 was a master a long time ago (in yellow) but it is now a slave (blue).
Another example is the role reported time:
And this applies to any metric where the role label is added in https://github.com/3scale-ops/saas-operator/blob/8aafa688780f86dcb013bf8bf7fe884e1bf44d43/pkg/redis/metrics/sentinel_metrics.go
Workaround
A workaround would be to remove the role label from every metric, so the timeseries would include the tuple shard--redis-server (instead of shard--role--redis-server). However, since these are sentinel metrics, I think it makes sense to always have this role label.
Ideal solution
IMO, the ideal solution would be to stop reporting metrics whose tuple shard--role--redis-server is not active anymore.
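A minimal sketch of that idea, using only the Go standard library: on every poll, the set of reported series is reconciled against the tuples sentinel currently reports, and any shard--role--redis-server tuple that is no longer observed is dropped. The names seriesKey, gaugeSet and Reconcile are hypothetical and are not taken from the saas-operator code. With the Prometheus Go client the same effect could presumably be achieved by calling Reset or DeleteLabelValues on the GaugeVec before setting the fresh values.

```go
package main

import (
	"fmt"
	"sort"
)

// seriesKey is the shard--role--redis-server tuple that identifies one timeseries.
// (Hypothetical type, used only for this illustration.)
type seriesKey struct {
	Shard  string
	Role   string
	Server string
}

// gaugeSet is a hypothetical stand-in for a labelled gauge vector.
type gaugeSet struct {
	values map[seriesKey]float64
}

func newGaugeSet() *gaugeSet {
	return &gaugeSet{values: make(map[seriesKey]float64)}
}

// Reconcile makes the stored series match exactly the observed ones.
// It returns the stale tuples that were dropped, sorted for stable output.
func (g *gaugeSet) Reconcile(observed map[seriesKey]float64) []seriesKey {
	var dropped []seriesKey
	for k := range g.values {
		if _, ok := observed[k]; !ok {
			dropped = append(dropped, k)
			delete(g.values, k) // deleting while ranging over a map is safe in Go
		}
	}
	for k, v := range observed {
		g.values[k] = v
	}
	sort.Slice(dropped, func(i, j int) bool {
		a, b := dropped[i], dropped[j]
		if a.Shard != b.Shard {
			return a.Shard < b.Shard
		}
		if a.Server != b.Server {
			return a.Server < b.Server
		}
		return a.Role < b.Role
	})
	return dropped
}

func main() {
	g := newGaugeSet()

	// First poll: 10.65.6.5 is the master of shard01.
	g.Reconcile(map[seriesKey]float64{
		{Shard: "shard01", Role: "master", Server: "10.65.6.5"}: 1,
	})

	// Second poll, after a failover: the same server is now a slave.
	dropped := g.Reconcile(map[seriesKey]float64{
		{Shard: "shard01", Role: "slave", Server: "10.65.6.5"}: 1,
	})

	fmt.Println("dropped stale series:", dropped)
	fmt.Println("active series count:", len(g.values))
}
```

After the second poll only the slave tuple remains, so the old master series is no longer reported with its last value.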
master
to slave
role, what we should see is: master
and now is a slave