acassen / keepalived

Keepalived
https://www.keepalived.org
GNU General Public License v2.0

SNMP indication of last state change #1938

Closed candlerb closed 3 years ago

candlerb commented 3 years ago

Is your feature request to resolve a problem or provide enhanced functionality? Please describe.
I would like a way to reliably detect short-lived changes between master and backup when polling at reasonable intervals (e.g. 1 minute).

Describe the solution you would like
An SNMP gauge which shows the sysUpTime when the last vrrpInstanceState change occurred (similar to ifLastChange in the IF-MIB).

Describe alternatives you have considered
An SNMP counter which counts state changes would also achieve this goal (similar to ifInErrors). That would also provide information about the rate of state changes, which could be interesting.

I could write a script which touches a file on state changes, and then monitor the timestamp of that file. That seems messy.

Would the feature request be of benefit only to you, or is it more generally applicable?
I think that detecting short-lived master/backup failovers is generally useful, as these may indicate some underlying transient network issue which needs investigating.

Keepalived version
Keepalived v2.0.19 (10/19,2019) from Ubuntu 20.04

Additional context
I am monitoring keepalived via prometheus snmp_exporter, so having native prometheus support with this feature would also work for me (#1731).

pqarmitage commented 3 years ago

The RFC MIBs provide vrrpStatsBecomeMaster (RFC 2787 for VRRPv2) and vrrpv3StatisticsMasterTransitions (RFC 6527 for VRRPv3), which are counters of the number of transitions to master state. These would appear to meet your needs.
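
As a rough illustration of how a poller could use such a counter: any increase between two samples means at least one transition to master happened in the interval, even if the state looks the same at both polls. This is a minimal Python sketch, not part of keepalived. The function names `counter_delta` and `became_master_since` are hypothetical, and it assumes the two readings are successive Counter32 values of vrrpStatsBecomeMaster with at most one wrap between them.

```python
# Counter32 values wrap around after reaching 2**32 - 1.
COUNTER32_MODULUS = 2 ** 32


def counter_delta(previous: int, current: int) -> int:
    """Return the increase between two Counter32 samples, allowing one wrap."""
    return (current - previous) % COUNTER32_MODULUS


def became_master_since(previous: int, current: int) -> bool:
    """True if at least one transition to master happened between the samples."""
    return counter_delta(previous, current) > 0
```

A restart of the SNMP agent resets the counter, which this sketch would misread as a large increase, so a real poller would probably also compare sysUpTime between samples.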

candlerb commented 3 years ago

Indeed they would. I have been using KEEPALIVED-MIB, and I was unaware that keepalived supports VRRP-MIB, but indeed it does:

# snmpbulkwalk x.x.x.x VRRP-MIB::vrrpRouterStatsTable
VRRP-MIB::vrrpStatsBecomeMaster.2.35 = Counter32: 1
VRRP-MIB::vrrpStatsAdvertiseRcvd.2.35 = Counter32: 0
VRRP-MIB::vrrpStatsAdvertiseIntervalErrors.2.35 = Counter32: 0
VRRP-MIB::vrrpStatsAuthFailures.2.35 = Counter32: 0
VRRP-MIB::vrrpStatsIpTtlErrors.2.35 = Counter32: 0
VRRP-MIB::vrrpStatsPriorityZeroPktsRcvd.2.35 = Counter32: 0
VRRP-MIB::vrrpStatsPriorityZeroPktsSent.2.35 = Counter32: 0
VRRP-MIB::vrrpStatsInvalidTypePktsRcvd.2.35 = Counter32: 0
VRRP-MIB::vrrpStatsAddressListErrors.2.35 = Counter32: 0
VRRP-MIB::vrrpStatsInvalidAuthType.2.35 = Counter32: 0
VRRP-MIB::vrrpStatsAuthTypeMismatch.2.35 = Counter32: 0
VRRP-MIB::vrrpStatsPacketLengthErrors.2.35 = Counter32: 0

Thanks!
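
For anyone scripting against this output rather than using snmp_exporter, the lines above can be parsed along these lines. This is only a sketch under assumptions: the regular expression and the `parse_counters` helper are hypothetical, and it expects exactly the symbolic `VRRP-MIB::name.index = Counter32: value` format shown in the walk above. In VRRP-MIB the index is the interface index followed by the VRID, so `2.35` here would be ifIndex 2, VRID 35.

```python
import re

# Matches lines such as:
#   VRRP-MIB::vrrpStatsBecomeMaster.2.35 = Counter32: 1
LINE_RE = re.compile(
    r"^VRRP-MIB::(?P<name>\w+)\.(?P<index>[\d.]+) = Counter32: (?P<value>\d+)$"
)


def parse_counters(text: str) -> dict:
    """Map (object name, index) to the integer counter value."""
    counters = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:  # lines in any other format are silently skipped
            counters[(match["name"], match["index"])] = int(match["value"])
    return counters


sample = """\
VRRP-MIB::vrrpStatsBecomeMaster.2.35 = Counter32: 1
VRRP-MIB::vrrpStatsAdvertiseRcvd.2.35 = Counter32: 0
"""
counters = parse_counters(sample)
print(counters[("vrrpStatsBecomeMaster", "2.35")])  # prints 1
```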