WaterByWind / grafana-dashboards

Grafana Dashboards
MIT License

sysUpTime looks incorrect. #27

Open MaurUppi opened 10 months ago

MaurUppi commented 10 months ago

I'm using EdgeMAX EdgeRouter 4 v2.0.9-hotfix.2

Here is the uptime shown on the device. image

The Uptime panel in your Grafana dashboard returns 1882461 for the query below.

`SELECT "sysUpTime" / 100 FROM "snmp.EdgeOS" WHERE ("agent_host" =~ /^$host$/) AND $timeFilter`

image
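As a sanity check, the query result can be converted by hand. This is a minimal Python sketch, assuming 1882461 is already in seconds (TimeTicks are hundredths of a second, and the query divides by 100). The helper name `format_uptime` is hypothetical and not part of the dashboard.

```python
def format_uptime(seconds: int) -> str:
    """Break a whole number of seconds into days, hours, minutes, seconds."""
    days, rem = divmod(seconds, 86_400)   # 86,400 seconds per day
    hours, rem = divmod(rem, 3_600)       # 3,600 seconds per hour
    minutes, secs = divmod(rem, 60)
    return f"{days} days, {hours:02d}:{minutes:02d}:{secs:02d}"


# Value returned by the dashboard query (assumed to be seconds).
print(format_uptime(1882461))  # 21 days, 18:54:21
```

If that assumption holds, the raw number is roughly 21.8 days, so the query itself may be fine and the panel's display unit could be what makes it look wrong.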

So which part is causing this issue?

Relevant Telegraf config:

  [[inputs.snmp.field]]
     name = "sysUpTime"
     oid = "HOST-RESOURCES-MIB::hrSystemUptime.0"

The HOST-RESOURCES-MIB.txt file in the Telegraf Docker container was copied from the EdgeRouter's /usr/share/snmp/mibs folder.

hrSystemUptime OBJECT-TYPE
    SYNTAX     TimeTicks
    MAX-ACCESS read-only
    STATUS     current
    DESCRIPTION
        "The amount of time since this host was last
        initialized.  Note that this is different from
        sysUpTime in the SNMPv2-MIB [RFC1907] because
        sysUpTime is the uptime of the network management
        portion of the system."
    ::= { hrSystem 1 }

I also tried querying the SNMP OIDs directly.

The OID definitions below come from https://github.com/grafana/jsonnet-libs/tree/master/ubnt-edgerouter-mixin

https://github.com/grafana/jsonnet-libs/blob/02db06f540086fa3f67d487bd01e1b314853fb8f/ubnt-edgerouter-mixin/snmp_generator/snmp.yml#L1036C25-L1036C25

  - name: hrSystemUptime
    oid: 1.3.6.1.2.1.25.1.1
    type: gauge
    help: The amount of time since this host was last initialized - 1.3.6.1.2.1.25.1.1

OID 1.3.6.1.2.1.25.1.1 corresponds to HOST-RESOURCES-MIB::hrSystemUptime. hrSystemUptime is similar to sysUpTime in the SNMP MIB-II, but it's part of the Host Resources MIB. It represents the amount of time since the host was last initialized.

telegraf@abf4573b2ca9:/$ snmpget -v2c -c ouzycn 192.168.1.253 1.3.6.1.2.1.25.1.1.0
iso.3.6.1.2.1.25.1.1.0 = Timeticks: (188642200) 21 days, 20:00:22.00

telegraf@abf4573b2ca9:/$ snmpget -v2c -c ouzycn 192.168.1.253 HOST-RESOURCES-MIB::hrSystemUptime.0
HOST-RESOURCES-MIB::hrSystemUptime.0 = Timeticks: (188642537) 21 days, 20:00:25.37
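To compare the snmpget output with the dashboard value, the raw tick count in parentheses can be pulled out and divided by 100. This is a rough Python sketch: the regex and the helper name `timeticks_to_seconds` are my own, and it assumes the net-snmp output format shown above.

```python
import re

# Matches the raw tick count in net-snmp output, e.g. "Timeticks: (188642200)".
TIMETICKS_RE = re.compile(r"Timeticks: \((\d+)\)")


def timeticks_to_seconds(snmpget_line: str) -> float:
    """Extract the TimeTicks value from one snmpget line and convert to seconds."""
    match = TIMETICKS_RE.search(snmpget_line)
    if match is None:
        raise ValueError(f"no Timeticks value found in: {snmpget_line!r}")
    return int(match.group(1)) / 100  # TimeTicks are hundredths of a second


line = "iso.3.6.1.2.1.25.1.1.0 = Timeticks: (188642200) 21 days, 20:00:22.00"
seconds = timeticks_to_seconds(line)
print(seconds)            # 1886422.0
print(seconds - 1882461)  # 3961.0
```

The gap to the dashboard value is about 66 minutes, which would fit the dashboard sample simply having been taken earlier, though I have not confirmed that.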

  - name: sysUpTime
    oid: 1.3.6.1.2.1.1.3
    type: gauge
    help: The time (in hundredths of a second) since the network management portion
      of the system was last re-initialized. - 1.3.6.1.2.1.1.3

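iso.3.6.1.2.1.1.3.0 is the full OID for the sysUpTime object, representing the time since the network management portion of the system (the SNMP agent) was last re-initialized, not the time since the last system start-up.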

telegraf@abf4573b2ca9:/$ snmpget -v2c -c ouzycn 192.168.1.253 1.3.6.1.2.1.1.3.0
iso.3.6.1.2.1.1.3.0 = Timeticks: (6095747) 16:55:57.47

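Based on the information below, I think your use of hrSystemUptime is correct.

The difference between the OIDs 1.3.6.1.2.1.25.1.1 and 1.3.6.1.2.1.1.3.0 in SNMP lies in their origin and specific use cases:

  1. OID 1.3.6.1.2.1.1.3.0 (sysUpTime): defined in the SNMPv2-MIB; the time since the network management portion of the system was last re-initialized.
  2. OID 1.3.6.1.2.1.25.1.1 (hrSystemUptime): defined in the HOST-RESOURCES-MIB; the time since the host was last initialized.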

In summary, while both OIDs report uptime, sysUpTime is specific to the network management subsystem and can reset independently of the host system, whereas hrSystemUptime is specific to the host system's uptime and resets only when the entire system restarts. This distinction is important when monitoring the uptime for network devices vs. the actual hosts/servers they are part of.
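The two readings above can also be subtracted to estimate how long after boot the SNMP agent was last restarted. This is only a sketch: the constants are the tick counts from my snmpget output, the function name is made up, and the two values were read a few seconds apart, so the result is approximate.

```python
# Tick counts taken from the snmpget output in this issue (hundredths of a second).
HR_SYSTEM_UPTIME_TICKS = 188642537  # HOST-RESOURCES-MIB::hrSystemUptime.0
SYS_UPTIME_TICKS = 6095747          # sysUpTime (1.3.6.1.2.1.1.3.0)


def agent_restart_offset_seconds(host_ticks: int, agent_ticks: int) -> float:
    """Seconds between host boot and the last SNMP agent (re)start."""
    if agent_ticks > host_ticks:
        raise ValueError("agent uptime exceeds host uptime; readings are inconsistent")
    return (host_ticks - agent_ticks) / 100


offset = agent_restart_offset_seconds(HR_SYSTEM_UPTIME_TICKS, SYS_UPTIME_TICKS)
print(offset)  # 1825467.9
```

That is a little over 21 days after boot, which suggests the agent was restarted about 17 hours before I ran the commands while the router itself kept running.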