kentik / snmp-profiles

SNMP Profiles for ktranslate
Apache License 2.0

SysOID for generic Linux server is matched to physical Dell server #36

Closed dgcom closed 2 years ago

dgcom commented 3 years ago

Any generic Linux server running the Net-SNMP agent returns the SysOID 1.3.6.1.4.1.8072.3.2.10 (https://oidref.com/1.3.6.1.4.1.8072.3.2.10). This currently matches dell-poweredge.yml, which describes a physical Dell server. That is not correct: the host can be a VM or any generic hardware. The profile for this SysOID should really pull info from the generic MIBs that exist on a default, plain-vanilla Linux install. The other question is how this will be represented in New Relic. Ideally, the profile should pull most of the info needed to represent a real infrastructure host. Currently, it fails to pull any useful counter into New Relic at all.

Mesverrum commented 3 years ago

Just my thoughts, I'm not the decision maker on this. I think what we need to do is drop the dell-poweredge profile from the repo and redirect the people using it to the New Relic infrastructure agent. If they want hardware info, they should get it from the iDRAC. SNMP is not our recommended strategy for server monitoring in the big picture.

Alternatively, for people who insist on using SNMP to monitor servers, we would convert that poweredge profile to a generic physical Linux server profile and make it the place to include hardware health sensors for all server brands. These OIDs only exist on servers where the HP IM, Dell OM, or similar agents are running, so it is pretty situational.
I was told that @i3149 had built some logic into ktranslate to stop polling unresponsive OIDs, so it shouldn't be too noisy to include them all together; the extra ones would just drop.

As for how it is represented in NR, at this point it looks like there is no entity synthesis definition for a physical server. If you didn't know already, the provider field in the profile gets mapped to an entity definition in the NR repo: https://github.com/newrelic/entity-definitions. I do see ext-host, but at this time it is only set up for forwarding Datadog agent info to NR. Something could be created, or that one updated to cover both data sources, but I don't think we really want to encourage this usage pattern. The infra agent is just a hundred times more powerful than net-snmp.

dgcom commented 3 years ago

Yes, I think a generic Linux profile is the way to go. Installing the infrastructure agent is better, but I should have added a valid use case where that is not possible: vendor-provided Linux-based appliances, often running on custom hardware or as a VM. In most cases the vendors won't allow installing any third-party software and refer customers to SNMP for monitoring.

And thanks for pointing me to the entity definitions, quite useful. I'd hope that a definition for a generic Linux host can be created for such use cases.
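
For illustration only, a definition of the kind hoped for here might look roughly like the sketch below. It assumes the general layout used in the newrelic/entity-definitions repo (a domain, a type, and a synthesis section whose conditions match on an attribute). The type name, the identifier attribute, and the provider value are all hypothetical placeholders, not taken from that repo or from any merged profile, and the exact schema should be checked against the repo before use.

```yaml
# Hypothetical entity definition sketch; not an existing definition.
# Field layout is assumed from the newrelic/entity-definitions conventions.
domain: EXT
type: LINUX_SNMP_HOST            # placeholder type name
synthesis:
  name: device_name              # assumed attribute carrying the host name
  identifier: device_name        # assumed attribute used as the unique id
  conditions:
    # Match telemetry whose provider equals the profile's `provider` field.
    - attribute: provider
      value: kentik-linux-host   # placeholder provider value
```

The key point is the condition on `provider`: whatever string the SNMP profile sets there is what a definition would need to match for the host to be synthesized as an entity.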

thezackm commented 2 years ago

closed with PR #72

dgcom commented 2 years ago

Interesting... I was actually about to post the profile I have used for a generic Linux host with a fully configured Net-SNMP agent - see below. I am still not sure if kentik-net-snmp will be recognized as a host in New Relic (it probably should).


# Generic profile for Linux running NetSNMP agent
---
extends:
  - system-mib.yml
  - if-mib.yml
  - ip-mib.yml
  - tcp-mib.yml
  - udp-mib.yml
  - ucd-mib.yml
  - host-resources-mib.yml

provider: kentik-linux-netsnmp

# disable the SNMP bulk walk operation for these devices
no_use_bulkwalkall: true

# SNMPv2-MIB::sysObjectID.0 = OID: NET-SNMP-MIB::netSnmpAgentOIDs.10
sysobjectid:
  - 1.3.6.1.4.1.8072.3.2.10

metrics:
# system mib
# There are more SNMP metrics available
  - MIB: SNMPv2-MIB
    symbol:
      OID: 1.3.6.1.2.1.11.29.0
      name: snmpOutTraps
# IF-MIB - good coverage
# IP-MIB - captures tables, may need some additional scalars, ex. ipOutNoRoutes
# TCP-MIB - connection table can be useful
# UDP-MIB - udpHCInDatagrams/udpHCOutDatagrams - may need to be replaced with udpInDatagrams/udpOutDatagrams
# UCD-SNMP-MIB and UCD-DISKIO-MIB - good coverage
# HOST-RESOURCES-MIB - possible add: hrMemorySize
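
The trailing comments in the profile name a few scalars that might be worth adding. A minimal sketch of what those entries could look like is below, written in the same style as the snmpOutTraps entry. The OIDs are the standard ones from IP-MIB, UDP-MIB, and HOST-RESOURCES-MIB, with the `.0` instance suffix kept to match the entry above. This is untested against ktranslate, and some of these may already be covered by the extended MIB profiles.

```yaml
# Hypothetical additions to the `metrics:` list; untested against ktranslate.
metrics:
  # IP-MIB: datagrams discarded because no route was found
  - MIB: IP-MIB
    symbol:
      OID: 1.3.6.1.2.1.4.12.0
      name: ipOutNoRoutes
  # UDP-MIB: 32-bit counters, as a fallback where the HC variants are missing
  - MIB: UDP-MIB
    symbol:
      OID: 1.3.6.1.2.1.7.1.0
      name: udpInDatagrams
  - MIB: UDP-MIB
    symbol:
      OID: 1.3.6.1.2.1.7.4.0
      name: udpOutDatagrams
  # HOST-RESOURCES-MIB: total physical memory, in KBytes
  - MIB: HOST-RESOURCES-MIB
    symbol:
      OID: 1.3.6.1.2.1.25.2.2.0
      name: hrMemorySize
```

Note that the 32-bit UDP counters wrap much sooner than the 64-bit udpHCInDatagrams/udpHCOutDatagrams, so they are only a substitute on agents that do not expose the HC variants.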