Open desultory opened 4 months ago
it could be something related to the device? maybe it does something on 5m intervals that slow things down. I've been watching the network and I don't see anything unusual happening at that time, and mostly see the exporters talking to the device. There's a small firmware revision available for that device, but it has no mention of SNMP changes or anything that could affect this.
Yes, that very much seems like something about the device.
Best you can do is maybe tcpdump/wireshark the scrape to compare send and receive times to look for a response latency problem with the device.
Yes, that very much seems like something about the device.
Best you can do is maybe tcpdump/wireshark the scrape to compare send and receive times to look for a response latency problem with the device.
I agree, but debug info about which oid was retried would be great info here.
I'm using a cut down version of the IF-MIB config, not sure if that is impacting this somehow:
if_mib:
walk:
- 1.3.6.1.2.1.2
- 1.3.6.1.2.1.31.1.1
metrics:
- name: ifNumber
oid: 1.3.6.1.2.1.2.1
type: gauge
help: The number of network interfaces (regardless of their current state) present
on this system. - 1.3.6.1.2.1.2.1
- name: ifIndex
oid: 1.3.6.1.2.1.2.2.1.1
type: gauge
help: A unique value, greater than zero, for each interface - 1.3.6.1.2.1.2.2.1.1
indexes:
- labelname: ifIndex
type: gauge
lookups:
- labels:
- ifIndex
labelname: ifAlias
oid: 1.3.6.1.2.1.31.1.1.1.18
type: DisplayString
- labels:
- ifIndex
labelname: ifDescr
oid: 1.3.6.1.2.1.2.2.1.2
type: DisplayString
- labels:
- ifIndex
labelname: ifName
oid: 1.3.6.1.2.1.31.1.1.1.1
type: DisplayString
- name: ifType
oid: 1.3.6.1.2.1.2.2.1.3
type: EnumAsInfo
help: The type of interface - 1.3.6.1.2.1.2.2.1.3
indexes:
- labelname: ifIndex
type: gauge
lookups:
- labels:
- ifIndex
labelname: ifAlias
oid: 1.3.6.1.2.1.31.1.1.1.18
type: DisplayString
- labels:
- ifIndex
labelname: ifDescr
oid: 1.3.6.1.2.1.2.2.1.2
type: DisplayString
- labels:
- ifIndex
labelname: ifName
oid: 1.3.6.1.2.1.31.1.1.1.1
type: DisplayString
enum_values:
1: other
2: regular1822
3: hdh1822
4: ddnX25
5: rfc877x25
6: ethernetCsmacd
7: iso88023Csmacd
8: iso88024TokenBus
9: iso88025TokenRing
10: iso88026Man
11: starLan
12: proteon10Mbit
13: proteon80Mbit
14: hyperchannel
15: fddi
16: lapb
17: sdlc
18: ds1
19: e1
20: basicISDN
21: primaryISDN
22: propPointToPointSerial
23: ppp
24: softwareLoopback
25: eon
26: ethernet3Mbit
27: nsip
28: slip
29: ultra
30: ds3
31: sip
32: frameRelay
33: rs232
34: para
35: arcnet
36: arcnetPlus
37: atm
38: miox25
39: sonet
40: x25ple
41: iso88022llc
42: localTalk
43: smdsDxi
44: frameRelayService
45: v35
46: hssi
47: hippi
48: modem
49: aal5
50: sonetPath
51: sonetVT
52: smdsIcip
53: propVirtual
54: propMultiplexor
55: ieee80212
56: fibreChannel
57: hippiInterface
58: frameRelayInterconnect
59: aflane8023
60: aflane8025
61: cctEmul
62: fastEther
63: isdn
64: v11
65: v36
66: g703at64k
67: g703at2mb
68: qllc
69: fastEtherFX
70: channel
71: ieee80211
72: ibm370parChan
73: escon
74: dlsw
75: isdns
76: isdnu
77: lapd
78: ipSwitch
79: rsrb
80: atmLogical
81: ds0
82: ds0Bundle
83: bsc
84: async
85: cnr
86: iso88025Dtr
87: eplrs
88: arap
89: propCnls
90: hostPad
91: termPad
92: frameRelayMPI
93: x213
94: adsl
95: radsl
96: sdsl
97: vdsl
98: iso88025CRFPInt
99: myrinet
100: voiceEM
101: voiceFXO
102: voiceFXS
103: voiceEncap
104: voiceOverIp
105: atmDxi
106: atmFuni
107: atmIma
108: pppMultilinkBundle
109: ipOverCdlc
110: ipOverClaw
111: stackToStack
112: virtualIpAddress
113: mpc
114: ipOverAtm
115: iso88025Fiber
116: tdlc
117: gigabitEthernet
118: hdlc
119: lapf
120: v37
121: x25mlp
122: x25huntGroup
123: transpHdlc
124: interleave
125: fast
126: ip
127: docsCableMaclayer
128: docsCableDownstream
129: docsCableUpstream
130: a12MppSwitch
131: tunnel
132: coffee
133: ces
134: atmSubInterface
135: l2vlan
136: l3ipvlan
137: l3ipxvlan
138: digitalPowerline
139: mediaMailOverIp
140: dtm
141: dcn
142: ipForward
143: msdsl
144: ieee1394
145: if-gsn
146: dvbRccMacLayer
147: dvbRccDownstream
148: dvbRccUpstream
149: atmVirtual
150: mplsTunnel
151: srp
152: voiceOverAtm
153: voiceOverFrameRelay
154: idsl
155: compositeLink
156: ss7SigLink
157: propWirelessP2P
158: frForward
159: rfc1483
160: usb
161: ieee8023adLag
162: bgppolicyaccounting
163: frf16MfrBundle
164: h323Gatekeeper
165: h323Proxy
166: mpls
167: mfSigLink
168: hdsl2
169: shdsl
170: ds1FDL
171: pos
172: dvbAsiIn
173: dvbAsiOut
174: plc
175: nfas
176: tr008
177: gr303RDT
178: gr303IDT
179: isup
180: propDocsWirelessMaclayer
181: propDocsWirelessDownstream
182: propDocsWirelessUpstream
183: hiperlan2
184: propBWAp2Mp
185: sonetOverheadChannel
186: digitalWrapperOverheadChannel
187: aal2
188: radioMAC
189: atmRadio
190: imt
191: mvl
192: reachDSL
193: frDlciEndPt
194: atmVciEndPt
195: opticalChannel
196: opticalTransport
197: propAtm
198: voiceOverCable
199: infiniband
200: teLink
201: q2931
202: virtualTg
203: sipTg
204: sipSig
205: docsCableUpstreamChannel
206: econet
207: pon155
208: pon622
209: bridge
210: linegroup
211: voiceEMFGD
212: voiceFGDEANA
213: voiceDID
214: mpegTransport
215: sixToFour
216: gtp
217: pdnEtherLoop1
218: pdnEtherLoop2
219: opticalChannelGroup
220: homepna
221: gfp
222: ciscoISLvlan
223: actelisMetaLOOP
224: fcipLink
225: rpr
226: qam
227: lmp
228: cblVectaStar
229: docsCableMCmtsDownstream
230: adsl2
231: macSecControlledIF
232: macSecUncontrolledIF
233: aviciOpticalEther
234: atmbond
235: voiceFGDOS
236: mocaVersion1
237: ieee80216WMAN
238: adsl2plus
239: dvbRcsMacLayer
240: dvbTdm
241: dvbRcsTdma
242: x86Laps
243: wwanPP
244: wwanPP2
245: voiceEBS
246: ifPwType
247: ilan
248: pip
249: aluELP
250: gpon
251: vdsl2
252: capwapDot11Profile
253: capwapDot11Bss
254: capwapWtpVirtualRadio
255: bits
256: docsCableUpstreamRfPort
257: cableDownstreamRfPort
258: vmwareVirtualNic
259: ieee802154
260: otnOdu
261: otnOtu
262: ifVfiType
263: g9981
264: g9982
265: g9983
266: aluEpon
267: aluEponOnu
268: aluEponPhysicalUni
269: aluEponLogicalLink
270: aluGponOnu
271: aluGponPhysicalUni
272: vmwareNicTeam
277: docsOfdmDownstream
278: docsOfdmaUpstream
279: gfast
280: sdci
281: xboxWireless
282: fastdsl
283: docsCableScte55d1FwdOob
284: docsCableScte55d1RetOob
285: docsCableScte55d2DsOob
286: docsCableScte55d2UsOob
287: docsCableNdf
288: docsCableNdr
289: ptm
290: ghn
291: otnOtsi
292: otnOtuc
293: otnOduc
294: otnOtsig
295: microwaveCarrierTermination
296: microwaveRadioLinkTerminal
297: ieee8021axDrni
298: ax25
299: ieee19061nanocom
300: cpri
301: omni
302: roe
303: p2pOverLan
- name: ifPhysAddress
oid: 1.3.6.1.2.1.2.2.1.6
type: PhysAddress48
help: The interface's address at its protocol sub-layer - 1.3.6.1.2.1.2.2.1.6
indexes:
- labelname: ifIndex
type: gauge
lookups:
- labels:
- ifIndex
labelname: ifAlias
oid: 1.3.6.1.2.1.31.1.1.1.18
type: DisplayString
- labels:
- ifIndex
labelname: ifDescr
oid: 1.3.6.1.2.1.2.2.1.2
type: DisplayString
- labels:
- ifIndex
labelname: ifName
oid: 1.3.6.1.2.1.31.1.1.1.1
type: DisplayString
- name: ifAdminStatus
oid: 1.3.6.1.2.1.2.2.1.7
type: EnumAsStateSet
help: The desired state of the interface - 1.3.6.1.2.1.2.2.1.7
indexes:
- labelname: ifIndex
type: gauge
lookups:
- labels:
- ifIndex
labelname: ifAlias
oid: 1.3.6.1.2.1.31.1.1.1.18
type: DisplayString
- labels:
- ifIndex
labelname: ifDescr
oid: 1.3.6.1.2.1.2.2.1.2
type: DisplayString
- labels:
- ifIndex
labelname: ifName
oid: 1.3.6.1.2.1.31.1.1.1.1
type: DisplayString
enum_values:
1: up
2: down
3: testing
- name: ifOperStatus
oid: 1.3.6.1.2.1.2.2.1.8
type: EnumAsStateSet
help: The current operational state of the interface - 1.3.6.1.2.1.2.2.1.8
indexes:
- labelname: ifIndex
type: gauge
lookups:
- labels:
- ifIndex
labelname: ifAlias
oid: 1.3.6.1.2.1.31.1.1.1.18
type: DisplayString
- labels:
- ifIndex
labelname: ifDescr
oid: 1.3.6.1.2.1.2.2.1.2
type: DisplayString
- labels:
- ifIndex
labelname: ifName
oid: 1.3.6.1.2.1.31.1.1.1.1
type: DisplayString
enum_values:
1: up
2: down
3: testing
4: unknown
5: dormant
6: notPresent
7: lowerLayerDown
- name: ifLastChange
oid: 1.3.6.1.2.1.2.2.1.9
type: gauge
help: The value of sysUpTime at the time the interface entered its current operational
state - 1.3.6.1.2.1.2.2.1.9
indexes:
- labelname: ifIndex
type: gauge
lookups:
- labels:
- ifIndex
labelname: ifAlias
oid: 1.3.6.1.2.1.31.1.1.1.18
type: DisplayString
- labels:
- ifIndex
labelname: ifDescr
oid: 1.3.6.1.2.1.2.2.1.2
type: DisplayString
- labels:
- ifIndex
labelname: ifName
oid: 1.3.6.1.2.1.31.1.1.1.1
type: DisplayString
- name: ifInOctets
oid: 1.3.6.1.2.1.2.2.1.10
type: counter
help: The total number of octets received on the interface, including framing
characters - 1.3.6.1.2.1.2.2.1.10
indexes:
- labelname: ifIndex
type: gauge
lookups:
- labels:
- ifIndex
labelname: ifAlias
oid: 1.3.6.1.2.1.31.1.1.1.18
type: DisplayString
- labels:
- ifIndex
labelname: ifDescr
oid: 1.3.6.1.2.1.2.2.1.2
type: DisplayString
- labels:
- ifIndex
labelname: ifName
oid: 1.3.6.1.2.1.31.1.1.1.1
type: DisplayString
- name: ifInDiscards
oid: 1.3.6.1.2.1.2.2.1.13
type: counter
help: The number of inbound packets which were chosen to be discarded even though
no errors had been detected to prevent their being deliverable to a higher-layer
protocol - 1.3.6.1.2.1.2.2.1.13
indexes:
- labelname: ifIndex
type: gauge
lookups:
- labels:
- ifIndex
labelname: ifAlias
oid: 1.3.6.1.2.1.31.1.1.1.18
type: DisplayString
- labels:
- ifIndex
labelname: ifDescr
oid: 1.3.6.1.2.1.2.2.1.2
type: DisplayString
- labels:
- ifIndex
labelname: ifName
oid: 1.3.6.1.2.1.31.1.1.1.1
type: DisplayString
- name: ifInErrors
oid: 1.3.6.1.2.1.2.2.1.14
type: counter
help: For packet-oriented interfaces, the number of inbound packets that contained
errors preventing them from being deliverable to a higher-layer protocol -
1.3.6.1.2.1.2.2.1.14
indexes:
- labelname: ifIndex
type: gauge
lookups:
- labels:
- ifIndex
labelname: ifAlias
oid: 1.3.6.1.2.1.31.1.1.1.18
type: DisplayString
- labels:
- ifIndex
labelname: ifDescr
oid: 1.3.6.1.2.1.2.2.1.2
type: DisplayString
- labels:
- ifIndex
labelname: ifName
oid: 1.3.6.1.2.1.31.1.1.1.1
type: DisplayString
- name: ifOutOctets
oid: 1.3.6.1.2.1.2.2.1.16
type: counter
help: The total number of octets transmitted out of the interface, including
framing characters - 1.3.6.1.2.1.2.2.1.16
indexes:
- labelname: ifIndex
type: gauge
lookups:
- labels:
- ifIndex
labelname: ifAlias
oid: 1.3.6.1.2.1.31.1.1.1.18
type: DisplayString
- labels:
- ifIndex
labelname: ifDescr
oid: 1.3.6.1.2.1.2.2.1.2
type: DisplayString
- labels:
- ifIndex
labelname: ifName
oid: 1.3.6.1.2.1.31.1.1.1.1
type: DisplayString
- name: ifOutDiscards
oid: 1.3.6.1.2.1.2.2.1.19
type: counter
help: The number of outbound packets which were chosen to be discarded even
though no errors had been detected to prevent their being transmitted - 1.3.6.1.2.1.2.2.1.19
indexes:
- labelname: ifIndex
type: gauge
lookups:
- labels:
- ifIndex
labelname: ifAlias
oid: 1.3.6.1.2.1.31.1.1.1.18
type: DisplayString
- labels:
- ifIndex
labelname: ifDescr
oid: 1.3.6.1.2.1.2.2.1.2
type: DisplayString
- labels:
- ifIndex
labelname: ifName
oid: 1.3.6.1.2.1.31.1.1.1.1
type: DisplayString
- name: ifOutErrors
oid: 1.3.6.1.2.1.2.2.1.20
type: counter
help: For packet-oriented interfaces, the number of outbound packets that could
not be transmitted because of errors - 1.3.6.1.2.1.2.2.1.20
indexes:
- labelname: ifIndex
type: gauge
lookups:
- labels:
- ifIndex
labelname: ifAlias
oid: 1.3.6.1.2.1.31.1.1.1.18
type: DisplayString
- labels:
- ifIndex
labelname: ifDescr
oid: 1.3.6.1.2.1.2.2.1.2
type: DisplayString
- labels:
- ifIndex
labelname: ifName
oid: 1.3.6.1.2.1.31.1.1.1.1
type: DisplayString
- name: ifOutQLen
oid: 1.3.6.1.2.1.2.2.1.21
type: gauge
help: The length of the output packet queue (in packets). - 1.3.6.1.2.1.2.2.1.21
indexes:
- labelname: ifIndex
type: gauge
lookups:
- labels:
- ifIndex
labelname: ifAlias
oid: 1.3.6.1.2.1.31.1.1.1.18
type: DisplayString
- labels:
- ifIndex
labelname: ifDescr
oid: 1.3.6.1.2.1.2.2.1.2
type: DisplayString
- labels:
- ifIndex
labelname: ifName
oid: 1.3.6.1.2.1.31.1.1.1.1
type: DisplayString
- name: ifConnectorPresent
oid: 1.3.6.1.2.1.31.1.1.1.17
type: EnumAsStateSet
help: This object has the value 'true(1)' if the interface sublayer has a physical
connector and the value 'false(2)' otherwise. - 1.3.6.1.2.1.31.1.1.1.17
indexes:
- labelname: ifIndex
type: gauge
lookups:
- labels:
- ifIndex
labelname: ifAlias
oid: 1.3.6.1.2.1.31.1.1.1.18
type: DisplayString
- labels:
- ifIndex
labelname: ifDescr
oid: 1.3.6.1.2.1.2.2.1.2
type: DisplayString
- labels:
- ifIndex
labelname: ifName
oid: 1.3.6.1.2.1.31.1.1.1.1
type: DisplayString
enum_values:
1: "true"
2: "false"
- name: ifCounterDiscontinuityTime
oid: 1.3.6.1.2.1.31.1.1.1.19
type: gauge
help: The value of sysUpTime on the most recent occasion at which any one or
more of this interface's counters suffered a discontinuity - 1.3.6.1.2.1.31.1.1.1.19
indexes:
- labelname: ifIndex
type: gauge
lookups:
- labels:
- ifIndex
labelname: ifAlias
oid: 1.3.6.1.2.1.31.1.1.1.18
type: DisplayString
- labels:
- ifIndex
labelname: ifDescr
oid: 1.3.6.1.2.1.2.2.1.2
type: DisplayString
- labels:
- ifIndex
labelname: ifName
oid: 1.3.6.1.2.1.31.1.1.1.1
type: DisplayString
We would need to have retry callback hooks in the SNMP library in order to get more per walk debugging. It may be possible, I would have to look at the library code.
We would need to have retry callback hooks in the SNMP library in order to get more per walk debugging. It may be possible, I would have to look at the library code.
if this is possible, it would be great, I'm not sure how else I could easily determine which oid was failing. I may try to decode some of the snmp3 packets to try to see what was requested twice.
Can you give this PR a try? https://github.com/prometheus/snmp_exporter/pull/1157
You can find the output build artifacts in CircleCI.
Can you give this PR a try? #1157
You can find the output build artifacts in CircleCI.
Thank you very much, I was able to build that branch with "go build" and can run the output with "./snmp_exporter --snmp.debug-packets --log.level=debug".
I've been running this build on my desktop for a while, which is accessing that problem device over a VPN, and in 10 minutes I haven't been able to record a duration greater than 600ms in the last 10 minutes. I have curls running with "watch -n 60" so it does one query a minute.
Do you know what I could filter the output by to only see that particular new debug message? That may make it easier to catch this issue with this improved debugging.
I would look for Retry number
and request timeout
in the logs.
For whatever reason I'm not able to reproduce this on my desktop:
> grep "Retry number" log.txt
> grep "request timeout" log.txt
> grep "Finished scrape" log.txt
ts=2024-04-12T16:22:10.522Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.647985982
ts=2024-04-12T16:22:26.162Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.634859661
ts=2024-04-12T16:22:41.775Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.607646046
ts=2024-04-12T16:22:57.411Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.630984308
ts=2024-04-12T16:23:13.042Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.626529731
ts=2024-04-12T16:23:28.661Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.613124682
ts=2024-04-12T16:23:44.301Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.634122488
ts=2024-04-12T16:23:59.934Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.627908045
ts=2024-04-12T16:24:15.549Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.610465316
ts=2024-04-12T16:24:31.166Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.611277663
ts=2024-04-12T16:24:46.799Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.628145301
ts=2024-04-12T16:25:02.422Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.61721881
ts=2024-04-12T16:25:18.060Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.632497305
ts=2024-04-12T16:25:33.698Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.633533768
ts=2024-04-12T16:25:49.326Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.622159064
ts=2024-04-12T16:26:04.950Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.618739644
ts=2024-04-12T16:26:20.579Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.623703638
ts=2024-04-12T16:26:36.203Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.618068765
ts=2024-04-12T16:26:51.821Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.613324038
ts=2024-04-12T16:27:07.453Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.625984028
ts=2024-04-12T16:27:23.092Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.633752763
ts=2024-04-12T16:27:38.735Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.638352054
ts=2024-04-12T16:27:54.368Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.626199045
ts=2024-04-12T16:28:10.009Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.636024974
ts=2024-04-12T16:28:25.630Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.615045888
ts=2024-04-12T16:28:41.265Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.630309173
ts=2024-04-12T16:28:56.892Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.621721412
ts=2024-04-12T16:29:12.546Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.64898922
ts=2024-04-12T16:29:28.179Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.627764007
ts=2024-04-12T16:29:43.820Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.635411245
ts=2024-04-12T16:29:59.454Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.627675049
ts=2024-04-12T16:30:15.075Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.616801067
ts=2024-04-12T16:30:30.718Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.63589801
ts=2024-04-12T16:30:46.359Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.635852961
ts=2024-04-12T16:31:02.014Z caller=collector.go:448 level=debug auth=netgear_switch target=172.16.99.2 worker=0 module=if_mib msg="Finished scrape" duration_seconds=0.649401456
I'm still observing that behavior on the existing docker container. I'll try to run this patched code directly on that server next.
Yea, I typically don't recommend running SNMP over VPN/WAN links. SNMP, being UDP-based, is very sensitive to packet loss and latency.
My standard recommendation is to run the snmp_exporter as close to the target devices as possible.
Yea, I typically don't recommend running SNMP over VPN/WAN links. SNMP, being UDP-based, is very sensitive to packet loss and latency.
My standard recommendation is to run the snmp_exporter as close to the target devices as possible.
The server with the issues is as close as it can be, it's under a router and then that router is attached to this switch.
What is strange is that I'm not experiencing the issue over a VPN.
The topology where the problem is observed: Docker container -> core router -> switch (snmp) The docker host is not NATing or anything, the core router is aware of the docker subnets on that host.
The topology where no issue is observed: My desktop -> vpn -> management router -> switch (snmp)
A different router is being used, but I've checked the core router and it doesn't report any dropped/error packets.
snmp_exporter version: output of
snmp_exporter -version
snmp_exporter | ts=2024-04-11T15:09:38.160Z caller=main.go:194 level=info msg="Starting snmp_exporter" version="(version=0.25.0, branch=HEAD, revision=9c42d6c874d479314e612bca69558c81f8e26287)" concurrency=1 snmp_exporter | ts=2024-04-11T15:09:38.160Z caller=main.go:195 level=info build_context="(go=go1.21.5, platform=linux/amd64, user=root@880115266f70, date=20231210-10:05:18, tags=netgo)"
What device/snmpwalk OID are you using?
IF-MIB:
oid=1.3.6.1.2.1.2
If this is a new device, please link to the MIB(s).
It's a Netgear GS324TP S350
What did you do that produced an error?
Nothing different than what doesn't produce an error.
What did you expect to see?
What did you see instead?
When I check the response, I can see that it says it retied a single packet, but it's not clear what was retried. The logs don't make any mention of what failed. Similar behavior is being observed with a Lenovo Xclarity BMC, also happening every 5 minutes with a single retry.