Closed by shdxiang 4 years ago
That's a timeout; your device is presumably slow or far away from the exporter.
@brian-brazil They are on the same LAN; maybe the device sends its response too slowly.
@brian-brazil We could improve the error by checking if err == context.Canceled.
I have been troubleshooting this issue for several days and noticed that it is not a simple timeout. My steps are:
Use the following snmp.yml:

    qnap:
      get:
        - 1.3.6.1.4.1.24681.1.3.1.0
      version: 2
      auth:
        community: public
Start snmp_exporter on a Linux server (172.30.50.121) in the same LAN as the target QNAP (172.30.50.104):

    ./snmp_exporter --log.level=debug
Run a script that sends requests to snmp_exporter on 172.30.50.121 (exporter.sh):

    set -e
    while true; do
        curl -m 10 'http://localhost:9116/snmp?module=qnap&target=172.30.50.104'
        sleep 1
    done
Within minutes it reports a timeout, and snmp_exporter logs:
...
level=debug ts=2020-03-13T02:35:34.258Z caller=main.go:99 module=qnap target=172.30.50.104:161 msg="Starting scrape"
level=debug ts=2020-03-13T02:35:34.259Z caller=collector.go:132 module=qnap target=172.30.50.104:161 msg="Getting OIDs" oids=1
level=info ts=2020-03-13T02:36:34.259Z caller=collector.go:225 module=qnap target=172.30.50.104:161 msg="Error scraping target" err="scrape canceled (possible timeout) getting target 172.30.50.104"
level=debug ts=2020-03-13T02:36:34.259Z caller=main.go:110 module=qnap target=172.30.50.104:161 msg="Finished scrape" duration_seconds=60.000760423
Run a second script that uses snmpget, also on 172.30.50.121 (snmpget.sh):

    set -e
    while true; do
        snmpget -t 5 -v 2c -c public 172.30.50.104 1.3.6.1.4.1.24681.1.3.1.0
        sleep 1
    done

But this script never times out.
I captured the traffic with tcpdump and attached the captures; please help analyze them: tcpdump.zip.
Those look identical on the wire, so this is probably some networking issue on the box itself. I'd suggest checking your routing setup and which interfaces are being used.
@brian-brazil I am confused. If the NAS were not working properly, snmpget should time out too, right?
netsnmp does some non-standard network stuff iirc, something about not checking that packets are coming back from where you sent them to. So it may work in situations where your routing is actually broken.
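To illustrate the kind of difference described above, here is a minimal loopback sketch. It assumes, without confirmation from this thread, that one client uses a connected UDP socket (which the kernel restricts to datagrams from the dialed peer) and the other an unconnected one (which accepts datagrams from any source). All names (`startOddResponder`, `connectedSeesReply`, `unconnectedSeesReply`) are hypothetical, and the responder deliberately answers from a different port to simulate a reply that does not come back from where the request was sent.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

var loopback = net.IPv4(127, 0, 0, 1)

// startOddResponder listens on loopback and answers every datagram from a
// second socket, so the reply's source port differs from the queried one.
func startOddResponder() (*net.UDPAddr, func(), error) {
	in, err := net.ListenUDP("udp4", &net.UDPAddr{IP: loopback})
	if err != nil {
		return nil, nil, err
	}
	out, err := net.ListenUDP("udp4", &net.UDPAddr{IP: loopback})
	if err != nil {
		in.Close()
		return nil, nil, err
	}
	go func() {
		buf := make([]byte, 64)
		for {
			_, from, err := in.ReadFromUDP(buf)
			if err != nil {
				return // socket closed
			}
			out.WriteToUDP([]byte("reply"), from)
		}
	}()
	stop := func() { in.Close(); out.Close() }
	return in.LocalAddr().(*net.UDPAddr), stop, nil
}

// waitReply reports whether any datagram arrives within a short deadline.
func waitReply(conn *net.UDPConn) bool {
	conn.SetReadDeadline(time.Now().Add(300 * time.Millisecond))
	buf := make([]byte, 64)
	_, _, err := conn.ReadFromUDP(buf)
	return err == nil
}

// connectedSeesReply uses a connected socket, which only accepts
// datagrams whose source matches the dialed address and port.
func connectedSeesReply(server *net.UDPAddr) bool {
	conn, err := net.DialUDP("udp4", nil, server)
	if err != nil {
		return false
	}
	defer conn.Close()
	if _, err := conn.Write([]byte("get")); err != nil {
		return false
	}
	return waitReply(conn)
}

// unconnectedSeesReply uses an unconnected socket, which accepts
// datagrams from any source address.
func unconnectedSeesReply(server *net.UDPAddr) bool {
	conn, err := net.ListenUDP("udp4", &net.UDPAddr{IP: loopback})
	if err != nil {
		return false
	}
	defer conn.Close()
	if _, err := conn.WriteToUDP([]byte("get"), server); err != nil {
		return false
	}
	return waitReply(conn)
}

func main() {
	server, stop, err := startOddResponder()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	defer stop()
	fmt.Println("connected socket saw reply:  ", connectedSeesReply(server))
	fmt.Println("unconnected socket saw reply:", unconnectedSeesReply(server))
}
```

On a typical Linux host the connected client should time out while the unconnected one receives the reply. This only demonstrates the socket-level mechanism; it does not establish how snmpget or snmp_exporter actually open their sockets.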
Actually the test server and the QNAP NAS are connected to the same switch, so I don't think there should be a routing issue.
I'd suggest using strace then to see if the packets are making it to the snmp exporter.
I ran tcpdump on the NAS itself and it captured the response, but tcpdump on the Linux server that runs snmp_exporter did not see that response, so this should be a network issue. Thanks.
Host operating system: output of uname -a
Docker container: Linux d8827dda937a 4.19.95-flatcar #1 SMP Sat Feb 8 07:25:12 -00 2020 x86_64 GNU/Linux
snmp_exporter version: output of snmp_exporter -version
version=0.17.0, branch=HEAD, revision=f0ad4551a5c2023e383bc8dde2222f47dc760b83 f0ad4551a5c2023e383bc8dde2222f47dc760b83
What device/snmpwalk OID are you using?
generator.yml
If this is a new device, please link to the MIB(s).
Downloaded from the QNAP itself.
What did you do that produced an error?
I can get the metrics, but sometimes Prometheus says that the target is unreachable, and there are several errors in the snmp-exporter logs.
What did you expect to see?
No error in logs.
What did you see instead?