Closed beparmentier closed 2 years ago
Hi, I am also getting issues with SNMP parsing after updating Telegraf to 1.21.1 on Ubuntu 20.04:
Dec 24 16:59:03 snmpdb telegraf[173]: 2021-12-24T15:59:03Z I! [agent] Config: Interval:1m0s, Quiet:false, Hostname:"snmpdb", Flush Interval:10s
Dec 24 16:59:03 snmpdb telegraf[173]: panic: runtime error: invalid memory address or nil pointer dereference
Dec 24 16:59:03 snmpdb telegraf[173]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x2c0a113]
Dec 24 16:59:03 snmpdb telegraf[173]: goroutine 1 [running]:
Dec 24 16:59:03 snmpdb telegraf[173]: github.com/influxdata/telegraf/internal/snmp.LoadMibsFromPath.func1({0x4deff4e, 0x14}, {0x0, 0x0}, {0x0, 0x2c09920})
Dec 24 16:59:03 snmpdb telegraf[173]: #011/go/src/github.com/influxdata/telegraf/internal/snmp/translate.go:60 +0x53
Dec 24 16:59:03 snmpdb telegraf[173]: path/filepath.Walk({0x4deff4e, 0x14}, 0xc000bed640)
Dec 24 16:59:03 snmpdb telegraf[173]: #011/usr/local/go/src/path/filepath/path.go:503 +0x50
Dec 24 16:59:03 snmpdb telegraf[173]: github.com/influxdata/telegraf/internal/snmp.LoadMibsFromPath({0xc000d34260, 0x1, 0xc000bed6c8}, {0x584e030, 0xc000d416e0})
Dec 24 16:59:03 snmpdb telegraf[173]: #011/go/src/github.com/influxdata/telegraf/internal/snmp/translate.go:57 +0x29a
Dec 24 16:59:03 snmpdb telegraf[173]: github.com/influxdata/telegraf/plugins/inputs/snmp.(*Snmp).Init(0xc00033edc0)
Dec 24 16:59:03 snmpdb telegraf[173]: #011/go/src/github.com/influxdata/telegraf/plugins/inputs/snmp/snmp.go:102 +0x50
Dec 24 16:59:03 snmpdb telegraf[173]: github.com/influxdata/telegraf/models.(*RunningInput).Init(0xc000bed758)
Dec 24 16:59:03 snmpdb telegraf[173]: #011/go/src/github.com/influxdata/telegraf/models/running_input.go:82 +0x35
Dec 24 16:59:03 snmpdb telegraf[173]: github.com/influxdata/telegraf/agent.(*Agent).initPlugins(0xc0001113a0)
Dec 24 16:59:03 snmpdb telegraf[173]: #011/go/src/github.com/influxdata/telegraf/agent/agent.go:189 +0x96
Dec 24 16:59:03 snmpdb telegraf[173]: github.com/influxdata/telegraf/agent.(*Agent).Run(0xc0001113a0, {0x57e5688, 0xc0002ca680})
Dec 24 16:59:03 snmpdb telegraf[173]: #011/go/src/github.com/influxdata/telegraf/agent/agent.go:105 +0x185
Dec 24 16:59:03 snmpdb systemd[1]: telegraf.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Dec 24 16:59:03 snmpdb systemd[1]: telegraf.service: Failed to kill control group /system.slice/telegraf.service, ignoring: Input/output error
Dec 24 16:59:03 snmpdb systemd[1]: message repeated 3 times: [ telegraf.service: Failed to kill control group /system.slice/telegraf.service, ignoring: Input/output error]
Dec 24 16:59:03 snmpdb systemd[1]: telegraf.service: Failed with result 'exit-code'.
Dec 24 16:59:03 snmpdb systemd[1]: telegraf.service: Scheduled restart job, restart counter is at 4.
Dec 24 16:59:03 snmpdb systemd[1]: Stopped The plugin-driven server agent for reporting metrics into InfluxDB.
Dec 24 16:59:03 snmpdb systemd[1]: Failed to attach 184 to compat systemd cgroup /system.slice/telegraf.service: No such file or directory
Dec 24 16:59:03 snmpdb systemd[1]: Started The plugin-driven server agent for reporting metrics into InfluxDB.
Dec 24 16:59:03 snmpdb systemd[184]: Failed to attach 184 to compat systemd cgroup /system.slice/telegraf.service: No such file or directory
Dec 24 16:59:03 snmpdb influxd-systemd-start.sh[80]: InfluxDB API unavailable after 2 attempts...
It would be great if Telegraf could parse the MIBs you get when you download the operating system's SNMP package.
Downgrading to Telegraf 1.20.4 (git: HEAD 34ad5aa1) solves the issue for me.
I also downgraded telegraf to version 1.20.4 and it solves the issue.
Same here. Downgrade temporarily fixes the problem.
However, the issue seems to come from the fact that on Ubuntu the default path is set to ["/usr/share/snmp/mibs"], which does not exist. This causes the application to crash.
To fix it:
with snmp-mibs-downloader, set path = ["/var/lib/snmp/mibs/"]
or set path = []
To me, the real bug is this one:
Steps to reproduce:
path config
Expected behavior:
Observed behaviour:
Since https://github.com/influxdata/telegraf/commit/e00147ded38af30f945935ad626aa8be71700b6d, telegraf doesn't panic if the default path doesn't exist, but it will exit with an error.
https://github.com/influxdata/telegraf/pull/10354 makes it log a warning and continue. It also adds a test which would have caught the issue.
I haven't had a panic. I just wish telegraf parsed MIBs it used to parse back on 1.20.
That's right, I was misled by a comment showing the panic. I created a new issue for the panic (now just an error, since https://github.com/influxdata/telegraf/commit/e00147ded38af30f945935ad626aa8be71700b6d): https://github.com/influxdata/telegraf/issues/10355. I also linked the PR to the new issue instead of this one.
telegraf 1.21.2 released which should fix this issue... testing it asap... https://github.com/influxdata/telegraf/releases/tag/v1.21.2
...after updating to 1.21.2 I am getting another error:
2022-01-05T21:02:06Z E! [telegraf] Error running agent: could not initialize input inputs.snmp: Filepath could not be walked: no mibs found
and telegraf is crashing.
Downgrading to 1.20.4 solves the issue again.
@jostrasser can you please file a new bug with all the required information and I'm sure someone can look at it.
sure, I'll open a new one. Thanks for your support.
Hi @powersj, I opened this new issue: https://github.com/influxdata/telegraf/issues/10387
Thanks!
Hello,
I am just getting some warnings after updating to version 1.21.2:
Jan 7 10:18:48 odin telegraf[256859]: 2022-01-07T09:18:48Z I! Starting Telegraf 1.21.2
Jan 7 10:18:56 odin telegraf[256859]: Parse module: /var/lib/snmp/mibs/ietf/DPI20-MIB:9:4: unexpected "ibm" (expected ";")
Jan 7 10:18:59 odin telegraf[256859]: Parse module: /var/lib/snmp/mibs/ietf/HPR-MIB:494:30: unexpected "HprRtpCounter" (expected "}")
Jan 7 10:19:07 odin telegraf[256859]: Parse module: /var/lib/snmp/mibs/ietf/SNMPv2-PDU:73:1: unexpected "max-bindings" (expected "END")
Jan 7 10:19:08 odin telegraf[256859]: Parse module: /var/lib/snmp/mibs/ietf/TCPIPX-MIB:63:12: unexpected "tcpIpxConnLocalPort" (expected "}")
2022-01-07T09:18:56Z W! [inputs.snmp] module DPI20-MIB could not be loaded
2022-01-07T09:18:59Z W! [inputs.snmp] module HPR-MIB could not be loaded
2022-01-07T09:19:07Z W! [inputs.snmp] module SNMPv2-PDU could not be loaded
2022-01-07T09:19:08Z W! [inputs.snmp] module TCPIPX-MIB could not be loaded
But without panic and my snmp inputs now work fine with the latest version.
Thanks
Thank you for the update! This is an issue with the syntax of these MIBs. It is being discussed with the maintainer of gosmi in this issue. Hope this clears up any confusion :)
Do you really expect everyone to start debugging MIB files? If so, the barrier to entry for InfluxDB and Telegraf just got way higher.
@beparmentier can you please try the nightly build to see if your problem still exists?
@MyaLongmire hello, I installed the nightly build and I don't get Parse module errors anymore
root@odin:~# telegraf --version
Telegraf 1.22.0-0a379a5c (git: master 0a379a5c)
I still get warnings in telegraf log about mibs
2022-01-12T08:15:39Z W! [inputs.snmp] module DPI20-MIB could not be loaded
2022-01-12T08:15:42Z W! [inputs.snmp] module HPR-MIB could not be loaded
2022-01-12T08:15:51Z W! [inputs.snmp] module SNMPv2-PDU could not be loaded
2022-01-12T08:15:53Z W! [inputs.snmp] module TCPIPX-MIB could not be loaded
But it's not a problem for me because I don't use them. ietf/rfc still provides buggy MIB files, but telegraf now handles them properly, so all we can do is ask them to fix their files.
To me the telegraf issue is now solved.
Thanks !
@beparmentier thank you for your response! Since you filed the issue and it is now fixed, I am going to close this. If anyone else is still having problems, open another ticket and someone on the team will assist you :)
Relevant telegraf.conf
Logs from Telegraf
Dec 24 11:56:37 odin telegraf[85513]: 2021-12-24T10:56:37Z I! Starting Telegraf 1.21.1
Dec 24 11:56:44 odin systemd[1]: Stopping The plugin-driven server agent for reporting metrics into InfluxDB...
Dec 24 11:56:45 odin telegraf[85513]: Parse module: /var/lib/snmp/mibs/ietf/DPI20-MIB:9:4: unexpected "ibm" (expected ";")
Dec 24 11:56:48 odin telegraf[85513]: Parse module: /var/lib/snmp/mibs/ietf/HPR-MIB:494:30: unexpected "HprRtpCounter" (expected "}")
Dec 24 11:56:56 odin telegraf[85513]: Parse module: /var/lib/snmp/mibs/ietf/SNMPv2-PDU:73:1: unexpected "max-bindings" (expected "END")
Dec 24 11:56:58 odin telegraf[85513]: Parse module: /var/lib/snmp/mibs/ietf/TCPIPX-MIB:63:12: unexpected "tcpIpxConnLocalPort" (expected "}")
Dec 24 11:56:59 odin telegraf[85513]: panic: strconv.ParseUint: parsing "": invalid syntax
Dec 24 11:56:59 odin telegraf[85513]: goroutine 1 [running]:
Dec 24 11:56:59 odin telegraf[85513]: github.com/sleepinggenius2/gosmi/types.OidMustFromString(...)
Dec 24 11:56:59 odin telegraf[85513]: #011/go/pkg/mod/github.com/sleepinggenius2/gosmi@v0.4.3/types/oid.go:91
Dec 24 11:56:59 odin telegraf[85513]: github.com/influxdata/telegraf/internal/snmp.GetIndex({0x4fab394, 0x1}, {0x40046ad128, 0xe})
Dec 24 11:56:59 odin telegraf[85513]: #011/go/src/github.com/influxdata/telegraf/internal/snmp/translate.go:126 +0x348
Dec 24 11:56:59 odin telegraf[85513]: github.com/influxdata/telegraf/plugins/inputs/snmp.snmpTableCall({0x4000bdd821, 0x15})
Dec 24 11:56:59 odin telegraf[85513]: #011/go/src/github.com/influxdata/telegraf/plugins/inputs/snmp/snmp.go:837 +0xcc
Dec 24 11:56:59 odin telegraf[85513]: github.com/influxdata/telegraf/plugins/inputs/snmp.snmpTable({0x4000bdd821, 0x15})
Dec 24 11:56:59 odin telegraf[85513]: #011/go/src/github.com/influxdata/telegraf/plugins/inputs/snmp/snmp.go:820 +0x1b0
Dec 24 11:56:59 odin telegraf[85513]: github.com/influxdata/telegraf/plugins/inputs/snmp.(*Table).initBuild(0x4000592860)
Dec 24 11:56:59 odin telegraf[85513]: #011/go/src/github.com/influxdata/telegraf/plugins/inputs/snmp/snmp.go:192 +0x40
Dec 24 11:56:59 odin telegraf[85513]: github.com/influxdata/telegraf/plugins/inputs/snmp.(*Table).Init(0x4000592860)
Dec 24 11:56:59 odin telegraf[85513]: #011/go/src/github.com/influxdata/telegraf/plugins/inputs/snmp/snmp.go:162 +0x7c
Dec 24 11:56:59 odin telegraf[85513]: github.com/influxdata/telegraf/plugins/inputs/snmp.(*Snmp).Init(0x4000023600)
Dec 24 11:56:59 odin telegraf[85513]: #011/go/src/github.com/influxdata/telegraf/plugins/inputs/snmp/snmp.go:110 +0x118
Dec 24 11:56:59 odin telegraf[85513]: github.com/influxdata/telegraf/models.(*RunningInput).Init(0x4000bf3090)
Dec 24 11:56:59 odin telegraf[85513]: #011/go/src/github.com/influxdata/telegraf/models/running_input.go:82 +0x58
Dec 24 11:56:59 odin telegraf[85513]: github.com/influxdata/telegraf/agent.(*Agent).initPlugins(0x400000e608)
Dec 24 11:56:59 odin telegraf[85513]: #011/go/src/github.com/influxdata/telegraf/agent/agent.go:189 +0x74
Dec 24 11:56:59 odin telegraf[85513]: github.com/influxdata/telegraf/agent.(*Agent).Run(0x400000e608, {0x50f04e8, 0x400078e040})
Dec 24 11:56:59 odin telegraf[85513]: #011/go/src/github.com/influxdata/telegraf/agent/agent.go:105 +0x140
Dec 24 11:56:59 odin telegraf[85513]: main.runAgent({0x50f04e8, 0x400078e040}, {0x7b610f8, 0x0, 0x0}, {0x7b610f8, 0x0, 0x0})
Dec 24 11:56:59 odin telegraf[85513]: #011/go/src/github.com/influxdata/telegraf/cmd/telegraf/telegraf.go:312 +0xcf0
Dec 24 11:56:59 odin telegraf[85513]: main.reloadLoop({0x7b610f8, 0x0, 0x0}, {0x7b610f8, 0x0, 0x0})
Dec 24 11:56:59 odin telegraf[85513]: #011/go/src/github.com/influxdata/telegraf/cmd/telegraf/telegraf.go:147 +0x220
Dec 24 11:56:59 odin telegraf[85513]: main.run(...)
Dec 24 11:56:59 odin telegraf[85513]: #011/go/src/github.com/influxdata/telegraf/cmd/telegraf/telegraf_posix.go:8
Dec 24 11:56:59 odin telegraf[85513]: main.main()
Dec 24 11:56:59 odin telegraf[85513]: #011/go/src/github.com/influxdata/telegraf/cmd/telegraf/telegraf.go:485 +0xcb8
Dec 24 11:56:59 odin systemd[1]: telegraf.service: Main process exited, code=exited, status=2/INVALIDARGUMENT
Dec 24 11:56:59 odin systemd[1]: telegraf.service: Failed with result 'exit-code'.
System info
Telegraf 1.21.1, Ubuntu 20.04
Docker
No response
Steps to reproduce
Just use the configuration I provide.
Expected behavior
Collect snmp metrics
Actual behavior
Telegraf panics
Additional info
I have used this configuration for a few months without any problem. The problem started when I updated telegraf from version 1.20.4 to version 1.21.1.
I tried to update the MIB files using download-mibs from the snmp-mibs-downloader package, but without any success.