Open ADovgalyuk opened 3 years ago
We don't have a way to reproduce this error because we don't have your snmp device, but I can share a few ideas about what might be related.
It's possible that building with a new go version could cause this change but I think it's very unlikely. I would expect a compiler problem to affect more than just snmp walk.
In the error log line, everything after performing bulk walk for field CA:
comes from the gosnmp module That leads me to think that this could be the result of a change in the gosnmp module.
I understand from the issue description that Telegraf 1.14.5 and 1.19.3 both work and 1.20.0 doesn't work. Telegraf uses the same version of the gosnmp library in 1.19.3 and 1.20.0, so we can rule out a change in the gosnmp module:
$ git show v1.20.0:go.mod |grep snmp
github.com/gosnmp/gosnmp v1.32.0
$ git show v1.19.3:go.mod |grep snmp
github.com/gosnmp/gosnmp v1.32.0
I noticed that in your tests you used two different telegraf.conf files. The first is telegraf 1.14.5 with Test.conf
and the second is 1.20.0 with RG_LTE_omi.conf
. I wonder if the difference in behavior is related to a difference in configuration. It's hard to learn much from a test when two variables are changing.
Could you repeat your test with versions 1.20.0 and 1.19.3 but using the same config file. Please also post the config you use so we know for sure what telegraf is being asked to do.
Config was the same, I was trying to remove some company specific info for more simple example, but missed second case, sorry. Good news, I was able to solve the problem on my proprietary application by tweaking option related to CyclicTreeWalk.
But if we still want to investigate a problem. Seems I was wrong about 19.3 being ok with that, maybe a miss during multiple versions changing. I've return all params as it was before and checked all major versions starting from 1.14.5 to 1.20, this case was broken somewhere between 1.18 and 1.19. Nothing was done between checks beside installing new version.
[root@SOR-01 telegraf.d]# sudo -u telegraf telegraf --test --config RG_LTE_omi.conf --debug
2021-10-15T09:52:34Z I! Starting Telegraf 1.18.0
2021-10-15T09:52:34Z D! [agent] Initializing plugins
2021-10-15T09:52:34Z D! [agent] Starting service inputs
2021-10-15T09:52:34Z D! [inputs.snmp] executing "snmptranslate" "-Td" "-Ob" "PROTEI-AD-OMI-MIB::ad-omi-acclCA"
2021-10-15T09:52:34Z D! [inputs.snmp] executing "snmptranslate" "-Td" "-Ob" "PROTEI-AD-OMI-MIB::ad-omi-acclOSTATE"
2021-10-15T09:52:34Z D! [inputs.snmp] executing "snmptranslate" "-Td" "-Ob" "PROTEI-AD-OMI-MIB::ad-omi-asclCA"
2021-10-15T09:52:34Z D! [inputs.snmp] executing "snmptranslate" "-Td" "-Ob" "PROTEI-AD-OMI-MIB::ad-omi-asclOSTATE"
> OMI_ACCL,CA=Ad.OMI.ACCL.3,Metrics_type=app,app=Protei_RG_LTE,host=SOR-01 state=1i 1634291555000000000
> OMI_ACCL,CA=Ad.OMI.ACCL.2,Metrics_type=app,app=Protei_RG_LTE,host=SOR-01 state=1i 1634291555000000000
> OMI_ACCL,CA=Ad.OMI.ACCL.1,Metrics_type=app,app=Protei_RG_LTE,host=SOR-01 state=1i 1634291555000000000
2021-10-15T09:52:34Z D! [agent] Stopping service inputs
2021-10-15T09:52:34Z D! [agent] Input channel closed
2021-10-15T09:52:34Z D! [agent] Processor channel closed
2021-10-15T09:52:34Z D! [agent] Stopped Successfully
And 1.19
[root@SOR-01 telegraf.d]# sudo -u telegraf telegraf --test --config RG_LTE_omi.conf --debug
ERRO[0000]log.go:120 gosnowflake.(*defaultLogger).Errorf failed to create cache directory. /etc/telegraf/.cache/snowflake, err: mkdir /etc/telegraf/.cache: permission denied. ignored
ERRO[0000]log.go:120 gosnowflake.(*defaultLogger).Errorf failed to open. Ignored. open /etc/telegraf/.cache/snowflake/ocsp_response_cache.json: no such file or directory
2021-10-15T09:52:52Z I! Starting Telegraf 1.19.3
2021-10-15T09:52:52Z D! [agent] Initializing plugins
2021-10-15T09:52:52Z D! [agent] Starting service inputs
2021-10-15T09:52:52Z D! [inputs.snmp] executing "snmptranslate" "-Td" "-Ob" "PROTEI-AD-OMI-MIB::ad-omi-acclCA"
2021-10-15T09:52:52Z D! [inputs.snmp] executing "snmptranslate" "-Td" "-Ob" "PROTEI-AD-OMI-MIB::ad-omi-acclOSTATE"
2021-10-15T09:52:52Z D! [inputs.snmp] executing "snmptranslate" "-Td" "-Ob" "PROTEI-AD-OMI-MIB::ad-omi-asclCA"
2021-10-15T09:52:52Z D! [inputs.snmp] executing "snmptranslate" "-Td" "-Ob" "PROTEI-AD-OMI-MIB::ad-omi-asclOSTATE"
2021-10-15T09:52:52Z E! [inputs.snmp] Error in plugin: agent 172.26.17.12:4162: gathering table OMI_ACCL: performing bulk walk for field CA: error in unmarshalResponse: error parsing OID Value: invalid OID length
2021-10-15T09:52:53Z E! [inputs.snmp] Error in plugin: agent 172.26.17.12:4162: gathering table OMI_ASCL: performing bulk walk for field CA: error in unmarshalResponse: error parsing OID Value: invalid OID length
2021-10-15T09:52:53Z D! [agent] Stopping service inputs
2021-10-15T09:52:53Z D! [agent] Input channel closed
2021-10-15T09:52:53Z D! [agent] Processor channel closed
2021-10-15T09:52:53Z D! [agent] Stopped Successfully
2021-10-15T09:52:53Z E! [telegraf] Error running agent: input plugins recorded 2 errors
Is there any additional info I can provide?
@reimda gosnmp v1.33.0 has some fixed regarding parsing lengths..
I updated gosnmp to 1.33.0 in https://github.com/influxdata/telegraf/pull/10076. When the CI build completes for that PR, could you test that change and see whether the error that started happening in telegraf 1.20 still happens? Thanks!
The build is here: https://github.com/influxdata/telegraf/pull/10076#issuecomment-963626781. Click the dropdown triangle and choose your platform.
Resolved by #10076 and included in telegraf 1.20, pleas reopen/comment if this did not solve the current issue.
Hello @Hipska, @reimda Sorry for no testing previously, Is it too late to reopen this issue? It's still represented in latest version. Was mentioned again in this issue, but no answer. snmpwalk:
[root@XXX1 telegraf.d]# snmpwalk -v2c -Cc -cpublic 127.0.0.1:3161 .
DIAM-MIB::pcsmIndex.1 = INTEGER: 1
DIAM-MIB::pcsmCA.1 = STRING: "Sg.DIAM.PCSM.2"
DIAM-MIB::pcsmOSTATE.1 = STRING: "ACTIVATE"
End of MIB
[root@XXX1 telegraf.d]# snmpwalk -m ./ -v2c -Cc -cpublic 127.0.0.1:3161 .
Failed to parse MIB file ./
iso.3.6.1.4.1.20873.2.30.1.1.1.1 = INTEGER: 1
iso.3.6.1.4.1.20873.2.30.1.1.100.1 = STRING: "Sg.DIAM.PCSM.2"
iso.3.6.1.4.1.20873.2.30.1.1.4096.1 = STRING: "ACTIVATE"
Config:
[[inputs.snmp]]
agents = [ "127.0.0.1:3161" ]
version = 2
community = "public"
tagexclude = ["agent_host"]
[[inputs.snmp.table]]
name = "Diameter"
inherit_tags = [ "app","Metrics_type" ]
[[inputs.snmp.table.field]]
name = "CA"
is_tag = true
oid = "DIAM-MIB::pcsmCA"
[[inputs.snmp.table.field]]
name = "State"
oid = "DIAM-MIB::pcsmOSTATE"
Testing on 1.18:
[root@MME1 telegraf.d]# telegraf --test --config diam.conf --debug
2022-03-18T08:31:57Z I! Starting Telegraf 1.18.1
2022-03-18T08:31:57Z D! [agent] Initializing plugins
2022-03-18T08:31:57Z D! [agent] Starting service inputs
2022-03-18T08:31:57Z D! [inputs.snmp] executing "snmptranslate" "-Td" "-Ob" "DIAM-MIB::pcsmCA"
2022-03-18T08:31:57Z D! [inputs.snmp] executing "snmptranslate" "-Td" "-Ob" "DIAM-MIB::pcsmOSTATE"
2022-03-18T08:31:57Z D! [agent] Stopping service inputs
2022-03-18T08:31:57Z D! [agent] Input channel closed
2022-03-18T08:31:57Z D! [agent] Processor channel closed
2022-03-18T08:31:57Z D! [agent] Stopped Successfully
> Diameter,CA=Sg.DIAM.PCSM.2,host=MME1 State=1i 1647592317000000000
Testing on 21.4:
[root@MME1 telegraf.d]# telegraf --test --config MME_diam.conf
2022-03-18T08:35:24Z I! Starting Telegraf 1.21.4
2022-03-18T08:35:24Z I! Loaded inputs: snmp
2022-03-18T08:35:24Z I! Loaded aggregators:
2022-03-18T08:35:24Z I! Loaded processors:
2022-03-18T08:35:24Z W! Outputs are not used in testing mode!
2022-03-18T08:35:24Z I! Tags enabled: host=MME1
2022-03-18T08:35:24Z E! [inputs.snmp] Error in plugin: agent 127.0.0.1:3161: gathering table Diameter: performing bulk walk for field CA: error in unmarshalResponse: error parsing OID Value: invalid OID length
2022-03-18T08:35:24Z E! [telegraf] Error running agent: input plugins recorded 1 errors
Okay, will reopen, please report this back faster next time.
The issue you linked is this exact same issue. So did you try to link another issue? Please let us know which one so we can have a look as well..
Thanks a lot. It was mentioned here: https://github.com/influxdata/telegraf/issues/10654
Hi Team, Any news or can't be reproduced and no ideas then? Still relevant on latest telegraf versions.
On the server I have application with snmp interface support. Snmpwalk example:
And the following conf file:
At least on version 1.14.5 - everything works as expected:
But on version 1.20 and the same config we've faced an error:
I've checked on 1.19.3 and it works there as well. So I can only take a guess that it has something to do with new GO version implemented in 1.20