Closed ErantD closed 4 years ago
For now I use the following scheme: Logstash -> logfile -> Telegraf logparser, and I'm waiting for a native implementation.
The gosnmp package that's used by Telegraf already has a trap listening implementation.
See: https://github.com/soniah/gosnmp/blob/master/trap.go
Maybe it's the way to go.
Since I don't know Golang, I was thinking of resorting to snmptrapd and snmptt, sending logs to the Telegraf syslog input plugin, but that's a lot of moving parts.
+1
This would nicely complement the syslog receiver which Telegraf 1.7 gained, especially if it could be used with Chronograf's integrated log viewer.
An issue here is how to break up the information contained within SNMP trap records. Maybe these should be dealt with the same as syslog structured data fields, creating dynamic timeseries (columns) as required.
For example, a linkDown trap might arrive like this in tcpdump:
127.0.0.1.57932 > 127.0.0.1.162: { SNMPv2c { V2Trap(108) R=97760785 .1.3.6.1.2.1.1.3.0=20587570 .1.3.6.1.6.3.1.1.4.1.0=.1.3.6.1.6.3.1.1.5.3 .1.3.6.1.2.1.2.2.1.2="eth0" .1.3.6.1.2.1.2.2.1.7=1 .1.3.6.1.2.1.2.2.1.8=1 } }
which could be decoded to the following JSON:
{
"sysUpTimeInstance": 20587570,
"snmpTrapOID.0": "linkDown",
"ifDescr": "eth0",
"ifAdminStatus": "up",
"ifOperStatus": "up"
}
Not all of this information is necessarily important; but if it contains ifIndex or ifDescr, you'll need it to identify the interface this event relates to.
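The decoding step described above can be sketched in Python. This is only an illustration under stated assumptions: the lookup tables are hand-written stand-ins for a real MIB lookup, and `decode_varbinds` is a hypothetical helper, not part of Telegraf or gosnmp.

```python
# Hand-written tables standing in for a real MIB lookup (assumption).
OID_NAMES = {
    ".1.3.6.1.2.1.1.3.0": "sysUpTimeInstance",
    ".1.3.6.1.6.3.1.1.4.1.0": "snmpTrapOID.0",
    ".1.3.6.1.2.1.2.2.1.2": "ifDescr",
    ".1.3.6.1.2.1.2.2.1.7": "ifAdminStatus",
    ".1.3.6.1.2.1.2.2.1.8": "ifOperStatus",
}
TRAP_NAMES = {".1.3.6.1.6.3.1.1.5.3": "linkDown"}
# Only the three most common status values are listed here.
STATUS_ENUM = {1: "up", 2: "down", 3: "testing"}


def decode_varbinds(varbinds):
    """Turn (oid, value) pairs into a name -> value record (hypothetical helper)."""
    record = {}
    for oid, value in varbinds:
        name = OID_NAMES.get(oid, oid)  # fall back to the numeric OID
        if name == "snmpTrapOID.0":
            value = TRAP_NAMES.get(value, value)
        elif name in ("ifAdminStatus", "ifOperStatus"):
            value = STATUS_ENUM.get(value, value)
        record[name] = value
    return record


# The varbinds from the tcpdump line above.
varbinds = [
    (".1.3.6.1.2.1.1.3.0", 20587570),
    (".1.3.6.1.6.3.1.1.4.1.0", ".1.3.6.1.6.3.1.1.5.3"),
    (".1.3.6.1.2.1.2.2.1.2", "eth0"),
    (".1.3.6.1.2.1.2.2.1.7", 1),
    (".1.3.6.1.2.1.2.2.1.8", 1),
]
record = decode_varbinds(varbinds)
print(record)
```

Unknown OIDs are kept under their numeric form, so nothing is silently dropped when a MIB is missing.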
Yes, this would be nice. Right now one has to go through a PRTG datasource and Grafana instead of going natively into Influx.
+1
Hi... this is such a simple request. Could you develop it? It's so useful and would help a lot!
Getting our enterprise acceptance of using Telegraf/Influx to replace a 'legacy' solution has this as a pre-requisite. Is there anything that we can do to help this along?
I think I'll try snmptrapd to syslog. Syslog already has a rich input plugin, and in this way, I will get the full power of snmptrapd to manage what the record will look like before I drop it into the db. At first, I was thinking of using snmptrapd to dump into an influx data formatted file, however, I'm worried about escape characters breaking this, or having unintended consequences. It would be nice to have a direct trap input, but I understand why this has not been implemented.
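The worry about escape characters can be made concrete with a small sketch of InfluxDB line protocol escaping. It assumes the usual rules (commas and spaces escaped in the measurement; commas, equals signs and spaces escaped in tag keys, tag values and field keys; double quotes and backslashes escaped in string field values). The function names, tag names and sample values are hypothetical, and newlines inside values are not handled.

```python
def escape_key(text: str) -> str:
    """Escape tag keys, tag values and field keys."""
    for ch in (",", "=", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def escape_measurement(text: str) -> str:
    """Escape a measurement name (commas and spaces only)."""
    for ch in (",", " "):
        text = text.replace(ch, "\\" + ch)
    return text


def quote_string_field(text: str) -> str:
    """Quote a string field value, escaping backslashes first, then quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def to_line(measurement, tags, fields, timestamp_ns):
    """Build one line of line protocol (hypothetical helper)."""
    tag_part = "".join(
        f",{escape_key(k)}={escape_key(v)}" for k, v in sorted(tags.items())
    )
    rendered = []
    for key, value in fields.items():
        if isinstance(value, bool):  # check bool before int
            text = "true" if value else "false"
        elif isinstance(value, int):
            text = f"{value}i"
        elif isinstance(value, float):
            text = repr(value)
        else:
            text = quote_string_field(str(value))
        rendered.append(f"{escape_key(key)}={text}")
    return f"{escape_measurement(measurement)}{tag_part} {','.join(rendered)} {timestamp_ns}"


line = to_line(
    "snmp_trap",
    {"source": "10.0.0.1", "name": "link down"},
    {"ifDescr": 'eth0 "uplink"', "ifIndex": 2},
    1575650812964027095,
)
print(line)
```

Escaping in one place like this is the part that is easy to get wrong when a trap handler writes the file by hand, which is one argument for going through the syslog input instead.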
Question for everyone: are you interested only in TRAP support or would INFORM also be useful? If you are interested in INFORM support is it a need or a want?
@bruceschaller Would love to hear more about your setup once you have it working, it sounds like a pretty good idea to me.
For what it's worth, trap support is the need; inform would be useful. But if we could get trap working without the use of syslog and parsing, I'd be super interested!
I personally would love inform support. Right now we have traps that are being sent to a Docker container that is running syslog-ng and snmptrapd, but we are not able to get it to work with V3. The biggest issue has been the engine ID and configuring it to be a part of our automation system.
After doing a lot of research, we have been able to figure out how to get our traps formatted for Slack and Wavefront but would be nice to have it integrated with our Telegraf SNMP monitoring solution as well.
If there are questions or details I can provide to make this happen please let me know.
@danielnelson TRAP and INFORM are a need
This link ( https://www.influxdata.com/what-is-snmp/ ) indicates that the "Telegraf SNMP Input Plugin" supports receiving traps:
Telegraf can be deployed with SNMP Input Plugin configured to fetch or listen to specific OIDs. The plugin contains a SNMP traps receiver, which fetches traps from the managed device. Then, Telegraf batches the resulting data and streams it to InfluxDB.
Is that because the feature got developed already? Or is it a different component than being discussed here?
Is there any update on this? We also are being asked to get rid of COTS and this feature is the only large gap for a TICK stack versus our current stuff.
Thanks!
Thanks for pointing out this page, unfortunately it is not accurate and currently there is no TRAP support in Telegraf. However, it is something we are working on.
I'll have someone correct that link to avoid confusion.
Glad to see this is active! I was actually just looking to see if it had already been done. I'm a professional Go developer who's interested in helping; is there anything public available I could contribute to/collaborate on?
@superdave Do you think you could investigate adding support for INFORM to gosnmp? From a cursory look at the library I don't think it is supported currently. If we can receive them and acknowledge in a similar way to TRAP, it would be very helpful.
I could look! I don't know if I have any current devices that use it, but I imagine I can get one of the various SNMP daemons out there to throw one out.
Sorry for the delay on this, but I'm starting to look into this tonight. Looks like net-snmp has INFORM functionality, so it should be easy enough to test from there, and someone's contacted me in re: running tests with real devices that have it (all my SNMP devices are old-ish and I haven't seen much in the way of INFORM functionality in their docs, so).
VERY preliminary, definitely not ready for merge yet, but it looks like INFORM is actually pretty simple; it's just a TRAP that expects a pretty-much identical response. This works so far with snmptrap -v 2c -c public -Ci <host> 23 coldStart.0: https://github.com/soniah/gosnmp/pull/202
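To illustrate the "TRAP that expects a pretty-much identical response" idea, here is a minimal Python sketch. It is not the code from the gosnmp PR: `Pdu` and `acknowledge` are hypothetical names, BER encoding and the network layer are left out, and only the standard SNMPv2 PDU tags are taken as given.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple

# Standard SNMPv2 PDU tags.
RESPONSE = 0xA2
INFORM_REQUEST = 0xA6
SNMPV2_TRAP = 0xA7


@dataclass(frozen=True)
class Pdu:
    """Simplified, already-decoded PDU (hypothetical type)."""
    pdu_type: int
    request_id: int
    error_status: int
    error_index: int
    varbinds: Tuple[Tuple[str, object], ...]


def acknowledge(pdu: Pdu) -> Optional[Pdu]:
    """Return the Response for an INFORM, or None for an unconfirmed trap."""
    if pdu.pdu_type != INFORM_REQUEST:
        return None
    # Same request-id and varbinds; only the type and error fields change.
    return replace(pdu, pdu_type=RESPONSE, error_status=0, error_index=0)


inform = Pdu(
    pdu_type=INFORM_REQUEST,
    request_id=97760785,
    error_status=0,
    error_index=0,
    varbinds=((".1.3.6.1.6.3.1.1.4.1.0", ".1.3.6.1.6.3.1.1.5.1"),),
)
ack = acknowledge(inform)
trap_ack = acknowledge(replace(inform, pdu_type=SNMPV2_TRAP))
```

The sender matches the response to its INFORM through the request-id, which is why that field has to be echoed unchanged.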
Hi @superdave -- just wanted to check on any update regarding the snmptraps with Telegraf. Is there something available that maybe I can test out? would this be a separate input plugin for telegraf? wanted to see if there is anything I can do to help.
I have next to no information on the Telegraf implementation, I've just gotten some initial INFORM support added to the go-snmp package in a PR. I haven't had time to add proper testing or examine exact compliance with the spec (particularly the response I send), though it seems to work with net-snmp. Feel free to test that out with your devices! As soon as I have a chance to get back to it, I'll try and wrap up my PR and see if the maintainer will merge it.
We just merged a pull request that adds an initial snmp trap input plugin, we'd love to get some early testing on this plugin and feedback on experience and if it meets your requirements.
Unfortunately, @superdave's INFORM support is not added upstream yet, so for now this can handle unconfirmed traps only.
Plugin documentation: snmp_trap
Sadly, I haven't even heard initial feedback about my PR there yet, though in all honesty it needs tests added. I'll try to get those in soon and see if it gets the attention of the maintainer. I'm very excited to try out the regular trap functionality, though! That's about 95% of my use cases.
Everyone: now that the new 1.13.0-rc1 build is out, please switch over to these packages for testing.
@superdave I'll review your PR on gosnmp, I'm no expert on SNMP but maybe that will help it get some motion.
Right, it's obviously not ready for prime time yet, but I figured a WIP PR was better than nothing.
Thanks for the review, of course! Good catches, I'll implement those shortly and hopefully get some better testing in.
I have been testing the plugin since the beginning of the week and it works very well. The only problem I have is very specific to me, and I'm not sure if it's an issue with the SNMP trap plugin or something else.
When the trap is sent, there is a HEX part in the trap that is transformed into text when written to the DB. I'm not sure in which part of the process this is done. I say that because on reception of the trap, in snmptrapd, I can see the raw information that I need, like this:
Dec 6 20:59:23 SNMPTraps snmptrapd[6123]: 2019-12-06 20:59:23
In InfluxDB it looks like this:
time last_rcGponSystem.3.1.1.2.293666817
1575650812964027095 RCMG�}
I know that the HEX string is not real hex, but we have to deal with this stuff... (it's cheaper). We would need it not to be altered and to be written as raw information in the DB.
Could anybody point me to the right place to post my problem?
Thanks
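One way to keep such a value intact is to render it as hex whenever it is not clean text. The following Python sketch shows the idea only; `render_octet_string` is a hypothetical helper, the sample bytes are invented, and this is not how the snmp_trap plugin is known to behave. It needs Python 3.8 or later for `bytes.hex(" ")`.

```python
def render_octet_string(raw: bytes) -> str:
    """Return text for printable UTF-8, otherwise a space-separated hex dump."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.hex(" ")
    # Control characters would also be mangled in a database, so dump those too.
    return text if text.isprintable() else raw.hex(" ")


# Invented sample values: one binary identifier, one ordinary string.
binary_value = render_octet_string(b"RCMG\x11\x80\x7d")
text_value = render_octet_string(b"eth0")
print(binary_value)
print(text_value)
```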
@AlexTargo Thanks for the feedback and testing, let's open this up as a new feature issue.
It could be helpful to add a packet capture to the issue as well. Could you run tcpdump in the background while receiving a trap with a similar command line and include it in the issue:
sudo tcpdump -s 0 -i eth0 -w snmptrap.pcap port 162
Also include the output of Telegraf (you can press Ctrl-C after the trap is received and the output is printed):
telegraf --input-filter=snmp_trap --test --test-wait=600
One issue I came across while testing is that some traps do not contain variables in their definition, so Telegraf won't process them. This is the case for the classic coldStart or ucdShutdown.
You can reproduce this by sending a coldStart trap:
snmptrap -v 1 -c public 127.0.0.1 coldStart
@tesibelda Thanks for the report, it looks like any v1 traps without variables won't be collected since no fields are produced. To fix this, I think we should convert the "time-stamp" from the Trap-PDU into the sysUpTime parameter, along the lines described in RFC 2576. This will guarantee us a field and improve the compatibility between v1 and v2 traps.
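The RFC 2576 mapping mentioned here can be sketched as follows. This is an illustration of the translation rules in Python, not the code merged into Telegraf; `v1_to_v2_varbinds` is a hypothetical name, the enterprise OIDs in the examples are placeholders, and the optional proxy varbinds (snmpTrapAddress and friends) are left out.

```python
SYS_UPTIME = ".1.3.6.1.2.1.1.3.0"
SNMP_TRAP_OID = ".1.3.6.1.6.3.1.1.4.1.0"
STANDARD_TRAPS = ".1.3.6.1.6.3.1.1.5"
ENTERPRISE_SPECIFIC = 6


def v1_to_v2_varbinds(enterprise, generic_trap, specific_trap, timestamp, varbinds):
    """Build the v2-style varbind list for a v1 Trap-PDU (hypothetical helper)."""
    if generic_trap == ENTERPRISE_SPECIFIC:
        # enterprise OID, then a 0 sub-identifier, then the specific-trap number
        trap_oid = f"{enterprise}.0.{specific_trap}"
    else:
        # generic traps 0..5 map to coldStart(1) .. egpNeighborLoss(6)
        trap_oid = f"{STANDARD_TRAPS}.{generic_trap + 1}"
    # sysUpTime and snmpTrapOID always come first, so a field always exists.
    return [(SYS_UPTIME, timestamp), (SNMP_TRAP_OID, trap_oid), *varbinds]


# A v1 coldStart with no variables still yields two varbinds.
cold_start = v1_to_v2_varbinds(".1.3.6.1.4.1.8072.3.2.10", 0, 0, 12345, [])
# Placeholder enterprise OID for an enterprise-specific trap.
vendor_trap = v1_to_v2_varbinds(".1.3.6.1.4.1.99999", 6, 7, 500, [(".1.3.6.1.4.1.99999.1", "x")])
```

Because the time-stamp always becomes a sysUpTime varbind, a trap such as coldStart no longer ends up without fields.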
@tesibelda We added support for v1 traps in #6786, hoping to do a final release of 1.13.0 tomorrow but if you have time and could take a look at 1.13.0-rc3 it would be really appreciated.
@danielnelson Sorry for the delay. The idea of using the sysUpTimeInstance field is great. So is including the timeout option, since during my tests it took more than 5s when using several thousand files in the MIBDIR. In my tests the 1.13 release works just as expected. Great job!
It would be nice to improve the performance of the caching. I'm curious what the approximate size of your MIBs is, could you run:
wc /usr/share/snmp/mibs/* | grep total
I tested it on a Windows laptop, with different numbers of files in the \usr\share\snmp\mibs folder. There were timeouts with 11919 MIB files (from http://mibs.snmplabs.com/asn1/), and also with 4622 MIB files, even with a 30s timeout configured. There was no timeout when using 322 MIB files. After the first time, the cache works fine. I was using netsnmp 5.7 Windows binaries and the axNetworkTrunkPortsThreshold trap from A10. I guess this is more a performance issue of snmptranslate. I do not have more recent Windows binaries, but I will try it on Linux with 5.8 binaries.
We are mostly trying to gauge what the requirements are for load performance at this point, is this your normal loadout of MIBs? Would also be nice to do a simple snmptranslate with the MIBs for comparison. We have an existing issue, #5720, for MIB parsing performance, can you respond on that issue?
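The caching behaviour discussed here (slow first lookup, fast afterwards) can be sketched in a few lines of Python. This is a generic memoization example, not Telegraf's implementation: `TranslationCache` is a hypothetical class, and the translator passed in below is a stand-in for whatever actually calls snmptranslate.

```python
from typing import Callable, Dict


class TranslationCache:
    """Remember OID translations so the slow lookup runs once per OID."""

    def __init__(self, translate: Callable[[str], str]) -> None:
        self._translate = translate
        self._names: Dict[str, str] = {}
        self.misses = 0  # number of times the slow translator was called

    def lookup(self, oid: str) -> str:
        if oid not in self._names:
            self.misses += 1
            self._names[oid] = self._translate(oid)
        return self._names[oid]


def fake_translate(oid: str) -> str:
    """Stand-in for a slow MIB lookup (assumption)."""
    return {".1.3.6.1.6.3.1.1.5.3": "linkDown"}.get(oid, oid)


cache = TranslationCache(fake_translate)
first = cache.lookup(".1.3.6.1.6.3.1.1.5.3")
second = cache.lookup(".1.3.6.1.6.3.1.1.5.3")
```

A cache like this does nothing for the first lookup, so the MIB load time reported above still matters for the first trap of each kind.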
Hello Bruceshaller, we did indeed run into an issue with inform and would really like to have inform support and SNMPv2/3 support for trap/inform messages.
Hi @neeles83, we are currently working on improving our v3 support. If you can comment with issues and improvements on #6918, that would be really helpful.
Hi everyone,
Today, I tried to configure my telegraf to receive SNMPv3 traps with the following simple configuration:
[[inputs.snmp_trap]]
service_address = "udp://:162"
The router which I try to monitor is configured with the following:
Router(config-if)#snmp-server enable traps snmp linkdown linkup
Router(config-if)#snmp-server host IpOfMyTelegrafManager version 3 priv myV3Username snmp
As you can see, I am trying to get traps about link up/down events on my interfaces, so when I shut down an interface, I can see with Wireshark that the trap is sent to my manager with many OIDs. Of course, before shutting down the interface, I started listening on port 162 in Telegraf like this:
telegraf --input-filter=snmp_trap --output-filter file --test --test-wait=1500
And the screen blocks on the following:
2020-04-01T16:59:23Z I! Starting Telegraf 1.13.4
2020-04-01T16:59:23Z I! Using config file: /etc/telegraf/telegraf.conf
2020-04-01T16:59:23Z I! [inputs.snmp_trap] Listening on udp://:162
But I receive nothing; no traps appear in the terminal. At first I thought that snmptranslate didn't have the correct MIB, so I checked all the OIDs I saw in the trap packet in Wireshark with the snmptranslate command to make sure that was not the problem, and it wasn't. After that, I checked with
lsof -i -P -n
that UDP port 162 was listening, and it was. So now I'm looking all over the Internet for a solution, but I find nothing....
Two details: first, I have already run the command setcap cap_net_bind_service=+ep /usr/bin/telegraf and second, I can make SNMP requests from the manager to the host without any problem.
Does anyone know what the problem is and can help me, please?
I'm sorry if this is not the place for a question like that, I'm new to GitHub ^^
Thanks for your help.
@volkan05 Support for v3 traps isn't available yet, so unfortunately it won't work just yet. Keep an eye on #6918 for updates on when it will be supported.
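When a trap shows up in a packet capture but not in Telegraf, it can help to confirm which SNMP version the sender is using. The Python sketch below reads the version field from the start of a raw SNMP datagram (0 is v1, 1 is v2c, 3 is v3). `snmp_version` is a hypothetical helper, and the sample datagrams are hand-built and truncated after the fields that matter.

```python
VERSIONS = {0: "v1", 1: "v2c", 3: "v3"}


def snmp_version(datagram: bytes) -> str:
    """Read the version INTEGER that opens every SNMP message."""
    if len(datagram) < 2 or datagram[0] != 0x30:
        raise ValueError("not a BER SEQUENCE")
    pos = 2
    if datagram[1] & 0x80:
        # long-form length: low 7 bits give the number of extra length bytes
        pos += datagram[1] & 0x7F
    header = datagram[pos:pos + 3]
    if len(header) != 3 or header[:2] != b"\x02\x01":
        raise ValueError("missing version INTEGER")
    return VERSIONS.get(header[2], "unknown")


# Hand-built, truncated samples: a v2c header with community "public",
# and a v3 header using a long-form length.
v2c_sample = bytes([0x30, 0x0B, 0x02, 0x01, 0x01, 0x04, 0x06]) + b"public"
v3_sample = bytes([0x30, 0x81, 0x90, 0x02, 0x01, 0x03])
print(snmp_version(v2c_sample), snmp_version(v3_sample))
```

The payload bytes could come from a capture such as the tcpdump command shown earlier in the thread.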
@danielnelson All right, thank you for your response.
@danielnelson Thank you for adding this new functionality, we have been wanting this also. @ErantD is there additional work needed to have Kapacitor alert on these traps (including details of the trap), or is it a matter of crafting a TICKscript to do this?
Feature Request
Opening a feature request kicks off a discussion.
Proposal:
Have telegraf receive passive snmp traps and send them to influxdb.
Current behavior:
Currently Telegraf can collect SNMP data actively, via queries to a host, but SNMP traps sent by a host are not collectable with Telegraf. An example would be an SNMP trap sent because of a failed power supply: Telegraf does not collect the trap and therefore is not able to forward it to InfluxDB.
Desired behavior:
Telegraf should receive SNMP traps from hosts and forward them to InfluxDB, i.e. 1.) a host sends a trap to Telegraf because of a power supply failure; 2.) Telegraf receives the trap and translates it to InfluxDB line protocol using the vendor MIB; 3.) Telegraf sends the proper input to InfluxDB; 4.) to make it perfect, Kapacitor shows an alert.
Use case: [Why is this important (helps with prioritizing requests)]
We are monitoring lots of systems, and we want to centralize all monitoring to one monitoring system. To make this complete with influxdb, grafana and kapacitor, we are missing the snmp trap from hosts, like PSU failover, disk failure, node reboot, etc..