influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

Ping monitoring limitation #6360

Closed rguptarg closed 5 years ago

rguptarg commented 5 years ago

Hi Team,

Can you please suggest how many IPs I can list in the inputs.ping plugin?

I have 4000 IPs. Can I put all of them in one Telegraf config file?

danielnelson commented 5 years ago

This is more IP addresses than I have ever tested with. You may want to work your way up to this number by starting at 100 hosts and doubling until you have all hosts.

All the hosts can go into a single config file. It's also possible to split them into several files, each with its own ping plugin, if this is more convenient.

You will definitely want to use method = "native", new in 1.12.0, as I am sure the "exec" method won't be able to handle this many hosts. Make sure to set up the right permissions.
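
For anyone who wants to try the split-file layout, here is a minimal sketch that generates the files from a host list with one host per line. The function name write_ping_configs, the ping-NNN.conf file names, and the default of 500 hosts per file are made up for illustration; only [[inputs.ping]], urls, and method = "native" come from the plugin itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn a file with one host per line into several
# Telegraf config files, each holding its own [[inputs.ping]] block.
# Arguments (all illustrative): hosts file, output directory, hosts per file.
write_ping_configs() {
    local hosts_file="$1"
    local out_dir="$2"
    local per_file="${3:-500}"
    mkdir -p "$out_dir"
    awk -v dir="$out_dir" -v n="$per_file" '
        function close_block() {
            # Finish the TOML array of the file currently being written.
            if (out != "") { print "  ]" > out; close(out) }
        }
        NF == 0 { next }            # skip blank lines
        count % n == 0 {            # start a new file every n hosts
            close_block()
            out = sprintf("%s/ping-%03d.conf", dir, count / n)
            print "[[inputs.ping]]" > out
            print "  method = \"native\"" > out
            print "  urls = [" > out
        }
        { printf "    \"%s\",\n", $1 > out; count++ }
        END { close_block() }
    ' "$hosts_file"
}
```

For example, write_ping_configs hosts.txt ./telegraf.d 500 would write one file per 500 hosts. Telegraf can then load the whole directory through its --config-directory option.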

I'd be interested in hearing about your experience. Let me know how it goes.

rguptarg commented 5 years ago

Thanks for the update!

I'll try the new version 1.12.0 and update you accordingly.

rguptarg commented 5 years ago

Hi Daniel

As far as I can see, in 1.12.0 there is no method = "native" option under the ping plugin, so I am assuming the native method is the default. I have currently added 2000+ IPs and am getting the errors below:

ping,IP=10.5.112.175,application_name=management,host=N2VL-PA-INFL04,url=10.222.166.38 result_code=2i 1568014270000000000
2019-09-09T07:31:10Z E! [inputs.ping] Error in plugin: host 10.5.233.99: open /dev/null: too many open files
ping,IP=10.5.112.175,application_name=management,host=N2VL-PA-INFL04,url=10.5.78.133 result_code=2i 1568014270000000000
2019-09-09T07:31:10Z E! [inputs.ping] Error in plugin: host 10.5.76.103: pipe2: too many open files
ping,IP=10.5.112.175,application_name=management,host=N2VL-PA-INFL04,url=10.5.233.99 result_code=2i 1568014270000000000
2019-09-09T07:31:10Z E! [inputs.ping] Error in plugin: host 10.5.233.100: open /dev/null: too many open files
ping,IP=10.5.112.175,application_name=management,host=N2VL-PA-INFL04,url=10.5.76.103 result_code=2i 1568014270000000000
2019-09-09T07:31:10Z E! [inputs.ping] Error in plugin: host 10.5.232.45: open /dev/null: too many open files
ping,IP=10.5.112.175,application_name=management,host=N2VL-PA-INFL04,url=10.5.233.100 result_code=2i 1568014270000000000
2019-09-09T07:31:10Z E! [inputs.ping] Error in plugin: host 10.5.232.64: pipe2: too many open files

rguptarg commented 5 years ago

Please let me know if any changes are required. I am currently using the default configuration.

I have also tried with 1000 IPs, and that shows proper results.

danielnelson commented 5 years ago

> As i can see in 1.12.0, there is no such option method = "native" under ping plugin.

This option should be available. I double-checked and I see it in the example configuration.

Setting it to native will use a Go implementation of ping, instead of running the ping executable on the system, and should reduce the number of files that need to be opened in order to ping.

rguptarg commented 5 years ago

Hi Daniel,

Yes, I have reinstalled Telegraf and can now see the method option in the ping plugin. I have tested the ping plugin with 1K IPs, but I am still getting an error in the telegraf status output:

Sep 10 14:40:01 N2VL-PA-INFL04 telegraf[23108]: 2019-09-10T09:10:01Z E! [inputs.ping] Error in plugin: error listening for ICMP packets: listen ip4:icmp : socket: operation not permitted: socket: permission denied

OS version :- Red Hat Enterprise Linux Server release 7.6 (Maipo)

# # Ping given url(s) and return statistics
 [[inputs.ping]]
#   ## List of urls to ping
   urls = [ 1000 Ip's ]
#
#   ## Number of pings to send per collection (ping -c <COUNT>)
#   # count = 1
#
#   ## Interval, in s, at which to ping. 0 == default (ping -i <PING_INTERVAL>)
#   # ping_interval = 1.0
#
#   ## Per-ping timeout, in s. 0 == no timeout (ping -W <TIMEOUT>)
#   # timeout = 1.0
#
#   ## Total-ping deadline, in s. 0 == no deadline (ping -w <DEADLINE>)
#   # deadline = 10
#
#   ## Interface or source address to send ping from (ping -I[-S] <INTERFACE/SRC_ADDR>)
#   # interface = ""
#
#   ## How to ping. "native" doesn't have external dependencies, while "exec" depends on 'ping'.
#   # method = "exec"
      method = "native"
#
#   ## Specify the ping executable binary, default is "ping"
#       # binary = "ping"
#
#   ## Arguments for ping command. When arguments is not empty, system binary will be used and
#   ## other options (ping_interval, timeout, etc) will be ignored.
#   # arguments = ["-c", "3"]
#
#   ## Use only ipv6 addresses when resolving hostnames.
#   # ipv6 = false

danielnelson commented 5 years ago

When using the native method you will need to add additional permissions to Telegraf:

sudo setcap cap_net_raw=eip /path/to/telegraf

rguptarg commented 5 years ago

Hi Daniel,

What is the meaning of "/path/to/telegraf"? Should I use the Telegraf configuration file path or some other path? I get an error when I run the command as given:

[root@N2VL-PA-INFL04 ~]# setcap cap_net_raw=eip /path/to/telegraf
Failed to set capabilities on file `/path/to/telegraf' (No such file or directory)
usage: setcap [-q] [-v] (-r|-|<caps>) <filename> [ ... (-r|-|<capsN>) <filenameN> ]

 Note <filename> must be a regular (non-symlink) file.

glinton commented 5 years ago

/path/to/telegraf refers to the actual location (absolute path) of the telegraf binary. This can generally be found by running "which telegraf"; however, you must verify that the result is not a symlink.

For example, if you see something like the following:

$ which telegraf
/home/yourUser/bin/telegraf
$ ls -la /home/yourUser/bin/telegraf
lrwxrwxrwx  1 yourUser yourUser   48 Jun  2 09:29 /home/yourUser/bin/telegraf -> /some/other/location/telegraf

The path to telegraf is /some/other/location/telegraf (note the -> in the ls output, which signifies the symlink).
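
The lookup described above can be sketched as a small shell function. The name real_binary_path is made up for illustration, and it assumes a readlink that supports -f (GNU coreutils, as on most Linux systems), which follows the whole symlink chain.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the real (non-symlink) path of a command
# found on PATH. Returns non-zero if the command cannot be found.
real_binary_path() {
    local found
    found=$(command -v "$1") || return 1
    # -f resolves every symlink in the path and prints the final target.
    readlink -f "$found"
}

# Intended use (needs root and an installed telegraf, so left as comments):
#   sudo setcap cap_net_raw=eip "$(real_binary_path telegraf)"
#   getcap "$(real_binary_path telegraf)"   # shows the capability if it was set
```

Running getcap afterwards is a quick way to confirm the capability landed on the real file rather than on a symlink.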

rguptarg commented 5 years ago

Thanks for the update.

I have now configured 4K IPs for ping monitoring, and the system is under observation.

I'll update you tomorrow.

rguptarg commented 5 years ago

Hi,

The ping plugin is currently working and collecting proper data, but I am getting an error in the telegraf status output:

Sep 12 10:05:01 N2VL-PA-INFL04 telegraf[21196]: 2019-09-12T04:35:01Z E! [inputs.ping] Error in plugin: error listening for ICMP packets: listen ip4:icmp : socket: too many open files: socket: permission denied

I have tried the command sysctl -w net.ipv4.ping_group_range="0 2147483647", but after some time I get the same error again.

danielnelson commented 5 years ago

Let's try creating an override file for the Telegraf service, assuming you are using systemd:

systemctl edit telegraf.service

This will open your editor, add this to the file:

[Service]
LimitNOFILE=8192

Reload the system configs and restart Telegraf:

systemctl daemon-reload
systemctl restart telegraf
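
To check that the new limit actually reached the running process, something like the following rough sketch can help. It assumes Linux with /proc mounted; the function names open_files_limit and open_files_count are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, Linux only: both read from /proc/<pid>.

# Print the soft "Max open files" limit of the given PID.
open_files_limit() {
    awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Print how many file descriptors the given PID currently has open.
open_files_count() {
    ls "/proc/$1/fd" | wc -l
}

# Intended use (left as comments because it needs a running telegraf
# and enough privileges to read its /proc entries):
#   pid=$(pidof telegraf)
#   echo "limit: $(open_files_limit "$pid")  in use: $(open_files_count "$pid")"
```

If the reported limit still shows the old value after the restart, the override file was probably not picked up.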

rguptarg commented 5 years ago

Hi

Thanks for the update!

The ping plugin is now working fine for 4K IPs.

Now I have a few questions:

danielnelson commented 5 years ago

That's great, I'll update the documentation with what we learned. Unfortunately though, I don't know the answer to your questions. I'm not aware of anyone using Telegraf to ping more hosts than you are right now.

rguptarg commented 5 years ago

Hi Daniel

Can you please help me understand one output?

I am using the native method for the ping plugin, and if a URL/IP is not pingable the result code comes back as 2. According to the README it should be 1 (https://github.com/influxdata/telegraf/tree/master/plugins/inputs/ping).

Is there a gap here, or what should I check?

danielnelson commented 5 years ago

There should be an error printed in the logfile when this occurs. Can you check the logfile and paste the error here?

bougui commented 4 years ago

Hello all,

Sorry to ask this, but how do you put more than one IP in the urls? I have tested " " and "," with no success, and the doc does not give an example of this.

Thanks

danielnelson commented 4 years ago

Here is an example. The configuration is written in TOML, so you may want to reference https://github.com/toml-lang/toml#user-content-array for more details on the allowed syntax.

[[inputs.ping]]
   urls = [
       "debian-stretch-docker-1.virt",
       "debian-stretch-docker-2.virt",
       "debian-stretch-docker-3.virt",
       "debian-stretch-docker-4.virt",
   ]

bougui commented 4 years ago

Thanks @danielnelson, that would be great to add to the doc so others like me don't have to ask.

Bye

hemna commented 4 years ago

> When using the native method you will need to add additional permissions to Telegraf:
>
> sudo setcap cap_net_raw=eip /path/to/telegraf

That fixed the error logs for me, thanks!

chuflo326 commented 1 year ago

Hello, I want to use [[inputs.ping]], but I want the urls to be a variable filled in from my earlier SNMP query, which currently retrieves them with this OID: oid = ".1.3.6.1.4.1.2011.6.128.1.1.2.49.1.2". Can this be done?