Open 4thdi opened 3 years ago
Hi, 4thdi.
The config looks good.
Was telegraf restarted after the config was updated? Does the telegraf log show anything related? And what is the output of running mtr -C -n 192.168.1.2 in a shell?
BTW, I would suggest adding an interval:
...
commands=["mtr -C -n 192.168.1.2"]
interval = "5m" # or 1m / 2m, whatever
timeout = "40s"
...
This should be irrelevant to the problem, though.
Thanks for your answer, but it still does not seem to work. I edited the config as you suggested and restarted telegraf, but there is nothing in the influxdb database.
Could the problem be related to this setting: "data_format"?
[[inputs.exec]]
data_format = "csv"
The output of "mtr -C -n example.org":
[root@ZABBAX-NET-TEST ~]# mtr -C -n example.org
MTR.0.85;1616420843;OK;example.org;1;172.17.96.2;632
MTR.0.85;1616420843;OK;example.org;2;172.16.1.42;896
MTR.0.85;1616420843;OK;example.org;3;172.16.1.17;388
MTR.0.85;1616420843;OK;example.org;4;172.16.1.130;1072
MTR.0.85;1616420843;OK;example.org;5;172.16.1.121;843
MTR.0.85;1616420843;OK;example.org;6;58.211.217.182;3726
MTR.0.85;1616420843;OK;example.org;7;222.92.188.24;10308
MTR.0.85;1616420843;OK;example.org;8;???;0
MTR.0.85;1616420843;OK;example.org;9;58.208.77.9;4634
MTR.0.85;1616420843;OK;example.org;10;202.97.84.25;10063
MTR.0.85;1616420843;OK;example.org;11;202.97.57.229;55898
MTR.0.85;1616420843;OK;example.org;12;202.97.12.210;9046
MTR.0.85;1616420843;OK;example.org;13;202.97.52.250;177711
MTR.0.85;1616420843;OK;example.org;14;202.97.86.177;159500
MTR.0.85;1616420843;OK;example.org;15;129.250.9.73;144708
MTR.0.85;1616420843;OK;example.org;16;129.250.2.49;154472
MTR.0.85;1616420843;OK;example.org;17;129.250.3.27;144936
MTR.0.85;1616420843;OK;example.org;18;129.250.193.134;155188
MTR.0.85;1616420843;OK;example.org;19;152.195.84.133;156025
MTR.0.85;1616420843;OK;example.org;20;93.184.216.34;144944
The database query result:
[root@ZABBAX-NET-TEST ~]# influx
Connected to http://localhost:8086 version 1.8.4
InfluxDB shell version: 1.8.4
> show databases;
name: databases
name
----
telegraf
_internal
my_telegraf
my_telegraf1
my_telegraf2
my_telegraf3
my_telegraf5
my_telegraf6
> use my_telegraf6;
Using database my_telegraf6
> show measurements
>
>
> show field keys
> show measurements
> show field keys
The service status:
[root@ZABBAX-NET-TEST ~]# systemctl status telegraf
● telegraf.service - The plugin-driven server agent for reporting metrics into InfluxDB
Loaded: loaded (/usr/lib/systemd/system/telegraf.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2021-03-22 22:12:00 CST; 2min 26s ago
Docs: https://github.com/influxdata/telegraf
Main PID: 18067 (telegraf)
CGroup: /system.slice/telegraf.service
└─18067 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d
Mar 22 22:12:00 ZABBAX-NET-TEST systemd[1]: Started The plugin-driven server agent for reporting metrics into InfluxDB.
Mar 22 22:12:00 ZABBAX-NET-TEST systemd[1]: Starting The plugin-driven server agent for reporting metrics into InfluxDB...
Mar 22 22:12:00 ZABBAX-NET-TEST telegraf[18067]: 2021-03-22T14:12:00Z I! Starting Telegraf 1.18.0
Mar 22 22:12:00 ZABBAX-NET-TEST telegraf[18067]: 2021-03-22T14:12:00Z I! Loaded inputs: exec
Mar 22 22:12:00 ZABBAX-NET-TEST telegraf[18067]: 2021-03-22T14:12:00Z I! Loaded aggregators:
Mar 22 22:12:00 ZABBAX-NET-TEST telegraf[18067]: 2021-03-22T14:12:00Z I! Loaded processors:
Mar 22 22:12:00 ZABBAX-NET-TEST telegraf[18067]: 2021-03-22T14:12:00Z I! Loaded outputs: influxdb
Mar 22 22:12:00 ZABBAX-NET-TEST telegraf[18067]: 2021-03-22T14:12:00Z I! Tags enabled: host=ZABBAX-NET-TEST
Mar 22 22:12:00 ZABBAX-NET-TEST telegraf[18067]: 2021-03-22T14:12:00Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"ZABBAX-NET-TEST", Flush Interval:10s
[root@ZABBAX-NET-TEST ~]# systemctl status influxdb
● influxdb.service - InfluxDB is an open-source, distributed, time series database
Loaded: loaded (/usr/lib/systemd/system/influxdb.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2021-03-22 21:48:04 CST; 26min ago
Docs: https://docs.influxdata.com/influxdb/
Main PID: 14146 (influxd)
CGroup: /system.slice/influxdb.service
└─14146 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
Mar 22 22:13:45 ZABBAX-NET-TEST influxd[14146]: [httpd] 127.0.0.1 - - [22/Mar/2021:22:13:45 +0800] "POST /query?chunked=true&db=&epoch=ns&q=show+databases%3B HTTP/1.1" 200 142 "-" "In...56b94472 764
Mar 22 22:13:51 ZABBAX-NET-TEST influxd[14146]: ts=2021-03-22T14:13:51.840770Z lvl=info msg="Executing query" log_id=0T2YvqJl000 service=query query="SHOW DATABASES"
Mar 22 22:13:51 ZABBAX-NET-TEST influxd[14146]: [httpd] 127.0.0.1 - - [22/Mar/2021:22:13:51 +0800] "POST /query?db=&epoch=ns&q=SHOW+DATABASES HTTP/1.1" 200 142 "-" "InfluxDBShell/1.8....56b94472 760
Mar 22 22:14:00 ZABBAX-NET-TEST influxd[14146]: [httpd] 127.0.0.1 - - [22/Mar/2021:22:14:00 +0800] "POST /query?chunked=true&db=my_telegraf6&epoch=ns&q=shoe+measurements HTTP/1.1" 400...56b94472 242
Mar 22 22:14:02 ZABBAX-NET-TEST influxd[14146]: [httpd] 127.0.0.1 - - [22/Mar/2021:22:14:02 +0800] "POST /query?chunked=true&db=my_telegraf6&epoch=ns&q=shoe+measurement HTTP/1.1" 400 ...56b94472 188
Mar 22 22:14:08 ZABBAX-NET-TEST influxd[14146]: [httpd] 127.0.0.1 - - [22/Mar/2021:22:14:08 +0800] "POST /query?chunked=true&db=my_telegraf6&epoch=ns&q=show+measurement HTTP/1.1" 400 ...56b94472 218
Mar 22 22:14:11 ZABBAX-NET-TEST influxd[14146]: ts=2021-03-22T14:14:11.097313Z lvl=info msg="Executing query" log_id=0T2YvqJl000 service=query query="SHOW MEASUREMENTS ON my_telegraf6"
Mar 22 22:14:11 ZABBAX-NET-TEST influxd[14146]: [httpd] 127.0.0.1 - - [22/Mar/2021:22:14:11 +0800] "POST /query?chunked=true&db=my_telegraf6&epoch=ns&q=show+measurements HTTP/1.1" 200...56b94472 565
Mar 22 22:14:18 ZABBAX-NET-TEST influxd[14146]: ts=2021-03-22T14:14:18.150145Z lvl=info msg="Executing query" log_id=0T2YvqJl000 service=query query="SELECT fieldKey, fieldType FROM m...._fieldKeys"
Mar 22 22:14:18 ZABBAX-NET-TEST influxd[14146]: [httpd] 127.0.0.1 - - [22/Mar/2021:22:14:18 +0800] "POST /query?chunked=true&db=my_telegraf6&epoch=ns&q=show+field+keys HTTP/1.1" 200 5...6b94472 1168
Hint: Some lines were ellipsized, use -l to show in full.
Could the problem be related to this setting: "data_format"?
I think that is the problem. MTR.0.85 appears to indicate mtr v0.85, released in 2013, while the latest is v0.94, released in 2020. The output format turns out to have changed since then.
My output:
> mtr -C -n example.org
Mtr_Version,Start_Time,Status,Host,Hop,Ip,Loss%,Snt, ,Last,Avg,Best,Wrst,StDev,
MTR.0.94,1616423213,OK,example.org,1,???,100.00,10,10,0.00,0.00,0.00,0.00,0.00
MTR.0.94,1616423213,OK,example.org,2,<REDACTED>,0.00,10,0,0.73,0.86,0.71,1.26,0.17
MTR.0.94,1616423213,OK,example.org,3,<REDACTED>,0.00,10,0,3.52,3.63,1.01,17.61,5.09
MTR.0.94,1616423213,OK,example.org,4,<REDACTED>,0.00,10,0,3.21,5.86,1.07,34.33,10.56
MTR.0.94,1616423213,OK,example.org,5,<REDACTED>,0.00,10,0,1.47,1.83,1.02,6.65,1.71
MTR.0.94,1616423213,OK,example.org,6,2001:504:13::210:112,0.00,10,0,1.19,1.66,1.17,4.55,1.03
MTR.0.94,1616423213,OK,example.org,7,2606:2800:4062:f::a,0.00,10,0,0.89,0.70,0.60,0.95,0.12
MTR.0.94,1616423213,OK,example.org,8,2606:2800:220:1:248:1893:25c8:1946,0.00,10,0,0.75,0.65,0.37,0.96,0.17
Try removing this line (or change it to 0), because your v0.85 output has no header row:
csv_skip_rows = 1
and adding this one, because v0.85 separates fields with semicolons:
csv_delimiter = ";"
Their columns (csv_column_names) also differ somewhat. But I suppose it is fine because the several important fields are the same.
I don't have mtr v0.85 to test with. Please let me know if it still does not work.
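To make the difference between the two output formats concrete, here is a minimal Python sketch that reads either variant. It is only an illustration and is not how Telegraf parses the data: the function name parse_mtr_csv and the delimiter guess are assumptions made for this example, and it only extracts the fields that sit at the same position in both outputs shown in this thread (host, hop, ip).

```python
import csv
import io


def parse_mtr_csv(text):
    """Parse `mtr -C` output in either the ';' (v0.85) or ',' (v0.94) layout.

    Hypothetical helper for illustration; returns a list of dicts with the
    fields that share a position in both layouts.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    # Guess the delimiter from the first line: v0.85 uses ';', v0.94 uses ','.
    first = lines[0]
    delimiter = ";" if first.count(";") > first.count(",") else ","
    rows = []
    for fields in csv.reader(io.StringIO("\n".join(lines)), delimiter=delimiter):
        # Data rows start with "MTR."; the v0.94 header starts with "Mtr_Version".
        if not fields[0].startswith("MTR."):
            continue
        rows.append({"dest": fields[3], "hop": int(fields[4]), "ip": fields[5]})
    return rows


if __name__ == "__main__":
    sample = "MTR.0.85;1616420843;OK;example.org;1;172.17.96.2;632\n"
    print(parse_mtr_csv(sample))
```

With Telegraf itself, the equivalent adjustment is the pair of settings discussed above (no skipped header row and a ";" delimiter for v0.85).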
Thanks a lot! With your suggestion it is working now. Another question: how can I configure the panel options for GeoIP when I use it to monitor LAN IP addresses? The source and destination IP addresses are both LAN IPs, which means I must define the GPS locations manually.
Solution: install MTR 0.94 manually (not via yum), then use it in inputs.exec:
[[inputs.exec]]
commands=["/tmp/mtr-0.94/mtr -C -n example.org"]
interval = "60s"
timeout = "120s"
data_format = "csv"
csv_skip_rows = 1
# csv_delimiter = ";"
csv_column_names=[ "", "", "status","dest","hop","ip","loss","snt","", "","avg","best","worst","stdev"]
name_override = "mtr"
csv_tag_columns = ["dest", "hop", "ip"]
Modify the outputs.influxdb timeout value:
[[outputs.influxdb]]
timeout = "120s"
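To see which mtr columns the csv_column_names list above keeps, the following Python sketch pairs the names with one v0.94 data row from this thread. It assumes that columns given an empty name are simply dropped, which is my understanding of the Telegraf CSV parser; the helper name map_columns is made up for this example.

```python
def map_columns(column_names, row):
    """Pair each CSV value with its configured name, dropping unnamed columns.

    Hypothetical helper: mirrors the assumed behaviour that columns with an
    empty name are ignored.
    """
    return {name: value for name, value in zip(column_names, row) if name}


COLUMN_NAMES = ["", "", "status", "dest", "hop", "ip", "loss", "snt",
                "", "", "avg", "best", "worst", "stdev"]

if __name__ == "__main__":
    row = ("MTR.0.94,1616463849,OK,172.19.1.10,1,172.17.55.2,"
           "0.00,10,0,0.62,0.75,0.55,1.53,0.29").split(",")
    for name, value in map_columns(COLUMN_NAMES, row).items():
        print(f"{name} = {value}")
```

Under that assumption, the version, start time, the unnamed column after Snt, and Last are skipped, while dest, hop and ip become tags via csv_tag_columns.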
I am glad to hear that.
There is basically no support for specifying geolocation data manually for now. I would like to learn more about the actual scenario you are facing. Could you please elaborate on your network architecture?
For example, are all the hops (i.e. machines) with LAN IPs geographically dispersed over different places and connected via some tunnels (e.g. VPN)? Is there a known Internet IP address for each hop (even though only LAN IP is available in traceroute)?
BTW, it is not necessary to increase the timeout of outputs.influxdb much unless you observe errors when flushing data, as it is unrelated to exec. A timeout that is too large may also prevent Telegraf from reporting db errors in time. Besides, in inputs.exec, I suppose timeout should generally be smaller than interval (not sure). Anyway, you do not have to change those until you see actual errors.
Thanks for your prompt reply.
There are three sites in my network topology, each in a different city, and all IP addresses are LAN addresses. I need to monitor the network status between every two sites. The sites are connected by VPN tunnels, and I can give the geographic coordinates of each city.
Now the requirements are:
Mark the city of each site on the map;
Show the network status through line status between each sites
The MTR result is as follows:
[root@ZABBAX-NET-TEST mtr-0.94]# /tmp/mtr-0.94/mtr -C -n 172.19.1.10
Mtr_Version,Start_Time,Status,Host,Hop,Ip,Loss%,Snt, ,Last,Avg,Best,Wrst,StDev,
MTR.0.94,1616463849,OK,172.19.1.10,1,172.17.55.2,0.00,10,0,0.62,0.75,0.55,1.53,0.29
MTR.0.94,1616463849,OK,172.19.1.10,2,172.16.1.39,0.00,10,0,0.84,0.99,0.73,2.46,0.54
MTR.0.94,1616463849,OK,172.19.1.10,3,172.16.1.233,0.00,10,0,1.00,1.00,0.87,1.18,0.09
MTR.0.94,1616463849,OK,172.19.1.10,4,172.16.7.26,0.00,10,0,18.23,10.57,9.10,18.23,2.74
MTR.0.94,1616463849,OK,172.19.1.10,5,172.19.1.10,0.00,10,0,13.71,14.87,13.12,19.61,1.84
Hopefully this feature will be included in a future release, though perhaps these requirements are outside the plug-in's design.
Thanks for your feedback! The scenario sounds pretty reasonable. But there are some problems with the current implementation.
give the geographic coordinates of each city.
Mark the city of each site on the map;
Setting up a custom GeoIP API could be the most feasible way.
For example, on a serverless platform like Cloudflare Workers, a script can be as easy as: https://github.com/Gowee/traceroute-map-panel/blob/1d7b8d00d434f18fec1ea49526118cb35122a99a/ipip-cfworker.js. It gives full flexibility over geolocation data definition and takes no maintenance effort. If I have time, I would like to complement the example script later.
But due to the lack of a source host IP in mtr output, the panel now resolves hostnames (host in data entry) to be the IP of an initial hop via a hardcoded DNS-over-HTTPS API. (This won't cause trouble if all of your hostnames are just IPs.) That is, it is not possible to control such hostname resolution for now. This problem would be easy to fix, though.
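As a rough illustration of the geolocation data a custom API could serve for LAN addresses, here is a minimal Python sketch of a subnet-to-site lookup. Everything in it is an assumption for illustration: the subnets are guessed from the addresses in the mtr output above, the site names and coordinates are placeholders, and the response shape the panel actually expects from a custom API is not covered here (see the linked example script for that).

```python
import ipaddress

# Hypothetical site table. Subnets, site names and coordinates are
# placeholders, not real data from this thread.
SITES = [
    (ipaddress.ip_network("172.17.0.0/16"), {"city": "site-a", "lat": 10.0, "lon": 20.0}),
    (ipaddress.ip_network("172.16.0.0/16"), {"city": "site-b", "lat": 11.0, "lon": 21.0}),
    (ipaddress.ip_network("172.19.0.0/16"), {"city": "site-c", "lat": 12.0, "lon": 22.0}),
]


def locate(ip):
    """Return the site entry whose subnet contains `ip`, or None if unknown."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        # mtr prints "???" for hops that did not answer.
        return None
    for network, geo in SITES:
        if addr in network:
            return geo
    return None


if __name__ == "__main__":
    for hop_ip in ["172.17.55.2", "172.16.7.26", "172.19.1.10", "???"]:
        print(hop_ip, locate(hop_ip))
```

A table like this keeps the site definitions in one place, so adding a fourth site would only mean adding one more entry.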
Show the network status through line status between each sites
Actually, line status does NOT indicate network status, at least for now. It is just a line connecting hops on a traceroute path, with fixed speed (assuming animation is turned on) and style. I was considering making line animations vary to reflect the network status (i.e. rtt and loss). But there is no concrete plan so far.
The hurdle is about both design (changing animation speed and dash length according to RTT and loss?) and implementation difficulty (say, it is somewhat infeasible to set different speeds for lines between every two hops while keeping animations smooth).
Using CentOS 7 always fails. The main points are as follows:
@cddisk2000
Hi.
I cannot reproduce the problem on a newly installed CentOS Linux release 8.4.2105. What are the versions of telegraf and influxdb? Also, I am not sure what you mean by "Always Fail". Is telegraf failing to start? Or is no data written to influxdb?
Would you mind opening an issue where more information can be attached?
@Gowee This is my system environment for your reference.
If it doesn't work, you can browse my blog. I hope you can succeed: https://my-fish-it.blogspot.com/2021/06/ss-grafana-traceroute-map-panel.html
The telegraf config:
[[inputs.exec]]
commands=["mtr -C -n 192.168.1.2"]
timeout = "40s"
data_format = "csv"
csv_skip_rows = 1
csv_column_names=[ "", "", "status","dest","hop","ip","loss","snt","", "","avg","best","worst","stdev"]
name_override = "mtr"
csv_tag_columns = ["dest", "hop", "ip"]
[[outputs.influxdb]]
urls = ["http://127.0.0.1:8086"]
database = "my_telegraf3"
timeout = "5s"
The telegraf config file test passes, but no mtr information is displayed: telegraf --config /etc/telegraf/telegraf.conf --test
The influxdb database is created normally, but there is no mtr measurement: Using database my_telegraf3
Please help me resolve this problem. What should I do? Many thanks.