Gowee / traceroute-map-panel

πŸ“πŸ—ΊοΈ Visualize traceroute paths on a map in a Grafana panel
Apache License 2.0

Can't store data into influxdb #8

Open 4thdi opened 3 years ago

4thdi commented 3 years ago

The telegraf config:

[[inputs.exec]]
  commands = ["mtr -C -n 192.168.1.2"]
  timeout = "40s"
  data_format = "csv"
  csv_skip_rows = 1
  csv_column_names = ["", "", "status", "dest", "hop", "ip", "loss", "snt", "", "", "avg", "best", "worst", "stdev"]
  name_override = "mtr"
  csv_tag_columns = ["dest", "hop", "ip"]

[[outputs.influxdb]]
  urls = ["http://127.0.0.1:8086"]
  database = "my_telegraf3"
  timeout = "5s"

The telegraf config file passes the test, but no mtr information is displayed: telegraf --config /etc/telegraf/telegraf.conf --test

The influxdb database was created normally, but there is no mtr measurement:

Using database my_telegraf3
> show measurements
name: measurements
name
----
cpu
disk
diskio
kernel
mem
processes
swap
system

Please help me resolve this problem. What should I do? Many thanks.

Gowee commented 3 years ago

Hi, 4thdi.

The config looks good.

Was telegraf restarted after the config was updated? Does the telegraf log say anything relevant? And what is the output of running mtr -C -n 192.168.1.2 in a shell?

BTW, I would suggest adding an interval:

...
commands=["mtr -C -n 192.168.1.2"]
interval = "5m" # or 1m / 2m, whatever
timeout = "40s"
...

This should be unrelated to the problem, though.

4thdi commented 3 years ago

Thanks for your answer, but it does not seem to work. I edited the config as you suggested and restarted telegraf, but there is still nothing in the influxdb database.

Could the problem be related to this setting: "data_format"?

[[inputs.exec]] 
data_format = "csv"

The output of "mtr -C -n example.org":

[root@ZABBAX-NET-TEST ~]# mtr -C -n example.org
MTR.0.85;1616420843;OK;example.org;1;172.17.96.2;632
MTR.0.85;1616420843;OK;example.org;2;172.16.1.42;896
MTR.0.85;1616420843;OK;example.org;3;172.16.1.17;388
MTR.0.85;1616420843;OK;example.org;4;172.16.1.130;1072
MTR.0.85;1616420843;OK;example.org;5;172.16.1.121;843
MTR.0.85;1616420843;OK;example.org;6;58.211.217.182;3726
MTR.0.85;1616420843;OK;example.org;7;222.92.188.24;10308
MTR.0.85;1616420843;OK;example.org;8;???;0
MTR.0.85;1616420843;OK;example.org;9;58.208.77.9;4634
MTR.0.85;1616420843;OK;example.org;10;202.97.84.25;10063
MTR.0.85;1616420843;OK;example.org;11;202.97.57.229;55898
MTR.0.85;1616420843;OK;example.org;12;202.97.12.210;9046
MTR.0.85;1616420843;OK;example.org;13;202.97.52.250;177711
MTR.0.85;1616420843;OK;example.org;14;202.97.86.177;159500
MTR.0.85;1616420843;OK;example.org;15;129.250.9.73;144708
MTR.0.85;1616420843;OK;example.org;16;129.250.2.49;154472
MTR.0.85;1616420843;OK;example.org;17;129.250.3.27;144936
MTR.0.85;1616420843;OK;example.org;18;129.250.193.134;155188
MTR.0.85;1616420843;OK;example.org;19;152.195.84.133;156025
MTR.0.85;1616420843;OK;example.org;20;93.184.216.34;144944

The database query result:

[root@ZABBAX-NET-TEST ~]# influx
Connected to http://localhost:8086 version 1.8.4
InfluxDB shell version: 1.8.4
> show databases;
name: databases
name
----
telegraf
_internal
my_telegraf
my_telegraf1
my_telegraf2
my_telegraf3
my_telegraf5
my_telegraf6
> use my_telegraf6;
Using database my_telegraf6
> show measurements
> 
> 
> show field keys
> show measurements
> show field keys

The service status:

[root@ZABBAX-NET-TEST ~]# systemctl status telegraf
● telegraf.service - The plugin-driven server agent for reporting metrics into InfluxDB
   Loaded: loaded (/usr/lib/systemd/system/telegraf.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2021-03-22 22:12:00 CST; 2min 26s ago
     Docs: https://github.com/influxdata/telegraf
 Main PID: 18067 (telegraf)
   CGroup: /system.slice/telegraf.service
           └─18067 /usr/bin/telegraf -config /etc/telegraf/telegraf.conf -config-directory /etc/telegraf/telegraf.d

Mar 22 22:12:00 ZABBAX-NET-TEST systemd[1]: Started The plugin-driven server agent for reporting metrics into InfluxDB.
Mar 22 22:12:00 ZABBAX-NET-TEST systemd[1]: Starting The plugin-driven server agent for reporting metrics into InfluxDB...
Mar 22 22:12:00 ZABBAX-NET-TEST telegraf[18067]: 2021-03-22T14:12:00Z I! Starting Telegraf 1.18.0
Mar 22 22:12:00 ZABBAX-NET-TEST telegraf[18067]: 2021-03-22T14:12:00Z I! Loaded inputs: exec
Mar 22 22:12:00 ZABBAX-NET-TEST telegraf[18067]: 2021-03-22T14:12:00Z I! Loaded aggregators:
Mar 22 22:12:00 ZABBAX-NET-TEST telegraf[18067]: 2021-03-22T14:12:00Z I! Loaded processors:
Mar 22 22:12:00 ZABBAX-NET-TEST telegraf[18067]: 2021-03-22T14:12:00Z I! Loaded outputs: influxdb
Mar 22 22:12:00 ZABBAX-NET-TEST telegraf[18067]: 2021-03-22T14:12:00Z I! Tags enabled: host=ZABBAX-NET-TEST
Mar 22 22:12:00 ZABBAX-NET-TEST telegraf[18067]: 2021-03-22T14:12:00Z I! [agent] Config: Interval:10s, Quiet:false, Hostname:"ZABBAX-NET-TEST", Flush Interval:10s
[root@ZABBAX-NET-TEST ~]# systemctl status influxdb
● influxdb.service - InfluxDB is an open-source, distributed, time series database
   Loaded: loaded (/usr/lib/systemd/system/influxdb.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2021-03-22 21:48:04 CST; 26min ago
     Docs: https://docs.influxdata.com/influxdb/
 Main PID: 14146 (influxd)
   CGroup: /system.slice/influxdb.service
           └─14146 /usr/bin/influxd -config /etc/influxdb/influxdb.conf

Mar 22 22:13:45 ZABBAX-NET-TEST influxd[14146]: [httpd] 127.0.0.1 - - [22/Mar/2021:22:13:45 +0800] "POST /query?chunked=true&db=&epoch=ns&q=show+databases%3B HTTP/1.1" 200 142 "-" "In...56b94472 764
Mar 22 22:13:51 ZABBAX-NET-TEST influxd[14146]: ts=2021-03-22T14:13:51.840770Z lvl=info msg="Executing query" log_id=0T2YvqJl000 service=query query="SHOW DATABASES"
Mar 22 22:13:51 ZABBAX-NET-TEST influxd[14146]: [httpd] 127.0.0.1 - - [22/Mar/2021:22:13:51 +0800] "POST /query?db=&epoch=ns&q=SHOW+DATABASES HTTP/1.1" 200 142 "-" "InfluxDBShell/1.8....56b94472 760
Mar 22 22:14:00 ZABBAX-NET-TEST influxd[14146]: [httpd] 127.0.0.1 - - [22/Mar/2021:22:14:00 +0800] "POST /query?chunked=true&db=my_telegraf6&epoch=ns&q=shoe+measurements HTTP/1.1" 400...56b94472 242
Mar 22 22:14:02 ZABBAX-NET-TEST influxd[14146]: [httpd] 127.0.0.1 - - [22/Mar/2021:22:14:02 +0800] "POST /query?chunked=true&db=my_telegraf6&epoch=ns&q=shoe+measurement HTTP/1.1" 400 ...56b94472 188
Mar 22 22:14:08 ZABBAX-NET-TEST influxd[14146]: [httpd] 127.0.0.1 - - [22/Mar/2021:22:14:08 +0800] "POST /query?chunked=true&db=my_telegraf6&epoch=ns&q=show+measurement HTTP/1.1" 400 ...56b94472 218
Mar 22 22:14:11 ZABBAX-NET-TEST influxd[14146]: ts=2021-03-22T14:14:11.097313Z lvl=info msg="Executing query" log_id=0T2YvqJl000 service=query query="SHOW MEASUREMENTS ON my_telegraf6"
Mar 22 22:14:11 ZABBAX-NET-TEST influxd[14146]: [httpd] 127.0.0.1 - - [22/Mar/2021:22:14:11 +0800] "POST /query?chunked=true&db=my_telegraf6&epoch=ns&q=show+measurements HTTP/1.1" 200...56b94472 565
Mar 22 22:14:18 ZABBAX-NET-TEST influxd[14146]: ts=2021-03-22T14:14:18.150145Z lvl=info msg="Executing query" log_id=0T2YvqJl000 service=query query="SELECT fieldKey, fieldType FROM m...._fieldKeys"
Mar 22 22:14:18 ZABBAX-NET-TEST influxd[14146]: [httpd] 127.0.0.1 - - [22/Mar/2021:22:14:18 +0800] "POST /query?chunked=true&db=my_telegraf6&epoch=ns&q=show+field+keys HTTP/1.1" 200 5...6b94472 1168
Hint: Some lines were ellipsized, use -l to show in full.

Gowee commented 3 years ago

Could the problem be related to this setting: "data_format"?

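I think that is the problem. MTR.0.85 appears to indicate mtr v0.85, released in 2013, while the latest is v0.94, released in 2020. The output format has evidently changed since then: your output separates fields with semicolons and has no header row, whereas mine is comma-separated and starts with a header.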

My output:

> mtr -C -n example.org
Mtr_Version,Start_Time,Status,Host,Hop,Ip,Loss%,Snt, ,Last,Avg,Best,Wrst,StDev,
MTR.0.94,1616423213,OK,example.org,1,???,100.00,10,10,0.00,0.00,0.00,0.00,0.00
MTR.0.94,1616423213,OK,example.org,2,<REDACTED>,0.00,10,0,0.73,0.86,0.71,1.26,0.17
MTR.0.94,1616423213,OK,example.org,3,<REDACTED>,0.00,10,0,3.52,3.63,1.01,17.61,5.09
MTR.0.94,1616423213,OK,example.org,4,<REDACTED>,0.00,10,0,3.21,5.86,1.07,34.33,10.56
MTR.0.94,1616423213,OK,example.org,5,<REDACTED>,0.00,10,0,1.47,1.83,1.02,6.65,1.71
MTR.0.94,1616423213,OK,example.org,6,2001:504:13::210:112,0.00,10,0,1.19,1.66,1.17,4.55,1.03
MTR.0.94,1616423213,OK,example.org,7,2606:2800:4062:f::a,0.00,10,0,0.89,0.70,0.60,0.95,0.12
MTR.0.94,1616423213,OK,example.org,8,2606:2800:220:1:248:1893:25c8:1946,0.00,10,0,0.75,0.65,0.37,0.96,0.17

Try:

The columns (csv_column_names) also differ somewhat between the two versions. But I suppose that is fine because the important fields are the same.

I don't have mtr v0.85 to test with. Please let me know if it still does not work.
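To make the format difference concrete, here is a minimal Python sketch (not part of the panel or of Telegraf) that guesses the delimiter from the first line of `mtr -C` output and parses both samples from this thread. The function names and the column labels `unnamed` and `raw_value` are made up for illustration; the thread does not say what the last field of the v0.85 output means, so it is left unlabeled.

```python
import csv
from typing import Dict, List

# Column order taken from the mtr 0.94 header shown in this thread.
# The ninth header cell is blank in that output, so it is called "unnamed" here.
MTR_094_COLUMNS = [
    "version", "start_time", "status", "host", "hop", "ip",
    "loss", "snt", "unnamed", "last", "avg", "best", "worst", "stdev",
]

# The 0.85 sample has only seven fields; the meaning of the seventh
# is not stated in the thread, hence the neutral name (assumption).
MTR_085_COLUMNS = MTR_094_COLUMNS[:6] + ["raw_value"]


def detect_delimiter(first_line: str) -> str:
    """Guess the field separator from the first line of `mtr -C` output."""
    return ";" if first_line.count(";") > first_line.count(",") else ","


def parse_mtr_csv(text: str) -> List[Dict[str, str]]:
    """Parse `mtr -C` output in either sample format into a list of dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    delimiter = detect_delimiter(lines[0])
    columns = MTR_085_COLUMNS if delimiter == ";" else MTR_094_COLUMNS
    rows = []
    for record in csv.reader(lines, delimiter=delimiter):
        if record[0] == "Mtr_Version":  # header row, seen only in the 0.94 sample
            continue
        rows.append(dict(zip(columns, record)))
    return rows
```

A comma-only parser fed the v0.85 lines would see a single field per row, which is consistent with nothing useful reaching the database.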

4thdi commented 3 years ago

Thanks a lot! As you suggested, it's working now. Another question: how can I configure the panel's GeoIP options when I use it to monitor LAN IP addresses? Both the source and destination IP addresses are LAN IPs, which means I must define the GPS locations manually.

Solution: install MTR 0.94 manually (not via yum), then use it in inputs.exec:

[[inputs.exec]]
  commands=["/tmp/mtr-0.94/mtr -C -n example.org"]
  interval = "60s"
  timeout = "120s"
  data_format = "csv"
  csv_skip_rows = 1
#  csv_delimiter = ";"
  csv_column_names=[ "", "", "status","dest","hop","ip","loss","snt","", "","avg","best","worst","stdev"]
  name_override = "mtr"
  csv_tag_columns = ["dest", "hop", "ip"]

Modify the outputs.influxdb timeout value:

[[outputs.influxdb]]
timeout = "120s"

Gowee commented 3 years ago

I am glad to hear that.

For now, there is basically no support for specifying geolocation data manually. I would like to learn about the actual scenario you are facing. Could you please elaborate on your network architecture?

For example, are all the hops (i.e. machines) with LAN IPs geographically dispersed over different places and connected via some tunnels (e.g. VPN)? Is there a known Internet IP address for each hop (even though only LAN IP is available in traceroute)?

BTW, there is no need to increase the timeout of outputs.influxdb that much unless you observe errors when flushing data, as it is unrelated to exec. A timeout that is too large may also prevent Telegraf from reporting db errors in time. Besides, in inputs.exec, I suppose timeout should generally be smaller than interval (not sure). Anyway, you do not have to change those until you see actual errors.
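As a rough sanity check for the "timeout smaller than interval" guess above, here is a small Python sketch. It is only an illustration: the helper names are hypothetical, and the duration parsing covers just the simple forms used in this thread ("40s", "5m", "120s") plus combinations such as "1m30s", not everything Telegraf may accept.

```python
import re

# Seconds per unit for the duration suffixes handled by this sketch.
_UNITS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}
# "ms" is listed first so that "10ms" is not read as "10m" followed by "s".
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m|h)")


def duration_seconds(text: str) -> float:
    """Convert a duration string such as "40s", "5m" or "1m30s" to seconds."""
    position = 0
    total = 0.0
    for match in _PART.finditer(text):
        if match.start() != position:  # unparsed characters between parts
            raise ValueError(f"cannot parse duration: {text!r}")
        total += float(match.group(1)) * _UNITS[match.group(2)]
        position = match.end()
    if position == 0 or position != len(text):
        raise ValueError(f"cannot parse duration: {text!r}")
    return total


def exec_timeout_fits(interval: str, timeout: str) -> bool:
    """True when the exec timeout is strictly shorter than the collection interval."""
    return duration_seconds(timeout) < duration_seconds(interval)
```

With the values posted above (interval "60s", timeout "120s") the check returns False, while interval "5m" with timeout "40s" returns True.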

4thdi commented 3 years ago

Thanks for your prompt reply.

There are three sites in my network topology, each in a different city, and all of the IP addresses are LAN addresses. I need to monitor the network status between every pair of sites. The sites are connected via VPN tunnels, and I can give the geographic coordinates of each city.

The requirements are:

  1. Mark the city of each site on the map;
  2. Show the network status through the line status between each pair of sites;

The mtr result is as follows:

[root@ZABBAX-NET-TEST mtr-0.94]# /tmp/mtr-0.94/mtr -C -n 172.19.1.10
Mtr_Version,Start_Time,Status,Host,Hop,Ip,Loss%,Snt, ,Last,Avg,Best,Wrst,StDev,
MTR.0.94,1616463849,OK,172.19.1.10,1,172.17.55.2,0.00,10,0,0.62,0.75,0.55,1.53,0.29
MTR.0.94,1616463849,OK,172.19.1.10,2,172.16.1.39,0.00,10,0,0.84,0.99,0.73,2.46,0.54
MTR.0.94,1616463849,OK,172.19.1.10,3,172.16.1.233,0.00,10,0,1.00,1.00,0.87,1.18,0.09
MTR.0.94,1616463849,OK,172.19.1.10,4,172.16.7.26,0.00,10,0,18.23,10.57,9.10,18.23,2.74
MTR.0.94,1616463849,OK,172.19.1.10,5,172.19.1.10,0.00,10,0,13.71,14.87,13.12,19.61,1.84

Hopefully this feature will be included in a future release, although these requirements may be outside the plug-in's intended design.

Gowee commented 3 years ago

Thanks for your feedback! The scenario sounds pretty reasonable. But there are some problems with the current implementation.

give the geographic coordinates of each city.

Mark the city of each site on the map;

Setting up a custom GeoIP API could be the most feasible way.

For example, on a serverless platform like Cloudflare Workers, a script can be as easy as: https://github.com/Gowee/traceroute-map-panel/blob/1d7b8d00d434f18fec1ea49526118cb35122a99a/ipip-cfworker.js. It gives full flexibility over geolocation data definition and takes no maintenance effort. If I have time, I would like to complement the example script later.

But because mtr output does not include the source host IP, the panel currently resolves each hostname (host in the data entry) to an IP via a hardcoded DNS-over-HTTPS API and uses that IP as the initial hop. That is, it is not possible to control this hostname resolution for now. This problem would be easy to fix, though. It will not be an issue if all of your hostnames are just IPs.
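To illustrate the lookup such a custom GeoIP API would have to perform for LAN addresses, here is a minimal Python sketch. Everything in the site table is an assumption: the subnets are only guessed from addresses that appear in this thread, and the site names and coordinates are placeholders. The shape of the returned dict is likewise made up and is not the response format the panel expects; wiring this into an HTTP endpoint is left out.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical site table: replace the subnets, names and coordinates
# with the real values for each site. None of these come from the panel.
SITES = [
    (ipaddress.ip_network("172.17.0.0/16"), {"city": "Site A", "lat": 0.0, "lon": 0.0}),
    (ipaddress.ip_network("172.16.0.0/16"), {"city": "Site B", "lat": 0.0, "lon": 0.0}),
    (ipaddress.ip_network("172.19.0.0/16"), {"city": "Site C", "lat": 0.0, "lon": 0.0}),
]


def locate(ip: str) -> Optional[Dict[str, object]]:
    """Return the site entry whose subnet contains `ip`, or None if unknown."""
    try:
        address = ipaddress.ip_address(ip)
    except ValueError:
        return None  # e.g. the "???" that mtr prints for a silent hop
    for network, site in SITES:
        if address in network:
            return site
    return None
```

A hop such as 172.19.1.10 would then map to the placeholder "Site C", while "???" or any public address yields None and could be passed on to a regular GeoIP service.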

Show the network status through the line status between each pair of sites

Actually, line status does NOT indicate network status, at least for now. It is just a line connecting hops on a traceroute path, with fixed speed (assuming animation is turned on) and style. I was considering making line animations vary to reflect the network status (i.e. rtt and loss). But there is no concrete plan so far.

The hurdle is about both design (changing animation speed and dash length according to RTT and loss?) and implementation difficulty (say, it is somewhat infeasible to set different speeds for lines between every two hops while keeping animations smooth).

cddisk2000 commented 3 years ago

Using CentOS 7 always fails. The main points are as follows:

  1. Use CentOS 8 Stream or CentOS 8.
  2. mtr must be 0.94 or above and installed offline; it cannot be installed with yum.

I hope this can help everyone.

Gowee commented 3 years ago

@cddisk2000

Hi.

I cannot reproduce the problem on a newly installed CentOS Linux release 8.4.2105. What are the versions of telegraf and influxdb? Also, I am not sure what you mean by "Always Fail". Is telegraf failing to start? Or is no data being written to influxdb?

Would you mind opening an issue where more information can be attached?

cddisk2000 commented 3 years ago

@Gowee This is my system environment, for your reference:

  1. offline install mtr-0.94-2.hs.el8.x86_64.rpm
  2. Telegraf version 1.19.0
  3. InfluxDB version: 1.8.6
  4. CentOS Stream Release 8
  5. Please confirm that the following command produces output in this format: mtr -C -n www.google.com

    Mtr_Version,Start_Time,Status,Host,Hop,Ip,Loss%,Snt, ,Last,Avg,Best,Wrst,StDev,
    MTR.0.94,1624880440,OK,www.google.com,1,192.168.8.254,0.00,10,0,0.24,0.29,0.19,0.41,0.08
    MTR.0.94,1624880440,OK,www.google.com,2,???,100.00,10,10,0.00,0.00,0.00,0.00,0.00

    If it doesn't work, you can take a look at my blog. I hope you can succeed: https://my-fish-it.blogspot.com/2021/06/ss-grafana-traceroute-map-panel.html