Open project-poodle opened 7 years ago
Hi @zach929, this field should still be there. Rather than adding 'loss: false' to every good ping, pingbeat just adds 'loss: true' to failed pings (along with the 'reason' field, which will eventually report what ICMP/network error was the cause). Are you no longer seeing loss where you previously saw it?
Hi @joshuar, thanks for the reply. The 'loss: true' event was not generated in 5.4 during my test. The following is the pingbeat.yml:
pingbeat:
  # Defines how often a ping is sent to a target
  period: "5s"
  # Whether to send pings over IPv4
  useipv4: true
  # Whether to send pings over IPv6
  useipv6: false
  # How long to wait for a target to respond to a ping request
  timeout: "10s"
  targets:
    - name: "100.100.100.100"
    - name: "8.8.8.8"
output:
  console:
    pretty: true
Following is the output of the program:
[es5]$ sudo /usr/bin/pingbeat -e -c pingbeat.yml -d publish
2017/04/30 23:36:02.094281 beat.go:285: INFO Home path: [/usr/bin] Config path: [/usr/bin] Data path: [/usr/bin/data] Logs path: [/usr/bin/logs]
2017/04/30 23:36:02.094362 beat.go:186: INFO Setup Beat: pingbeat; Version: 5.4.0
2017/04/30 23:36:02.094456 outputs.go:108: INFO Activated console as output plugin.
2017/04/30 23:36:02.094493 publish.go:238: DBG Create output worker
2017/04/30 23:36:02.094658 publish.go:280: DBG No output is defined to store the topology. The server fields might not be filled.
2017/04/30 23:36:02.094745 publish.go:295: INFO Publisher name: es5
2017/04/30 23:36:02.094920 metrics.go:23: INFO Metrics logging every 30s
2017/04/30 23:36:02.095120 async.go:63: INFO Flush Interval set to: 1s
2017/04/30 23:36:02.095153 async.go:64: INFO Max Bulk Size set to: 2048
2017/04/30 23:36:02.095175 async.go:72: DBG create bulk processing worker (interval=1s, bulk size=2048)
2017/04/30 23:36:02.095592 beat.go:221: INFO pingbeat start running.
2017/04/30 23:36:02.095616 pingbeat.go:71: INFO pingbeat is running! Hit CTRL-C to stop it.
2017/04/30 23:36:02.096336 pingbeat.go:97: INFO Using ip4:icmp connection
2017/04/30 23:36:07.098339 client.go:214: DBG Publish: {
"@timestamp": "2017-04-30T23:36:07.097Z",
"beat": {
"hostname": "es5",
"name": "es5",
"version": "5.4.0"
},
"rtt": 1.389245,
"target.addr": "8.8.8.8",
"target.name": "8.8.8.8",
"target.tags": null,
"type": "pingbeat"
}
2017/04/30 23:36:08.096155 output.go:109: DBG output worker: publish 1 events
{
"@timestamp": "2017-04-30T23:36:07.097Z",
"beat": {
"hostname": "es5",
"name": "es5",
"version": "5.4.0"
},
"rtt": 1.389245,
"target.addr": "8.8.8.8",
"target.name": "8.8.8.8",
"target.tags": null,
"type": "pingbeat"
}
2017/04/30 23:36:12.097673 client.go:214: DBG Publish: {
"@timestamp": "2017-04-30T23:36:12.097Z",
"beat": {
"hostname": "es5",
"name": "es5",
"version": "5.4.0"
},
"rtt": 1.210513,
"target.addr": "8.8.8.8",
"target.name": "8.8.8.8",
"target.tags": null,
"type": "pingbeat"
}
2017/04/30 23:36:13.095671 output.go:109: DBG output worker: publish 1 events
{
"@timestamp": "2017-04-30T23:36:12.097Z",
"beat": {
"hostname": "es5",
"name": "es5",
"version": "5.4.0"
},
"rtt": 1.210513,
"target.addr": "8.8.8.8",
"target.name": "8.8.8.8",
"target.tags": null,
"type": "pingbeat"
}
2017/04/30 23:36:17.098026 client.go:214: DBG Publish: {
"@timestamp": "2017-04-30T23:36:17.097Z",
"beat": {
"hostname": "es5",
"name": "es5",
"version": "5.4.0"
},
"rtt": 1.462431,
"target.addr": "8.8.8.8",
"target.name": "8.8.8.8",
"target.tags": null,
"type": "pingbeat"
}
100.100.100.100 is an obviously non-pingable address. In the output, only the 8.8.8.8 address generates a ping event. The 'loss: true' event was not generated for '100.100.100.100'.
Hi @zach929, okay, the loss processing is still there, but some refactoring of the code meant that some "loss" conditions were no longer being recorded. With 4f9c249696fcc20b615e3cd0619d8a85e67456ad, a ping that reaches the timeout is (again) treated as loss. The default timeout is relatively low (10 x interval), simply because I originally wanted to keep memory usage low when a large number of targets was defined and a low interval was being used. This timeout parameter can be set in the config as needed, and I may opt for a higher timeout.
Can you try the master branch and see if it is better?
Hi @joshuar, can you build a dev release? I'm getting no 'loss: true' with the latest version.
I'm getting a lot of errors for unreachable hosts with the latest release:
2017/05/03 16:20:30.548142 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548147 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548151 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548457 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548510 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548521 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548531 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548541 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548560 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548570 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548580 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548589 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548599 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548611 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548620 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548629 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548638 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548649 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548660 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548670 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548680 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548690 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
2017/05/03 16:20:30.548707 pingbeat.go:180: ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout
I'm also experiencing a continuous stream of 'ERR Couldn't read from connection: read ip4 0.0.0.0: i/o timeout' when an IP is unreachable - it filled up 7 log files in a second.
Running at release v5.4.0
@atomicom @jegade @zach929 looks like I really made a mess of that last release. Can you try 5.4.1: https://github.com/joshuar/pingbeat/releases/tag/v5.4.1
This should fix both tracking of loss and also stop any unnecessary error messages.
@joshuar much better, now the losses are tracked. Thank you.
In 1.0-beta, there was a 'loss: boolean' field that could capture packet loss. It seems this field is no longer present in 5.4.
This field is quite useful when detecting network instability. Could this field be added back?