czerwonk / ping_exporter

Prometheus exporter for ICMP echo requests using https://github.com/digineo/go-ping
MIT License
524 stars, 115 forks

ping_loss_ratio averaging concern/question. #93

Open Randommmm opened 12 months ago

Randommmm commented 12 months ago

Hi all,

I'm running a configuration which uses ping_exporter to query a range of services (primarily devices within our network and VoIP phone provider) for packet loss via the ping_loss_ratio metric.

We've had packet loss issues for a while now, but the data I am getting into Prometheus (and then Grafana) seems to be slightly incorrect.

I notice the packet loss numbers seem very similar. For example, I only seem to be getting values of 2.38%, 4.76%, 7.14%, 9.52%, etc. These numbers are NOT random; they seem to be averaged somehow. For example, 2.38 × 2 = 4.76 and 4.76 × 2 = 9.52, so they look like multiples of a common step. The same values show up in both Grafana and Prometheus.

[Screenshot 2023-11-06 at 9:59:43 AM] [Screenshot 2023-11-06 at 10:00:09 AM]

I use the following settings within the prometheus.yml file.

- job_name: ping_exporter
  honor_timestamps: true
  scrape_interval: 1s
  scrape_timeout: 1s
  metrics_path: /metrics
  scheme: http
  follow_redirects: true
  enable_http2: true
  static_configs:
  - targets:
    - 10.20.60.11:9427

What could I be doing wrong to get somewhat malformed data?

Cheers, Randommmm

rnyrnyrny commented 10 months ago

I was also confused about this. In my case the loss value is always 3.33%, 6.66% etc. Then I took a look at the code and found out it's related to history-size. Basically the loss value is calculated via lost packets / history size. I set history-size to 30 so when a packet is lost the loss ratio becomes 3.33%. Check the code here: https://github.com/digineo/go-ping/blob/bc50d4a3e1022217d5fb9b0c9399efed5b4b2261/monitor/history.go#L67C24-L67C24
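To make the arithmetic concrete, here is a minimal sketch of the "lost packets / history size" relation described above. It is not the go-ping code itself; the function names are made up for illustration. It also assumes that the 2.38% step reported in the first post corresponds to a history size of 42, which is only an inference from the numbers (1/42 ≈ 2.38%), not something confirmed in this thread.

```python
def loss_ratio(lost: int, history_size: int) -> float:
    """Loss ratio over a fixed window: lost packets / window size."""
    return lost / history_size


def possible_loss_percentages(history_size: int, max_lost: int) -> list[float]:
    """Loss percentages (rounded to 2 decimals) for 1..max_lost lost packets."""
    return [round(100 * loss_ratio(k, history_size), 2)
            for k in range(1, max_lost + 1)]


def infer_history_size(step_percent: float) -> int:
    """Guess the window size from the smallest observed non-zero loss step.

    Hypothetical helper: assumes the step is exactly one lost packet.
    """
    return round(100 / step_percent)


if __name__ == "__main__":
    # History size of 30, as in my setup: one lost packet is 1/30.
    print(possible_loss_percentages(30, 2))
    # Assumed history size of 42, matching the values from the first post.
    print(possible_loss_percentages(42, 4))  # [2.38, 4.76, 7.14, 9.52]
    # Working backwards from the smallest observed step.
    print(infer_history_size(2.38))  # 42
```

Because the ratio can only change in steps of one packet out of the window, the metric will always land on multiples of 1 / history-size. A larger history-size gives finer steps, at the cost of the value reacting more slowly to changes.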

evevseev commented 5 months ago

This does not seem to be an issue, as these values should indeed correlate with history-size.

@czerwonk, I guess this issue can be closed.