Open modena01 opened 2 months ago
No, that message is from an older version of smokeping_prober. It was removed when we added dynamic reload support.
Reported latency may be going up because the prober is being starved for CPU and unable to process response packets fast enough.
Thanks SuperQ, I have now updated to the latest version. Here is an example of what happened when I went from 21 hosts to around 100.
Do I need to run multiple smokeping instances and split the hosts across them? Increasing the interval does not seem to help.
I expect to need to monitor hundreds of hosts (probably 500+)...
How many pings per second do you send, and to how many hosts? What packet size do you use for the ICMP packets? How many CPU cores do you have?
I am pinging a few hundred hosts (200-300), but with different intervals. Some I ping every 200ms and others every 5s. I noticed that the CPU load is higher at the beginning than later on; maybe the load gets distributed over time. Running "top", I sometimes see smokeping_prober consume 1100% CPU and at other times only 300-500%.
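To put a rough number on the load, here is a minimal sketch that adds up the echo requests sent per second across groups of hosts with different intervals. The function name `packets_per_second` and the 100/200 split between the two intervals are hypothetical, chosen only for illustration; this is not code from smokeping_prober.

```python
def packets_per_second(groups):
    """Sum the echo requests sent per second across all target groups.

    groups: iterable of (host_count, interval_ms) pairs.
    """
    return sum(hosts * 1000 / interval_ms for hosts, interval_ms in groups)


# Hypothetical split of about 300 hosts; the real per-interval counts may differ.
example_groups = [
    (100, 200),   # 100 hosts pinged every 200ms
    (200, 5000),  # 200 hosts pinged every 5s
]

rate = packets_per_second(example_groups)
print(f"{rate:.0f} echo requests per second")
```

Under that assumed split, the 200ms group accounts for 500 of the 540 requests per second, so the short-interval targets would dominate the work the prober has to do.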
The Prometheus scrape interval defines the bucket length, which means each bucket contains all ping results from one scrape interval. If you ping a host every 1s and scrape every 60s, you have 60 results in that bucket. This may be "ok" for you, but if some pings have high latency, you cannot tell whether they occurred at the beginning, at the end, or spread across the bucket.
So it depends on the use case. I scrape every 15s, which covers at least 3 pings for the "every 5s ping" targets.
So back to your question: I would check your CPU consumption. If possible, just add a few more CPU cores and check how the behaviour changes.
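The relationship between ping interval and scrape interval can be sketched as a simple division. The helper name `pings_per_scrape` is hypothetical, and the sketch assumes pings are sent at a steady rate with none lost.

```python
def pings_per_scrape(scrape_interval_ms, ping_interval_ms):
    """Return how many complete ping intervals fit into one scrape interval."""
    return scrape_interval_ms // ping_interval_ms


# (scrape interval, ping interval) pairs in milliseconds, taken from the
# intervals mentioned in this thread.
cases = [
    (60_000, 1_000),  # scrape every 60s, ping every 1s
    (15_000, 5_000),  # scrape every 15s, ping every 5s
    (15_000, 200),    # scrape every 15s, ping every 200ms
]

for scrape_ms, ping_ms in cases:
    count = pings_per_scrape(scrape_ms, ping_ms)
    print(f"scrape {scrape_ms}ms, ping {ping_ms}ms: {count} pings per scrape")
```

The more pings that fall into one scrape, the less you can say about when within that window a slow ping happened.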
Thanks for smokeping! I am a prometheus newb, so please bear with me. Smokeping was working fine for me at first with a single host. Then I tried adding about 100 additional hosts to ping, and the reported ICMP latency went up significantly. I dropped back down to 21 hosts and latency dropped, but not back to the same level as with 1 target host. Is it correct config to have
Is the purpose of different (multiple) "hosts" sections merely to have different settings, such as interval and size, for different hosts? If smokeping is creating, tracking, and reporting buckets to prometheus, is there a valid reason to scrape smokeping from prometheus any more often than, say, every 1min?
My prometheus config is as yet very simple:
From the prometheus log, I see a message like this when I have a single ICMP target:
but with 21 targets I get:
So it is clearly dividing 1000ms by the number of targets, but I cannot find this in the smokeping code, so I guess it is prometheus doing this? I was looking at this while trying to figure out why reported latency keeps going up the more ICMP target hosts I add.
Thanks for your help.