Closed: back2root closed this issue 3 years ago
I easily reproduced it using Telegraf 1.6.4.
I couldn't reproduce it using Telegraf 1.16.2. It seems that the mentioned bug in github.com/shirou/gopsutil/cpu was fixed in v2.18.12. Telegraf currently uses v2.20.9.
Relevant telegraf.conf:
Not related to the error but only for ease of reproduction:
System info:
Affected operating system: Windows
Tested on:
Steps to reproduce:
Expected behavior:
Telegraf runs smoothly and collects CPU metrics every interval seconds.
Actual behavior:
From time to time Telegraf throws an error and does not report CPU metrics for all CPU cores:
The more CPU cores you have (with percpu = true), the more likely it is that the error is thrown.
Additional info:
Telegraf uses github.com/shirou/gopsutil/cpu to gather CPU metrics and expects the returned values to be cumulative CPU time. However, on the Windows platform the library already returns percentage values that it gathered itself via WMI. Later checks on the returned values therefore fail, because they assume that CPU time can only increase under normal conditions, and the returned CPU percentages do not follow that expectation. In addition, the later calculation of CPU usage percent makes no sense when applied to percentage values.
So on the Windows platform, all the checks and calculations that Telegraf makes using the variable lastStats are unnecessary and problematic.
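
To illustrate the mismatch described above, here is a minimal sketch of a delta-based usage calculation. It is not Telegraf's actual code: the `sample` type, the `usagePercent` function, the error value, and all numbers are hypothetical and only model the idea of comparing the current reading against a stored previous one.

```go
package main

import (
	"errors"
	"fmt"
)

// sample is a hypothetical, simplified stand-in for one per-core reading.
// The calculation below assumes both fields are cumulative seconds.
type sample struct {
	busy float64
	idle float64
}

func (s sample) total() float64 { return s.busy + s.idle }

// errTotalDecreased is an illustrative error, not Telegraf's real message.
var errTotalDecreased = errors.New("current total is less than previous total")

// usagePercent derives the busy percentage from two consecutive samples.
// It is only meaningful when the inputs are monotonically increasing counters.
func usagePercent(last, cur sample) (float64, error) {
	totalDelta := cur.total() - last.total()
	if totalDelta < 0 {
		return 0, errTotalDecreased
	}
	if totalDelta == 0 {
		return 0, errors.New("no time elapsed between samples")
	}
	return 100 * (cur.busy - last.busy) / totalDelta, nil
}

func main() {
	// Cumulative CPU seconds: totals grow, so the delta math works.
	p, err := usagePercent(sample{busy: 100, idle: 900}, sample{busy: 103, idle: 907})
	fmt.Println(p, err)

	// Percentage-like readings (made-up values): the sum can shrink
	// between two intervals, so the monotonicity check rejects them.
	_, err = usagePercent(sample{busy: 40, idle: 61}, sample{busy: 10, idle: 89})
	fmt.Println(err)
}
```

With cumulative counters the totals can only grow, so the check passes and the ratio of deltas is a valid percentage. With values that are already percentages, the sum of the fields can drop from one interval to the next, which would trip such a check at random, and more often when more cores are sampled.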