Closed JDA88 closed 1 week ago
Hi @JDA88, thanks for reporting this! I have disabled 0.29 as the latest release for now.
Could you please check whether #1643 solves the issue?
Snapshot builds: https://github.com/prometheus-community/windows_exporter/actions/runs/11059551224/artifacts/1984426396
Thx for the fix, testing it right now!
While I'm at it, don't you think the "collector textfile succeeded after" logs should be debug and not info, considering how verbose they are?
We always had the level set to info and logged to the eventlog, but with those messages it got flooded, so we tuned it down to warn.
It's more a reflection than a request; we are fine with the warn level.
On node_exporter, it is debug too. I will adjust the level.
A little early to be 100% sure, but the metrics look stable now. Thanks for the quick fix!
Current Behavior
Metrics are not stable with v0.29: if I refresh the /metrics page regularly, some metrics are missing, and the behaviour is the same with Prometheus scraping it. Sometimes a metric completely disappears, and sometimes the metric is there but an "instance" is missing. The funny thing is that it's ALWAYS the same member missing.
volume="L:" that is missing.
name="W32Time" that is missing.
After some more testing, it looks like everything is stable if I remove the process collector. But I'm not sure whether it's the root cause or a conjunction of factors, and I have no idea why it would cause metrics outside its scope to flap.
Those are all the metrics where I have observed the issue:
When this happens, all windows_exporter_collector_success are = 1 and all windows_exporter_collector_timeout are = 0. No message in the logs, 100% reproducible on multiple servers, and nothing changes after a service or computer restart.
Expected Behavior
Like in the previous version we were using (v0.25.0): stable metrics visible at every scrape.
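For anyone who wants to check from the outside whether the set of exposed series is stable, here is a minimal sketch in Python. It is not part of windows_exporter: the helper names `series_ids` and `flapping_series` are made up for illustration, and the URL in the usage part assumes the exporter listens on localhost:9182, so adjust it to your setup. It scrapes the endpoint several times and reports the series that are present in some scrapes but missing in others.

```python
import re
import time
import urllib.request

# Matches a sample line such as: metric_name{label="v"} 1.0  (labels optional)
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+\S+')


def series_ids(exposition_text):
    """Return the set of series (metric name plus label set) in one scrape."""
    ids = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match:
            ids.add(match.group(1) + (match.group(2) or ""))
    return ids


def flapping_series(scrapes):
    """Series present in at least one scrape but absent from at least one other."""
    sets = [series_ids(text) for text in scrapes]
    if not sets:
        return set()
    return set.union(*sets) - set.intersection(*sets)


if __name__ == "__main__":
    # Assumed endpoint; change host and port to match your exporter.
    url = "http://localhost:9182/metrics"
    scrapes = []
    for _ in range(10):
        with urllib.request.urlopen(url, timeout=10) as response:
            scrapes.append(response.read().decode("utf-8"))
        time.sleep(1)
    for series in sorted(flapping_series(scrapes)):
        print("unstable:", series)
```

An empty output means every series showed up in all ten scrapes. Series that legitimately come and go (for example per-process series when a process starts or exits) will also be listed, so the result needs a quick manual read.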
Steps To Reproduce
Environment
windows_exporter logs
Except the occasional log below, nothing of interest.
Even when there are missing metrics, all the collectors return success.
Anything else?
Maybe change the log from
msg="collector os succeeded after 1.234ms"
to
msg="collector os succeeded after 1.234ms, resulting in xx different metrics with a total of yy lines"
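To make the xx and yy in that proposal concrete, here is a rough sketch in Python of how the two numbers could be derived from a collector's output in the text exposition format. It is only an illustration of the idea, not how the exporter (which is written in Go) would implement it; `summarize` and `format_message` are hypothetical names, and "different metrics" is assumed to mean distinct metric names while "lines" is assumed to mean sample lines.

```python
import re


def summarize(exposition_text):
    """Return (distinct metric names, sample lines) for one block of output."""
    names = set()
    lines = 0
    for line in exposition_text.splitlines():
        line = line.strip()
        # Comment lines (# HELP, # TYPE) and blanks are not samples.
        if not line or line.startswith("#"):
            continue
        lines += 1
        # The metric name ends at the first "{" or whitespace.
        names.add(re.split(r"[{\s]", line, maxsplit=1)[0])
    return len(names), lines


def format_message(collector, duration_ms, exposition_text):
    """Build a log message in the shape suggested above."""
    metrics, lines = summarize(exposition_text)
    return (
        f'msg="collector {collector} succeeded after {duration_ms}ms, '
        f'resulting in {metrics} different metrics with a total of {lines} lines"'
    )
```

With counts like these in the log, a collector that reports success but suddenly returns fewer lines than usual would stand out.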
We had to stop the deployment of 0.29.0 early until we find a solution. It's hard for me to test other versions between 0.25 and 0.29, but tell me if it would help.