Open jkroepke opened 11 months ago
I'm very much in favour of moving to PDH, though I'm not currently in a position to implement and test this myself. Are you comfortable implementing this, if you have the time?
I played with the PDH functions today. However, I got different values back. It seems that the perfcounter exporter mutates the values (e.g. from 100ns to 1sec).
Additionally, I have no clue what the "secondValue" is. https://github.com/prometheus-community/windows_exporter/blob/470f5d58522fda17a9045c517f99654f29d55de5/pkg/perflib/perflib.go#L348
I can't find anything in the MS documentation, and I have no clue what the source of windows_cpu_processor_rtc_total is.
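If the difference is only a unit conversion, it would look roughly like the sketch below. It assumes the raw counter is expressed in 100 ns ticks and that the exported metric is in seconds; the names `ticksPerSecond` and `ticksToSeconds` are made up for illustration and are not taken from the exporter.

```go
package main

// ticksPerSecond is the number of 100-nanosecond intervals in one second.
// Assumption: the raw counter value is expressed in 100 ns ticks.
const ticksPerSecond = 1e7

// ticksToSeconds converts a raw 100 ns tick count into seconds,
// the unit Prometheus metrics conventionally use for durations.
func ticksToSeconds(ticks int64) float64 {
	return float64(ticks) / ticksPerSecond
}
```

A raw value read through PDH would need the same scaling before it can be compared with the value of an existing collector.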
This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.
@breed808 I thought about the future use of PDH in favor of registry-based collectors.
Since we can't guarantee identical metric values between registry-based collectors and PDH, I had the idea to offer both sources at once.
What do you think would be the best possible approach here?
Everyone else in the community is invited to provide feedback here.
I like the global switch option, something like --config.usePDH=true which defaults to false. If the switch is enabled, the same metric names and collectors are used, but of course with PDH native functions instead of the reg calls.
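A minimal sketch of what such a switch could look like, using only the standard `flag` package. The flag name is the one suggested above; the `counterSource` interface and both implementations are hypothetical placeholders, not existing exporter types.

```go
package main

import "flag"

// counterSource is a hypothetical abstraction over the backend
// that delivers raw performance counter values.
type counterSource interface {
	Name() string
}

type registrySource struct{} // placeholder for the registry-based reader

func (registrySource) Name() string { return "registry" }

type pdhSource struct{} // placeholder for a PDH-based reader

func (pdhSource) Name() string { return "pdh" }

// selectSource parses the command line and returns the backend
// selected by the proposed --config.usePDH switch (default: false).
func selectSource(args []string) (counterSource, error) {
	fs := flag.NewFlagSet("windows_exporter", flag.ContinueOnError)
	usePDH := fs.Bool("config.usePDH", false, "read counters via PDH instead of the registry")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	if *usePDH {
		return pdhSource{}, nil
	}
	return registrySource{}, nil
}
```

With this shape the collectors and metric names stay the same, and only the source behind the interface changes.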
At the moment, I do not really have an idea what the "best" way of implementing the PDH native functions is.
Currently, the exporter invokes perfdata once and the collectors can consume data from it. From a code point of view, this is quite complex.
Alternatively, each collector invokes PDH calls on its own, but again, I have no idea whether there are any downsides if the exporter holds multiple handles to the Windows API.
In the end, invoking the PDH calls is really complex, and it looks like there is no Go library that makes everything developer friendly. The code in telegraf looks very complex (https://github.com/influxdata/telegraf/blob/master/plugins/inputs/win_perf_counters/win_perf_counters.go), and other implementations need to be evaluated, e.g. https://github.com/elastic/beats/blob/main/metricbeat/helper/windows/pdh/pdh_windows.go
In the end, it's a really, really time-consuming task. It's hard to find a starting point here.
Yes I see, this is far from easy! You have already done a lot of work in the PR https://github.com/prometheus-community/windows_exporter/pull/1459 and it looks very good :)
I have tried it out and ran into an issue (German system, hehe).
It seems that Windows creates the object names in the system's language.
panic: "windows_perfdata_prozessorinformationen_c3-übergänge_s" is not a valid metric name
goroutine 58 [running]:
github.com/prometheus/client_golang/prometheus.MustNewConstMetric(...)
C:/Users/user01/go/pkg/mod/github.com/prometheus/client_golang@v1.19.1/prometheus/value.go:129
github.com/prometheus-community/windows_exporter/pkg/collector/perfdata.(*collector).collect(0xc0000ac400, 0xc0005903c0)
C:/Users/user01/source/repos/github/jk/windows_exporter/pkg/collector/perfdata/perfdata.go:199 +0x385
github.com/prometheus-community/windows_exporter/pkg/collector/perfdata.(*collector).Collect(0xc0000ac400, 0x0?, 0x0?)
C:/Users/user01/source/repos/github/jk/windows_exporter/pkg/collector/perfdata/perfdata.go:178 +0x1f
github.com/prometheus-community/windows_exporter/pkg/collector.(*Prometheus).execute(0xc0000942c0, {0xdf59e3, 0x8}, {0xf29e20, 0xc0000ac400}, 0xc000192238, 0xc0005903c0)
C:/Users/user01/source/repos/github/jk/windows_exporter/pkg/collector/prometheus.go:176 +0x8f
github.com/prometheus-community/windows_exporter/pkg/collector.(*Prometheus).Collect.func2({0xdf59e3, 0x8}, {0xf29e20?, 0xc0000ac400?})
C:/Users/user01/source/repos/github/jk/windows_exporter/pkg/collector/prometheus.go:117 +0xa5
created by github.com/prometheus-community/windows_exporter/pkg/collector.(*Prometheus).Collect in goroutine 56
C:/Users/user01/source/repos/github/jk/windows_exporter/pkg/collector/prometheus.go:115 +0x470
exit status 2
I guess the hyphen is the issue here. Prometheus has UTF-8 support. But good catch. I may have to take a look at how to keep the data non-localized.
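As a stopgap, the generated name could be sanitized before it reaches MustNewConstMetric. The sketch below assumes the legacy Prometheus metric name character set `[a-zA-Z_:][a-zA-Z0-9_:]*`; `sanitizeMetricName` is a hypothetical helper, not part of the PR. It does not solve the localization problem itself, since the German name would still differ from the English one.

```go
package main

import "strings"

// sanitizeMetricName replaces every character that is not allowed by the
// legacy metric name pattern [a-zA-Z_:][a-zA-Z0-9_:]* with an underscore.
func sanitizeMetricName(name string) string {
	var b strings.Builder
	for i, r := range name {
		valid := r == '_' || r == ':' ||
			(r >= 'a' && r <= 'z') ||
			(r >= 'A' && r <= 'Z') ||
			(i > 0 && r >= '0' && r <= '9') // digits are not allowed first
		if valid {
			b.WriteRune(r)
		} else {
			b.WriteByte('_')
		}
	}
	return b.String()
}
```

Applied to the name from the panic above, the hyphen and both umlauts each become a single underscore.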
@DiniFarb could you please try out the latest version of #1459? It's available here: https://github.com/prometheus-community/windows_exporter/actions/runs/10658016872/artifacts/1880022574
The functionality and options have been reduced. Instead of using the implementation from telegraf, I built my own.
While it supports less, it is a bit easier to debug in case of issues. For example, wildcards in counters have been removed.
The counter values have been compared with the values from the other collectors, and they are equal. I would like to hear your feedback.
I did a quick smoke test on a German Win11 system with:
PS C:\Users\*******\windows_exporter_binaries> .\windows_exporter-0.28.1-4-gdfc8d37-amd64.exe --log.level=debug --collectors.enabled="perfdata" --collector.perfdata.objects='[{"object":"Processor Information","instances":["*"],"counters": {"% Processor Time": {}}},{"object":"Memory","counters": {"Cache Faults/sec": {"type": "counter"}}}]'
ts=2024-09-02T07:47:21.713Z caller=exporter.go:147 level=debug msg="Logging has Started"
ts=2024-09-02T07:47:21.728Z caller=perfdata.go:97 level=warn msg="The perfdata collector is in an experimental state! The configuration may change in future. Please report any issues."
ts=2024-09-02T07:47:22.758Z caller=exporter.go:216 level=info msg="Running as *********"
ts=2024-09-02T07:47:22.759Z caller=exporter.go:223 level=info msg="Enabled collectors: perfdata"
ts=2024-09-02T07:47:22.759Z caller=exporter.go:258 level=info msg="Starting windows_exporter" version="(version=0.28.1-4-gdfc8d37, branch=HEAD, revision=dfc8d37dae0311bd2e2de503ed5c9efdd13069c4)"
ts=2024-09-02T07:47:22.759Z caller=exporter.go:259 level=info msg="Build context" build_context="(go=go1.22.6, platform=windows/amd64, user=runneradmin@fv-az1390-362, date=20240901-23:04:09, tags=unknown)"
ts=2024-09-02T07:47:22.759Z caller=exporter.go:260 level=debug msg="Go MAXPROCS" procs=16
ts=2024-09-02T07:47:22.759Z caller=tls_config.go:313 level=info msg="Listening on" address=[::]:9182
ts=2024-09-02T07:47:22.759Z caller=tls_config.go:316 level=info msg="TLS is disabled." http2=false address=[::]:9182
ts=2024-09-02T07:47:47.533Z caller=prometheus.go:191 level=debug msg="collector perfdata succeeded after 0.000000s."
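For readers unfamiliar with the object configuration passed above, here is a rough sketch of how such JSON could be mapped to PDH counter paths of the form `\Object(Instance)\Counter`. The struct and function names are hypothetical, and whether the collector builds its paths exactly this way is an assumption.

```go
package main

import (
	"encoding/json"
	"sort"
)

// counterOptions mirrors the per-counter settings seen in the example JSON.
type counterOptions struct {
	Type string `json:"type"`
}

// objectConfig mirrors one entry of the --collector.perfdata.objects list.
type objectConfig struct {
	Object    string                    `json:"object"`
	Instances []string                  `json:"instances"`
	Counters  map[string]counterOptions `json:"counters"`
}

// counterPaths parses the JSON configuration and returns one PDH counter
// path per object, instance and counter. Objects without instances yield
// paths without the "(instance)" part.
func counterPaths(raw string) ([]string, error) {
	var objects []objectConfig
	if err := json.Unmarshal([]byte(raw), &objects); err != nil {
		return nil, err
	}
	var paths []string
	for _, obj := range objects {
		// Sort counter names so the result is deterministic.
		names := make([]string, 0, len(obj.Counters))
		for name := range obj.Counters {
			names = append(names, name)
		}
		sort.Strings(names)

		prefixes := []string{`\` + obj.Object}
		if len(obj.Instances) > 0 {
			prefixes = prefixes[:0]
			for _, inst := range obj.Instances {
				prefixes = append(prefixes, `\`+obj.Object+`(`+inst+`)`)
			}
		}
		for _, prefix := range prefixes {
			for _, name := range names {
				paths = append(paths, prefix+`\`+name)
			}
		}
	}
	return paths, nil
}
```

For the configuration in the command above this yields `\Processor Information(*)\% Processor Time` and `\Memory\Cache Faults/sec`.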
looks good 👍 The only thing was that the query took a bit long:
perfdata was ok:
windows_exporter_collector_duration_seconds{collector="perfdata"} 0
but the perflib snapshot duration was high (even though I had only the perfdata collector active):
windows_exporter_perflib_snapshot_duration_seconds 18.1017429
As soon as I added another "classic" collector like --collectors.enabled="cpu,perfdata" the responses were fast as always. I searched a little and saw that if no perflib collector is set, this func receives an empty string.
https://github.com/jkroepke/windows_exporter/blob/d8f0665bdc3f3c4d6e6119b1d2d7fa78c0931fa3/pkg/collector/collector.go#L209-L216
For testing I changed the function like:
diff --git a/pkg/collector/collector.go b/pkg/collector/collector.go
index e829ed5..90d7d22 100644
--- a/pkg/collector/collector.go
+++ b/pkg/collector/collector.go
@@ -209,6 +209,9 @@ func (c *Collectors) Build(logger log.Logger) error {
 // PrepareScrapeContext creates a ScrapeContext to be used during a single scrape.
 func (c *Collectors) PrepareScrapeContext() (*types.ScrapeContext, error) {
+	if c.perfCounterQuery == "" {
+		return nil, nil
+	}
 	objs, err := perflib.GetPerflibSnapshot(c.perfCounterQuery)
 	if err != nil {
 		return nil, err
and it worked as fast as always. This problem was of course already there, but maybe not noticed, because why would you run the classic windows_exporter with no collectors? But now with the new perfdata collector it is different. I think that if only the perfdata collector is active, all perflib functionality should be disabled.
P.S. There is the possibility that I configured something wrong; I did not have much time to look into it. I can find some more time in the coming days if needed.
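One thing to keep in mind with the guard in the diff: returning `nil, nil` means every caller has to tolerate a nil context. A variation is to return an empty context instead. The sketch below uses stand-in types (`scrapeContext`, `collectors`, and a `snapshot` function field in place of perflib.GetPerflibSnapshot) and is only meant to illustrate the idea, not the real code.

```go
package main

// scrapeContext is a stand-in for types.ScrapeContext.
type scrapeContext struct {
	objects map[string]int // stand-in for the parsed perflib objects
}

// collectors is a stand-in for the exporter's Collectors type.
type collectors struct {
	perfCounterQuery string
	// snapshot stands in for perflib.GetPerflibSnapshot.
	snapshot func(query string) (map[string]int, error)
}

// prepareScrapeContext skips the expensive snapshot when no perflib-based
// collector is enabled, and returns an empty (non-nil) context instead.
func (c *collectors) prepareScrapeContext() (*scrapeContext, error) {
	if c.perfCounterQuery == "" {
		return &scrapeContext{}, nil
	}
	objs, err := c.snapshot(c.perfCounterQuery)
	if err != nil {
		return nil, err
	}
	return &scrapeContext{objects: objs}, nil
}
```

Either way, the snapshot call is skipped entirely when the query string is empty.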
Normally, I develop this exporter on my Windows 10 machine, and I did not have any issues. But I remember that other users had the issue as well.
Ref: https://github.com/prometheus-community/windows_exporter/issues/1458
ah yes sure, there are non-perflib collectors already - my mistake, what was I thinking. So yes, this is an already existing issue and seems to happen only on Win11.
I will plan this change for 0.30.
Nice let me know if I can help in any way :)
@DiniFarb The generic perf counter collector will be part of the next release.
After that, current collectors will be switched to the new system.
In terms of the generic collector, think about your use-cases, testing testing testing.
I see that in the v0.30.0-beta.0 release you use an environment variable WINDOWS_EXPORTER_PERF_COUNTERS_ENGINE to enable the feature. Any reason not to use a command line flag like the other features?
Good call, added in #1723
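For illustration, a flag that falls back to the environment variable could be wired up as in the sketch below. Only the variable name WINDOWS_EXPORTER_PERF_COUNTERS_ENGINE comes from this thread; the flag name `perf-counters-engine` and the values `registry` and `pdh` are assumptions made for the example and may differ from what #1723 actually adds.

```go
package main

import (
	"flag"
	"fmt"
)

// engineEnvVar is the environment variable mentioned in this thread.
const engineEnvVar = "WINDOWS_EXPORTER_PERF_COUNTERS_ENGINE"

// resolveEngine returns the engine to use. Precedence: command line flag,
// then environment variable, then the default "registry".
// The flag name and the accepted values are hypothetical.
func resolveEngine(args []string, getenv func(string) string) (string, error) {
	fs := flag.NewFlagSet("windows_exporter", flag.ContinueOnError)
	engine := fs.String("perf-counters-engine", "", "engine used to read performance counters")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	value := *engine
	if value == "" {
		value = getenv(engineEnvVar)
	}
	if value == "" {
		value = "registry"
	}
	switch value {
	case "registry", "pdh":
		return value, nil
	default:
		return "", fmt.Errorf("unknown perf counters engine %q", value)
	}
}
```

In a real binary, `os.Getenv` would be passed as the `getenv` argument; taking it as a parameter just keeps the function easy to test.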
Current Progress:
Reading the documentation about PerfCounter
Microsoft highly recommends fetching data via PDH functions instead of reading the registry directly. It seems that the PDH functions are more performant.
We have issues like #724, and the exporter does not work well under load. The mentioned Zabbix exporter is using PDH functions, too. Datadog and telegraf are also using the PDH library.
Ref:
We should look into it, since the Registry supports V1 PerfCounter only, while the PDH functions support both V1 and V2.
Since telegraf uses the same OSS license, we should consider using the telegraf libraries as a base instead of starting from scratch. The OTEL collector is doing the same.