prometheus-community / windows_exporter

Prometheus exporter for Windows machines
MIT License

Latency from logical disk collector #799

Closed — monismonther closed this issue 10 months ago

monismonther commented 3 years ago

Hi, first of all I would like to thank you for the great effort put into this exporter

We have a problem with the disk latency metrics. How are these numbers supposed to be interpreted? When reading them from perfmon or directly from PowerShell, the numbers come out in milliseconds or microseconds; for example, five samples show as follows:

Avg. Disk sec/Read 0.004
Avg. Disk sec/Read 0.0006
Avg. Disk sec/Read 0.0008
Avg. Disk sec/Read 0.002
Avg. Disk sec/Read 0.005

Note that these are in seconds, so the numbers are 4 ms, 600 µs, 800 µs, 2 ms and 5 ms respectively.

However, in Grafana we use rate(windows_logical_disk_read_seconds_total[1m]) but get numbers in the hundreds of milliseconds, or sometimes 1-2 seconds. Checking the source code, you read the value and multiply it by ticksToSecondsScaleFactor, which is 1/1e7. Why do you do that, and how do these numbers come out like this?

We think disk latency is crucial and there should be some solution to this. Are we missing something very obvious here? Hope someone can help.

rob-scheepens commented 3 years ago

I suspect this is related to Frequency_Sys100NS, but I fail to see the relation here. It appears to me that to get ms instead of s, we should multiply by 1,000, not by 10,000,000. To get some additional understanding, I had a look at the raw perfdata counter on a 2019 Core system, but that confused me even more:

PS C:\Users\Administrator> Get-CimInstance Win32_PerfRawData_PerfDisk_PhysicalDisk | ? { $_.Name -like "1" } | ft Name,AvgDiskSecPerWrite

Name AvgDiskSecPerWrite
---- ------------------
1             617492156

PS C:\Users\Administrator> Get-CimInstance Win32_PerfRawData_PerfDisk_PhysicalDisk | ? { $_.Name -like "1" } | ft Name,AvgDiskSecPerWrite

Name AvgDiskSecPerWrite
---- ------------------
1             769101384

Surely these values cannot be seconds. In my case they should be close to 1ms (SSD disks). It looks to me that the author(s) of the collector knew of this issue with the raw data and mitigated it. Perhaps different Windows versions use different reporting units, despite the counter name still suggesting it is in seconds.

@monismonther : what OS were you using to obtain the numbers you mention above?

Edit: note, to suit my own testing I created an experimental physical_disk collector, see https://github.com/rob-scheepens/windows_exporter/blob/master/collector/physical_disk.go.

rob-scheepens commented 3 years ago

I've created a support request with Microsoft to get clarification.

monismonther commented 3 years ago

@rob-scheepens

OS: Windows Server 2019
windows_exporter: 0.15.0

Below is my reading from PowerShell yesterday. The numbers below make a lot of sense, as they are milliseconds and microseconds. This is a VM on local SSD storage; the VMware graph from vCenter shows similar numbers as well.

PS C:\Users\Administrator> Get-Counter '\LogicalDisk(c:)\Avg. Disk sec/Write' -MaxSamples 4 -SampleInterval 15

Timestamp CounterSamples


6/9/2021 12:33:23 PM \server_XXXXX\logicaldisk(c:)\avg. disk sec/write :
0.00288117430555556
6/9/2021 12:33:38 PM \server_XXXXX\logicaldisk(c:)\avg. disk sec/write :
0.000708748623853211
6/9/2021 12:33:53 PM \server_XXXXX\logicaldisk(c:)\avg. disk sec/write :
0.00109554532374101
6/9/2021 12:34:08 PM \server_XXXXX\logicaldisk(c:)\avg. disk sec/write :
0.000634233333333333

rob-scheepens commented 3 years ago

Ok, clearly Get-Counter does some manipulation as well. What do you see when you use the same Get-CimInstance command as in my example?

monismonther commented 3 years ago

OK, your line returned nothing in my case, so I simply modified it like this:

PS C:\Users\Administrator> Get-CimInstance Win32_PerfRawData_PerfDisk_PhysicalDisk | ft Name,AvgDiskSecPerWrite

Name       AvgDiskSecPerWrite
----       ------------------
0 C: D: E:         4147169129
_Total             4147169129

and since I am actually checking LogicalDisk counters, not physical, I changed it to LogicalDisk and got this:

PS C:\Users\Administrator> Get-CimInstance Win32_PerfRawData_PerfDisk_LogicalDisk | ft Name,AvgDiskSecPerWrite

Name            AvgDiskSecPerWrite
----            ------------------
HarddiskVolume1              21765
HarddiskVolume2                  0
C:                       204008595
HarddiskVolume5              24681
D:                      1254089031
E:                       602802530
_Total                  2060946602

rob-scheepens commented 3 years ago

@monismonther : sorry, yes, obviously you were not using physical_disk yet. ;) Good to see you indeed got the same kind of numbers, which can never be seconds. Let me discuss with Microsoft and update here with the results. It would be good, though, if you could also give my experimental physical_disk collector a test.

monismonther commented 3 years ago

@rob-scheepens is your physical_disk collector included in the binary build at your page?

I also noticed the following:

your code is quite similar to the LogicalDisk collector; it also multiplies the disk latency counters by ticksToSecondsScaleFactor

Additionally, from the main collector.go file: "ticksToSecondsScaleFactor = 1 / 1e7", so the code is actually dividing by 1e7 (i.e. multiplying by 1e-7), not multiplying by 1e7 (10,000,000). That makes sense when converting 100ns CPU ticks to seconds: for example, 300ns, i.e. 3 CPU time ticks, becomes 3 × 1e-7 = 0.0000003 seconds, which is 300 × 1e-9 s, or 300ns.

Let me know if I am understanding this properly
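
For reference, a minimal Go sketch of that conversion (the constant mirrors the one quoted from collector.go; everything else is illustrative, not the exporter's code):

package main

import "fmt"

// ticksToSecondsScaleFactor mirrors the constant in collector.go: one tick is 100ns, i.e. 1e-7 s.
const ticksToSecondsScaleFactor = 1 / 1e7

func main() {
    rawTicks := 3.0 // e.g. a raw value of 3 ticks of 100ns each
    fmt.Printf("%.0f ticks = %.9f s\n", rawTicks, rawTicks*ticksToSecondsScaleFactor) // 3 ticks = 0.000000300 s
}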

rob-scheepens commented 3 years ago

Yes, your understanding seems correct. I am currently discussing with Microsoft, stay tuned.

@carlpett : can you shed some light on the background here? It appears to me that someone found out the same as I did, namely that the metric is not in seconds but in some large number, requiring the workaround of ticksToSecondsScaleFactor. It actually does not seem that the large value is in ticks, as one tick is 15.6ms on standard Windows, iirc.

Edit: perhaps the 10^-7 is related to Windows NT System Time epoch.

carlpett commented 3 years ago

Hm. I'm having a hard time remembering the specifics, but as I recall what the code should do is to divide by PerfTime (elapsed time since boot) in order to convert to actual seconds. There's some (not much) info about timers here: https://docs.microsoft.com/en-us/previous-versions/ms938538(v=msdn.10). Note Time Base: PerfFreq.

I'm not sure why we'd divide by the tick rate (which comes from Frequency_PerfTime, but as I recall it was always 1e7 on all systems we encountered, so at some point it was hardcoded), as it doesn't seem like that would be correct... I might very well be forgetting some important details, however.

But if you test with dividing by perf time instead, do the numbers line up?

monismonther commented 3 years ago

Hi,

Any updates regarding this matter?

rob-scheepens commented 3 years ago

Got this from Microsoft, in response to my question/comment asking why the counter is named AvgDiskSecPerWrite when the value obtained is clearly not in seconds:

"The Win32_PerfRawData_PerfDisk_PhysicalDisk class , or any other from the Win32_PerfRawData contain RAW or uncooked performance data and you can’t query the values without calculating them, so it’s normal that you see such a value on your result. The formula is here PERF_AVERAGE_TIMER: Core Services | Microsoft Docs [docs.microsoft.com]

So, you'll need to grab a first sample to serve as a base and then make the calculation like:

(($n1.Timestamp_PerfTime - $n.Timestamp_PerfTime)/$n.Frequency_PerfTime)/($n1.AvgDisksecPerWrite_Base - $n.AvgDisksecPerWrite_Base)

Where $n is the base sample and $n1 is the second sample.

Elapsed time is equal to: (($n1.Timestamp_PerfTime - $n.Timestamp_PerfTime)/$n.Frequency_PerfTime)
Number of operations (in this case Write Operations) is equal to: ($n1.AvgDisksecPerWrite_Base - $n.AvgDisksecPerWrite_Base)

My strong advice here is use the Get-Counter, like you've already posted on the github thread.

(Get-Counter -Counter "\PhysicalDisk(1)\Avg. Disk sec/Write").Countersamples.cookedvalue

This makes all the work for you and gets you a value in seconds with an excellent granularity that can be very useful for the fastest drives we see in market these days (NvMe's and SCM's) where millisecond granularity is not enough anymore."
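
For reference, the documented PERF_AVERAGE_TIMER formula cooks the raw counter itself, ((N1 - N0) / F) / (D1 - D0), where N is the raw AvgDiskSecPerWrite value (performance-counter ticks spent in writes), F is the timer frequency and D is the _Base counter (completed writes). Below is a minimal Go sketch of that calculation; the two raw counter values are taken from the output earlier in this thread, while the base values and everything else are invented for illustration:

package main

import "fmt"

// rawSample holds the fields of interest from Win32_PerfRawData_PerfDisk_PhysicalDisk.
type rawSample struct {
    avgDiskSecPerWrite     float64 // raw counter: performance-counter ticks spent in writes
    avgDiskSecPerWriteBase float64 // base counter: number of completed writes
    frequencyPerfTime      float64 // timer frequency, typically 10,000,000 (100ns ticks)
}

// cookedAvgSecPerWrite applies the documented PERF_AVERAGE_TIMER formula
// ((N1 - N0) / F) / (D1 - D0): ticks spent writing divided by the frequency,
// divided by the number of writes completed in the interval.
func cookedAvgSecPerWrite(s0, s1 rawSample) float64 {
    ops := s1.avgDiskSecPerWriteBase - s0.avgDiskSecPerWriteBase
    if ops == 0 {
        return 0 // no writes completed between the two samples
    }
    return (s1.avgDiskSecPerWrite - s0.avgDiskSecPerWrite) / s1.frequencyPerfTime / ops
}

func main() {
    // The two raw counter values come from the Get-CimInstance output above;
    // the base values (write counts) are invented for illustration.
    s0 := rawSample{avgDiskSecPerWrite: 617492156, avgDiskSecPerWriteBase: 100000, frequencyPerfTime: 1e7}
    s1 := rawSample{avgDiskSecPerWrite: 769101384, avgDiskSecPerWriteBase: 115000, frequencyPerfTime: 1e7}
    fmt.Printf("cooked Avg. Disk sec/Write: %.6f s\n", cookedAvgSecPerWrite(s0, s1)) // ~0.001011 s
}

With these example numbers the result is roughly 0.001 s per write, in line with the ~1ms expected for the SSDs mentioned above.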

hashi825 commented 3 years ago

I'm also seeing this issue. Is this something that's fixable, or is the solution to use something like Get-Counter with the textfile collector?

For what it's worth, the telegraf agent from influxdata seems to be able to collect these counters, but it uses PDH instead of Perflib for its queries, which probably does the same manipulation as Get-Counter.

breed808 commented 3 years ago

I don't believe this is fixable without the exporter caching previous latency metrics for calculation. If I recall correctly caching metrics is generally discouraged in Prometheus exporters. With accurate latency metrics requiring cached previous results, this may warrant an exception.

It would be worth looking at PDH as an option. Get-Counter with the textfile collector would work in the short term, if accurate latency metrics are desired.

hashi825 commented 3 years ago

The only problem with using the textfile collector, or even something like telegraf, for latency metrics is that there could be a slight desync from the other metrics recorded at scrape time, which is why it would be pretty important to have it as part of the exporter. If using PDH, the metric would no longer be a cumulative counter, since PDH returns cooked samples.

breed808 commented 3 years ago

Agreed, keeping the metric in the exporter is desirable.

If using PDH, the metric would no longer be a cumulative counter, since PDH returns cooked samples

Yes, the metrics would change from a Counter to a Gauge.

As I understand it, we have the following options:

  1. Remove the latency metrics as they are inaccurate.
  2. Remove latency metrics and recommend the use of Get-Counter for disk latency metrics.
  3. Attempt to cache latency samples in logical_disk collector and perform calculations for latency metrics.
  4. Use PDH or another metric source.

1 & 2 aren't desirable options for reasons already discussed, but at this stage I don't believe we can avoid the use of cooked samples, as the current raw samples aren't accurate.
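
To make option 3 a bit more concrete, here is a rough, self-contained Go sketch of the caching idea: remember the previous raw sample per volume and turn the delta into a cooked average latency at scrape time. Type names, field names and sample values are invented for illustration, and none of this is actual windows_exporter code:

package main

import "fmt"

// rawDiskSample carries the raw PERF_AVERAGE_TIMER fields for one volume.
type rawDiskSample struct {
    writeTicks float64 // raw AvgDiskSecPerWrite (performance-counter ticks spent in writes)
    writeOps   float64 // raw AvgDiskSecPerWrite_Base (completed writes)
}

// latencyCache remembers the previous scrape's raw sample per volume so the
// current scrape can emit a cooked average-latency gauge over the interval.
type latencyCache struct {
    freq float64                  // Frequency_PerfTime, typically 1e7
    prev map[string]rawDiskSample // keyed by volume name, e.g. "C:"
}

func (c *latencyCache) observe(volume string, cur rawDiskSample) (avgSec float64, ok bool) {
    p, seen := c.prev[volume]
    c.prev[volume] = cur
    if !seen || cur.writeOps == p.writeOps {
        return 0, false // first scrape, or no writes completed since the last one
    }
    return (cur.writeTicks - p.writeTicks) / c.freq / (cur.writeOps - p.writeOps), true
}

func main() {
    c := &latencyCache{freq: 1e7, prev: map[string]rawDiskSample{}}
    c.observe("C:", rawDiskSample{writeTicks: 617492156, writeOps: 100000}) // first call only primes the cache
    if v, ok := c.observe("C:", rawDiskSample{writeTicks: 769101384, writeOps: 115000}); ok {
        fmt.Printf("avg write latency since last scrape: %.6f s\n", v)
    }
}

The first scrape after startup (and any volume with no completed writes in the interval) would have to be skipped or reported as absent, which is part of why caching inside the exporter needs the exception discussed above.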

hashi825 commented 3 years ago

From the Prometheus docs

If a metric is particularly expensive to retrieve, i.e. takes more than a minute, it is acceptable to cache it. This should be noted in the HELP string.

I don't think getting the data from PDH is particularly expensive in comparison to WMI, but this may be one of those exceptions. I assume physical disk latency counters would behave the same, should a physical disk collector be implemented.

CatMerc commented 2 years ago

Hi, any update on the situation? Tried looking around but this appears to be the most recent discussion of this topic.

Caching a value doesn't seem that awful an option, though my understanding is probably naive.

rob-scheepens commented 2 years ago

@hashi825 @monismonther do you see the same latency with the physical_disk counters?

breed808 commented 2 years ago

@CatMerc I started work on using PDH as a source for these metrics, but haven't had time to continue the work.

If you're interested, I can push what I have so far so others can view and/or work on it.

CatMerc commented 2 years ago

@breed808 sure, I've been meaning to find some time to work on this feature myself. If you already have something I can work off of, that would help.

breed808 commented 2 years ago

@CatMerc apologies for the delayed response; I've managed to locate the PDH files I was working on last August, but can't determine the changes I made (no Git tracking :disappointed: )

You'll want to grab the contents of the win_perf_counters directory and use that as a base for the PDH source. I might be free to assist with integrating this into windows_exporter if you have any questions.

Sorry I can't be of more help with this.

CatMerc commented 2 years ago

@breed808 thanks! I'll have a look when I have the time :)

rob-scheepens commented 2 years ago

@CatMerc @breed808 : let's revive this old discussion as I'm seeing this again in testing. @CatMerc did you make any progress using PDH yet?

Gt3pccb commented 2 years ago

Isn't the problem that Get-Counter is a lot slower when you have to fetch a bunch of counters? If you do a Get-CimInstance Win32_PerfRawData_PerfDisk_LogicalDisk you get a bunch of very valuable counters in a single call.

rob-scheepens commented 2 years ago

@Gt3pccb : correct, see https://github.com/prometheus-community/windows_exporter/issues/799#issuecomment-886566205, where it is mentioned that using Get-Counter is undesirable.

See https://github.com/prometheus-community/windows_exporter/issues/799#issuecomment-858376472 and following few comments on using Get-CimInstance. I'm not a developer, but perhaps using PDH is easier than using direct WMI queries.

@CatMerc @hashi825 @monismonther @breed808 : thoughts?

Gt3pccb commented 2 years ago

@rob-scheepens I was not suggesting using Get-Counter :) ; although I am not a developer, I reckon using Get-CimInstance is far more effective.

I can't find how you would use PDH from PowerShell.

These are some examples of how I use Get-CimInstance:

$fPhyDiskCounters = Get-CimInstance -ClassName Win32_PerfFormattedData_PerfDisk_LogicalDisk | Where-Object {$_.Name -ilike ($DriveLetter + ":")}

$fRefsPerfCounters = Get-CimInstance -ClassName Win32_PerfRawData_Counters_ReFS | Where-Object {$_.Name -ilike ($DriveLetter + ":")}

$fRefsPerfCounters = Get-CimInstance -ClassName Win32_PerfFormattedData_Counters_ReFS | Where-Object {$_.Name -ilike ($DriveLetter + ":")}

The biggest problem I have is that I can't get latency from any of the Get-CimInstance counters.

Get-CimInstance -Namespace root/CIMV2 -ClassName Win32_PerfFormattedData_PerfDisk_PhysicalDisk | Where-Object {$_.Name -ilike "*1*"}

Caption                 :
Description             :
Name                    : 1 D:
Frequency_Object        :
Frequency_PerfTime      :
Frequency_Sys100NS      :
Timestamp_Object        :
Timestamp_PerfTime      :
Timestamp_Sys100NS      :
AvgDiskBytesPerRead     : 0
AvgDiskBytesPerTransfer : 136970
AvgDiskBytesPerWrite    : 136970
AvgDiskQueueLength      : 81
AvgDiskReadQueueLength  : 0
AvgDisksecPerRead       : 0
AvgDisksecPerTransfer   : 0
AvgDisksecPerWrite      : 0
AvgDiskWriteQueueLength : 81
CurrentDiskQueueLength  : 107
DiskBytesPersec         : 4623661679
DiskReadBytesPersec     : 0
DiskReadsPersec         : 0
DiskTransfersPersec     : 33756
DiskWriteBytesPersec    : 4623661679
DiskWritesPersec        : 33756
PercentDiskReadTime     : 0
PercentDiskTime         : 8189
PercentDiskWriteTime    : 8189
PercentIdleTime         : 0
SplitIOPerSec           : 16441
PSComputerName          :

Get-CimInstance -Namespace root/CIMV2 -ClassName Win32_PerfRawData_PerfDisk_PhysicalDisk | Where-Object {$_.Name -ilike "*1*"}

Caption                      :
Description                  :
Name                         : 1 D:
Frequency_Object             : 0
Frequency_PerfTime           : 10000000
Frequency_Sys100NS           : 10000000
Timestamp_Object             : 0
Timestamp_PerfTime           : 728909414299
Timestamp_Sys100NS           : 133101599173605355
AvgDiskBytesPerRead          : 243150336
AvgDiskBytesPerRead_Base     : 3538
AvgDiskBytesPerTransfer      : 14210945256448
AvgDiskBytesPerTransfer_Base : 103874923
AvgDiskBytesPerWrite         : 14210702106112
AvgDiskBytesPerWrite_Base    : 103871385
AvgDiskQueueLength           : 2648718937311
AvgDiskReadQueueLength       : 8736542
AvgDisksecPerRead            : 8736542
AvgDisksecPerRead_Base       : 3538
AvgDisksecPerTransfer        : 3019082975
AvgDisksecPerTransfer_Base   : 103874923
AvgDisksecPerWrite           : 3010346433
AvgDisksecPerWrite_Base      : 103871385
AvgDiskWriteQueueLength      : 2648710200769
CurrentDiskQueueLength       : 100
DiskBytesPersec              : 14210945256448
DiskReadBytesPersec          : 243150336
DiskReadsPersec              : 3538
DiskTransfersPersec          : 103874923
DiskWriteBytesPersec         : 14210702106112
DiskWritesPersec             : 103871385
PercentDiskReadTime          : 8736542
PercentDiskReadTime_Base     : 133101599173605355
PercentDiskTime              : 2648718937311
PercentDiskTime_Base         : 133101599173605355
PercentDiskWriteTime         : 2648710200769
PercentDiskWriteTime_Base    : 133101599173605355
PercentIdleTime              : 305587399
PercentIdleTime_Base         : 133101599173605355
SplitIOPerSec                : 50665688
PSComputerName               :

rob-scheepens commented 2 years ago

@Gt3pccb : while you can use PDH from PowerShell, it is meant to be consumed from app code. Example of importing pdh.dll in PowerShell: https://powershellmagazine.com/2013/07/19/querying-performance-counters-from-powershell/

On the latencies: this is what I see on a PhysicalDisk. Once I started a write workload, I started seeing latency:

PS C:\> Get-CimInstance Win32_PerfRawData_PerfDisk_PhysicalDisk | ? { $_.Name -like "1" } |  ft Name,AvgDiskSecPer*

Name AvgDisksecPerRead AvgDisksecPerRead_Base AvgDisksecPerTransfer AvgDisksecPerTransfer_Base AvgDisksecPerWrite AvgDisksecPerWrite_Base
---- ----------------- ---------------------- --------------------- -------------------------- ------------------ -----------------------
1                    0                      0             114184627                       1636          114184627                    1636

For the System volume (C:):


Name AvgDisksecPerRead AvgDisksecPerRead_Base AvgDisksecPerTransfer AvgDisksecPerTransfer_Base AvgDisksecPerWrite AvgDisksecPerWrite_Base
---- ----------------- ---------------------- --------------------- -------------------------- ------------------ -----------------------
C:            10051863                    710              25823999                       1746           15772136                    1036

Your output also shows latencies (AvgDisksec* values) so I am not sure about your comment.

rob-scheepens commented 2 years ago

@breed808 @CatMerc : I imported win_perf_counters in https://github.com/rob-scheepens/windows_exporter/blob/rob-physical_disk-collector/collector/physical_disk.go#L13. What would be the next step to invoke PDH functionality from physical_disk.go?

cc: @carlpett

Gt3pccb commented 2 years ago

@rob-scheepens I was referring to the formatted counters, and of course I am using this in a PowerShell script to monitor firmware latencies across hundreds of NVMe drives during our stack development.

This is what I have come up with:

Clear-Host
$sampleTimeMilSec = 100

$start = Get-CimInstance -Namespace root/CIMV2 -ClassName Win32_PerfRawData_PerfDisk_PhysicalDisk | Where-Object {$_.Name -ilike "*1*"}

Start-Sleep -Milliseconds $sampleTimeMilSec

$end = Get-CimInstance -Namespace root/CIMV2 -ClassName Win32_PerfRawData_PerfDisk_PhysicalDisk | Where-Object {$_.Name -ilike "*1*"}

$AvgDiskMicrosecPerTransfer = (($end.AvgDisksecPerTransfer - $start.AvgDisksecPerTransfer) / ($end.AvgDisksecPerTransfer_Base - $start.AvgDisksecPerTransfer_Base)) / (($end.Frequency_Sys100NS / 1000000) - ($sampleTimeMilSec / 1000))
$AvgDiskMicrosecPerTransferF = "{0:N3}" -f $AvgDiskMicrosecPerTransfer

$AvgDisksecPerTransfer = (($end.AvgDisksecPerTransfer - $start.AvgDisksecPerTransfer) / ($end.AvgDisksecPerTransfer_Base - $start.AvgDisksecPerTransfer_Base)) / (($end.Frequency_Sys100NS) - ($sampleTimeMilSec / 1000))
$AvgDisksecPerTransferF = "{0:N3}" -f $AvgDisksecPerTransfer

$AvgDiskQueueLength = "{0:N3}" -f (($end.AvgDiskQueueLength - $start.AvgDiskQueueLength) / (($end.Frequency_Sys100NS) * ($sampleTimeMilSec / 1000)))

$DiskBytesPersec = $end.DiskBytesPersec - $start.DiskBytesPersec

$DiskBytesPersecGB = [math]::Round((($DiskBytesPersec / 1GB) / ($sampleTimeMilSec / 1000)), 3)

$CurrentDiskQueueLength = "{0:N3}" -f (($end.CurrentDiskQueueLength + $start.CurrentDiskQueueLength) / 2)

If ($AvgDiskMicrosecPerTransfer -gt 0) {
    Write-Host "AvgDiskMicrosecPerTransfer = $AvgDiskMicrosecPerTransferF"
    Write-Host "AvgDisksecPerTransfer = $AvgDisksecPerTransferF"
}
Write-Host "AvgDiskQueueLength = $AvgDiskQueueLength"
Write-Host "CurrentDiskQueueLength = $CurrentDiskQueueLength"
Write-Host "DiskBytesPersecGB = $DiskBytesPersecGB"

breed808 commented 2 years ago

@rob-scheepens have a look at the functions available in https://github.com/influxdata/telegraf/blob/master/plugins/inputs/win_perf_counters/pdh.go. It may be possible to wrap some of these PDH functions for use in the collect function of your physical_disk collector.

rob-scheepens commented 2 years ago

@breed808 I am looking at how this was done in https://github.com/juyaojia/opentelemetry-yjbranch/tree/a60f6beefdb11e356d7c052f7509cad46d8228a2/receiver/windowsperfcountersreceiver, as that gives an example of an exporter created with PDH. Note: I've created a quick test using PDH in Rust on https://github.com/rob-scheepens/windows-disk-exporter, and that confirms that using PDH we get the right latencies back.

rob-scheepens commented 1 year ago

@breed808 : here is a quick sample that wraps PDH via telegraf's win_perf_counters package:

package main

import (
    "fmt"
    "time"

    "github.com/influxdata/telegraf/plugins/inputs/win_perf_counters"
)

func main() {

    var query win_perf_counters.PerformanceQueryImpl

    query.Open()

    counter, _ := query.AddEnglishCounterToQuery("\\PhysicalDisk(1)\\Avg. Disk sec/Write")

    for {
        time.Sleep(time.Second)
        query.CollectData()
        //query.GetRawCounterValue(counter)
        fmt.Println(query.GetFormattedCounterValueDouble(counter))
    }

    query.Close()

}

Just trying to figure out how to transfer this into physical_disk.go for an experiment.
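
For what it's worth, one possible shape for such an experiment, strictly as a sketch: register a plain prometheus GaugeFunc that reads the cooked PDH value at scrape time, using the same telegraf win_perf_counters calls as in the snippet above. The metric name and port are made up here, and this bypasses windows_exporter's internal collector interface entirely:

package main

import (
    "log"
    "net/http"

    "github.com/influxdata/telegraf/plugins/inputs/win_perf_counters"
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

func main() {
    var query win_perf_counters.PerformanceQueryImpl
    if err := query.Open(); err != nil {
        log.Fatal(err)
    }
    counter, err := query.AddEnglishCounterToQuery(`\PhysicalDisk(1)\Avg. Disk sec/Write`)
    if err != nil {
        log.Fatal(err)
    }
    // Prime PDH once so the first scrape already has two raw samples to cook from.
    if err := query.CollectData(); err != nil {
        log.Fatal(err)
    }

    latency := prometheus.NewGaugeFunc(prometheus.GaugeOpts{
        Name: "windows_physical_disk_write_latency_seconds", // illustrative name, not an existing metric
        Help: "Cooked Avg. Disk sec/Write read via PDH at scrape time.",
    }, func() float64 {
        // Each scrape collects a fresh sample and returns the formatted (cooked) value in seconds.
        if err := query.CollectData(); err != nil {
            return 0
        }
        v, err := query.GetFormattedCounterValueDouble(counter)
        if err != nil {
            return 0
        }
        return v
    })

    prometheus.MustRegister(latency)
    http.Handle("/metrics", promhttp.Handler())
    log.Fatal(http.ListenAndServe(":9183", nil))
}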

rob-scheepens commented 1 year ago

@breed808 : on https://github.com/rob-scheepens/windows_exporter/tree/rob-physical_disk-collector-pdh-lxnwin we've created a physical_disk collector that's using PDH to get the perf data. Can you take a look and give it a try?

cc: @monismonther @cbwest3-ntnx

cbwest3 commented 1 year ago

@monismonther, we found that scaling the latency by the number of IOPs gave us an average latency number we could use. See https://github.com/prometheus-community/windows_exporter/issues/1154#issuecomment-1540538857.

github-actions[bot] commented 11 months ago

This issue has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.