influxdata / telegraf

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
https://influxdata.com/telegraf
MIT License

inputs.intel_powerstat: Provide cpu core frequency estimation when scaling governor is intel_pstate and core is isolated #13830

Closed alysondeives closed 4 months ago

alysondeives commented 1 year ago

Use Case

There is an issue with intel_pstate where scaling_cur_freq is reported as 0 when nohz_full is enabled (i.e. the core is isolated). This makes it difficult to calculate metrics that use scaling_cur_freq on the system, and it generates outliers.

Example: kernel params: nohz_full=1-19,21-39 isolcpus=nohz,domain,managed_irq, rcu_nocbs=1-19,21-39 kthread_cpus=0,20 irqaffinity=1-19,21-39

$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq 
3876895
$ cat /sys/devices/system/cpu/cpu1/cpufreq/scaling_cur_freq 
0
$ sudo cpupower -c 0,1 frequency-info 
Password: 
analyzing CPU 0:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency:  Cannot determine or is not supported.
  hardware limits: 800 MHz - 4.10 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 800 MHz and 4.10 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: 3.90 GHz (asserted by call to kernel)
  boost state support:
    Supported: yes
    Active: yes
analyzing CPU 1:
  driver: intel_pstate
  CPUs which run at the same hardware frequency: 1
  CPUs which need to have their frequency coordinated by software: 1
  maximum transition latency:  Cannot determine or is not supported.
  hardware limits: 800 MHz - 4.10 GHz
  available cpufreq governors: performance powersave
  current policy: frequency should be within 800 MHz and 4.10 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency: Unable to call hardware
  current CPU frequency: Unable to call to kernel
  boost state support:
    Supported: yes
    Active: yes

Output from telegraf metrics

powerstat_core_cpu_frequency_mhz{core_id="0",cpu_id="0",host="controller-0",package_id="0"} 3876.8
powerstat_core_cpu_frequency_mhz{core_id="1",cpu_id="1",host="controller-0",package_id="0"} 0

It would be great if the intel_powerstat plugin could estimate the frequency of isolated cores. My naive approach to this situation is to set cur_freq according to scaling_governor: if scaling_governor is powersave, use the value from scaling_min_freq; otherwise use scaling_max_freq. Is there any way to improve this estimation?
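For illustration, the naive fallback described above could be sketched roughly as follows. This is only a sketch in Python, not code from the plugin (which is written in Go); the helper names `read_cpufreq` and `estimate_freq_mhz` and the `root` parameter are hypothetical, and the sysfs files are assumed to hold values in kHz as in the output shown earlier.

```python
from pathlib import Path

# Standard sysfs location; overridable so the logic can be tried on a fake tree.
CPUFREQ_ROOT = Path("/sys/devices/system/cpu")


def read_cpufreq(cpu: int, name: str, root: Path = CPUFREQ_ROOT) -> str:
    """Read one cpufreq attribute (e.g. scaling_cur_freq) for a CPU."""
    return (root / f"cpu{cpu}" / "cpufreq" / name).read_text().strip()


def estimate_freq_mhz(cpu: int, root: Path = CPUFREQ_ROOT) -> float:
    """Return scaling_cur_freq in MHz, or a governor-based guess when it is 0.

    Hypothetical helper: the guess is the naive one from this issue
    (powersave -> scaling_min_freq, anything else -> scaling_max_freq).
    """
    cur_khz = int(read_cpufreq(cpu, "scaling_cur_freq", root))
    if cur_khz > 0:
        return cur_khz / 1000.0

    governor = read_cpufreq(cpu, "scaling_governor", root)
    fallback = "scaling_min_freq" if governor == "powersave" else "scaling_max_freq"
    return int(read_cpufreq(cpu, fallback, root)) / 1000.0
```

The result for an isolated core is a guess based on the policy limits, not a measurement, so a consumer would probably want to be able to tell estimated values apart from real ones.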

Expected behavior

powerstat_core_cpu_frequency_mhz metric should report a frequency value for isolated cores.

Actual behavior

powerstat_core_cpu_frequency_mhz metric reports a frequency with value 0 for isolated cores.

Additional info

No response

powersj commented 1 year ago

powerstat_core_cpu_frequency_mhz metric reports a frequency with value 0 for isolated cores.

@zak-pawel the current behavior when we cannot get the frequency seems correct. However, there is a second question about estimating the behavior. I am not sure defaulting to minimum or maximum makes sense either as that could equally throw off someone using these values.

Thoughts?

p-zak commented 1 year ago

the current behavior when we cannot get the frequency seems correct.

I agree with that. And it is not even that we cannot get the frequency, but (as @alysondeives wrote) scaling_cur_freq contains 0.

I am not sure defaulting to minimum or maximum makes sense either as that could equally throw off someone using these values.

And I also agree that defaulting to minimum or maximum may be misleading and in many cases wrong. There is a way to calculate the current frequency using MSRs, but reading MSRs is more costly (in terms of core latency) than reading files from the filesystem.

Need to think about it.
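The comment above does not say which MSRs it has in mind. One commonly used approach, assumed here purely for illustration, is to sample the IA32_APERF (0xE8) and IA32_MPERF (0xE7) counters twice and scale the base frequency by the ratio of their deltas. The sketch below is in Python rather than the plugin's Go; the function names are hypothetical, `base_mhz` must be supplied by the caller, and reading `/dev/cpu/N/msr` assumes the `msr` kernel module is loaded and the process has sufficient privileges.

```python
import os
import struct

IA32_MPERF = 0xE7  # counts at a fixed reference rate while the core is in C0
IA32_APERF = 0xE8  # counts at the actual core clock while the core is in C0
COUNTER_MASK = (1 << 64) - 1  # both counters are 64 bits wide


def read_msr(cpu: int, reg: int) -> int:
    """Read one 64-bit MSR through the Linux msr driver (needs root)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        # The register address is used as the file offset.
        return struct.unpack("<Q", os.pread(fd, 8, reg))[0]
    finally:
        os.close(fd)


def effective_freq_mhz(base_mhz: float,
                       aperf0: int, mperf0: int,
                       aperf1: int, mperf1: int) -> float:
    """Average frequency between two samples: base * dAPERF / dMPERF."""
    d_aperf = (aperf1 - aperf0) & COUNTER_MASK  # tolerate counter wraparound
    d_mperf = (mperf1 - mperf0) & COUNTER_MASK
    if d_mperf == 0:
        return 0.0  # core never left idle during the interval
    return base_mhz * d_aperf / d_mperf
```

A caller would take two `read_msr` samples of each counter some interval apart and pass them to `effective_freq_mhz`. Each read has to execute on the target core, which is presumably the latency cost mentioned above and is particularly unwelcome on isolated cores.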

powersj commented 4 months ago

@alysondeives

Can you please file an upstream issue at https://github.com/intel/powertelemetry so that @p-zak can consider what to do here?

Thanks

telegraf-tiger[bot] commented 4 months ago

Hello! I am closing this issue due to inactivity. I hope you were able to resolve your problem. If not, please try posting this question in our Community Slack or Community Forums, or provide additional details in this issue and request that it be re-opened. Thank you!