konkor / cpufreq

System Monitor and Power Manager
https://konkor.github.io/cpufreq/
GNU General Public License v3.0

low latency performance settings #114

Open ayoungethan opened 5 years ago

ayoungethan commented 5 years ago

Low-latency setups require predictable task scheduling that reduces or eliminates both latency and jitter as much as possible. Since power-state transitions and Turbo Boost cause latency spikes and therefore increase jitter, this extension could be very useful for eliminating the extra latency and jitter that certain settings cause.

Thus, even "conservative" settings can improve audio performance by reducing the amount of CPU scaling that occurs.

Rather than offering specific "low latency" performance presets, an option to link the min and max frequency sliders, either exactly or within a maximum percentage of each other, could help users create better settings for low-latency audio-video production.
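
To make the slider-linking idea concrete, here is a minimal sketch of one possible reading of it: the minimum frequency is kept within a chosen percentage of the maximum, and a gap of 0% links the two exactly. The function name, its parameters, and the kHz values are hypothetical and are not taken from the extension's code.

```python
def link_min_to_max(max_khz, max_gap_pct, hw_min_khz, hw_max_khz):
    """Return (min_khz, max_khz) with min held within max_gap_pct of max.

    All names here are illustrative assumptions, not the extension's API.
    """
    if not 0 <= max_gap_pct <= 100:
        raise ValueError("max_gap_pct must be between 0 and 100")
    # Keep the requested maximum inside the range the hardware reports.
    max_khz = max(hw_min_khz, min(max_khz, hw_max_khz))
    # The lowest minimum that still respects the allowed gap below max.
    floor_khz = max_khz * (100 - max_gap_pct) // 100
    # The minimum can never drop below the hardware minimum or exceed max.
    min_khz = max(hw_min_khz, min(floor_khz, max_khz))
    return min_khz, max_khz


# Hypothetical CPU that scales from 0.8 GHz to 3.9 GHz.
exact = link_min_to_max(3_000_000, 0, 800_000, 3_900_000)
loose = link_min_to_max(3_000_000, 20, 800_000, 3_900_000)
print(exact, loose)
```

With a gap of 0 the governor has no range left to scale in. A small non-zero gap keeps some scaling while bounding how far the frequency can drop.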

See for example https://shabiryusuf.wordpress.com/tag/intel-turbo-boost/

konkor commented 5 years ago

It's a complex question. Such options depend on the hardware (CPU model, architecture, manufacturer limitations, etc.), so they cannot be applied dynamically. They require kernel boot command-line parameters set in the bootloader, which means at least a restart. Also, grub2 is not the only bootloader for Linux. Some settings could overheat the CPU, and you can get the opposite result, with unbounded latency while the processor cores cool down. So it needs very good cooling and will work better on server processors.

konkor commented 5 years ago

P-states are designed to reduce temperature and power consumption.

# cpupower idle-info
CPUidle driver: intel_idle
CPUidle governor: menu
analyzing CPU 0:

Number of idle states: 5
Available idle states: POLL C1-NHM C1E-NHM C3-NHM C6-NHM

POLL:
Flags/Description: CPUIDLE CORE POLL IDLE
Latency: 0
Usage: 1761
Duration: 3902427

C1-NHM:
Flags/Description: MWAIT 0x00
Latency: 3
Usage: 217236
Duration: 65744097

C1E-NHM:
Flags/Description: MWAIT 0x01
Latency: 10
Usage: 143876
Duration: 62857536

C3-NHM:
Flags/Description: MWAIT 0x10
Latency: 20
Usage: 788433
Duration: 620782145

C6-NHM:
Flags/Description: MWAIT 0x20
Latency: 200
Usage: 3872905
Duration: 13574453811

From the output of cpupower idle-info you can see that the CPU was fully active (in the POLL state) for just 3.9 seconds, while it spent close to 4 hours in the deepest state, C6-NHM.
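
The residency figures can be read off the output programmatically. The sketch below is only an illustration: it assumes the Duration values are in microseconds (which matches the 3.9-second reading of POLL) and that the text has the same layout as the output pasted above. The function name is hypothetical.

```python
import re


def idle_residency(text):
    """Map idle-state name -> time spent in it, in seconds.

    Assumes `cpupower idle-info` style text with Duration in microseconds.
    """
    residency = {}
    state = None
    for line in text.splitlines():
        line = line.strip()
        # A state block starts with a bare name such as "C6-NHM:".
        header = re.fullmatch(r"([A-Za-z0-9-]+):", line)
        if header:
            state = header.group(1)
            continue
        duration = re.fullmatch(r"Duration:\s*(\d+)", line)
        if duration and state is not None:
            residency[state] = int(duration.group(1)) / 1_000_000
    return residency


# Two of the state blocks from the output above.
sample = """POLL:
Latency: 0
Usage: 1761
Duration: 3902427

C6-NHM:
Latency: 200
Usage: 3872905
Duration: 13574453811
"""

for name, seconds in idle_residency(sample).items():
    print(f"{name}: {seconds:.1f} s ({seconds / 3600:.2f} h)")
```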

So you can use the processor.max_cstate=0 and/or intel_idle.max_cstate=0 kernel boot parameters to limit idling to the first state, POLL. But the Linux kernel will not allow this and will replace the value with max_cstate=1 (idle states: POLL and C1-NHM in my case) to make sure the CPU does not get too hot. There is a hack to force the POLL state anyway via the idle=poll option. You have to disable the intel_pstate driver so

To reduce temperature and avoid thermal throttling, you should disable Turbo Boost and reduce the maximum frequency. You don't need MEGA GIGAHERTZ for audio processing.

BE CAREFUL ON SMALL/ULTRA/MEGA/SLIM LAPTOPS, THEIR COOLING IS SO PATHETIC!!!

konkor commented 5 years ago

In computer science, a real-time system is a system that must respond within a required time. For audio processing at 48 kHz (DVD quality), a response time of 20.8 microseconds or less is enough for me to consider my system real-time. In my case that means intel_idle.max_cstate=3 (the C3-NHM state).
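
The 20.8 microsecond figure is one sample period at 48 kHz (1/48000 s). The sketch below reproduces that arithmetic and picks the deepest idle state whose latency fits the budget. It uses the Latency values from the cpupower output earlier in this thread and assumes they are in microseconds; the function names are illustrative.

```python
def sample_period_us(sample_rate_hz):
    """Duration of one audio sample in microseconds."""
    return 1_000_000 / sample_rate_hz


def deepest_state_within(budget_us, states):
    """Return the last (deepest) state whose exit latency fits the budget.

    `states` is a list of (name, latency_us) ordered from shallow to deep.
    """
    allowed = [name for name, latency_us in states if latency_us <= budget_us]
    return allowed[-1] if allowed else None


# Latency values copied from the cpupower idle-info output (assumed to be us).
NHM_STATES = [
    ("POLL", 0),
    ("C1-NHM", 3),
    ("C1E-NHM", 10),
    ("C3-NHM", 20),
    ("C6-NHM", 200),
]

budget = sample_period_us(48_000)
print(f"budget: {budget:.1f} us, deepest state: "
      f"{deepest_state_within(budget, NHM_STATES)}")
```

C3-NHM is the fourth entry in this list (index 3 counting from POLL as 0), which lines up with the max_cstate=3 value quoted above.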

ayoungethan commented 5 years ago

According to https://www.anandtech.com/show/10959/intel-launches-7th-generation-kaby-lake-i7-7700k-i5-7600k-i3-7350k/3 response times are still too slow: "a CPU can now reach peak frequency in 10-15 milliseconds rather than 30"

When low-latency inputs create transient CPU loads within 2-10 ms, this is still far too slow a response to prevent DSP or CPU overload. Thus, low-latency throughput performance seems to depend heavily on the minimum frequency.
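
A rough way to see the mismatch is to compare the quoted 10-15 ms ramp time with how long one audio buffer lasts. The buffer sizes below are hypothetical examples rather than values from this thread, and the check is deliberately simplistic: it only asks whether the frequency ramp could finish within a single buffer period.

```python
def buffer_period_ms(frames, sample_rate_hz):
    """How long one audio buffer lasts, in milliseconds."""
    return 1000 * frames / sample_rate_hz


def ramp_fits(frames, sample_rate_hz, ramp_ms):
    """True if the CPU could reach peak frequency within one buffer period."""
    return ramp_ms <= buffer_period_ms(frames, sample_rate_hz)


# Hypothetical buffer sizes at 48 kHz against the quoted 10 ms best case.
for frames in (128, 256, 512):
    period = buffer_period_ms(frames, 48_000)
    print(f"{frames} frames: {period:.2f} ms, "
          f"ramp fits: {ramp_fits(frames, 48_000, 10)}")
```

For the smaller buffers the deadline passes before the ramp completes, so only the frequency the CPU is already running at counts, which is the argument for raising the minimum.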

konkor commented 5 years ago

  1. Disable the intel_pstate driver (intel_pstate=disable).
  2. Select the fixed acpi performance governor or, as I would prefer for daily work, ondemand. The ondemand governor is the fastest dynamic ACPI governor. The CPU will still use C-states when idle anyway.
  3. Disable Turbo Boost and lower the max frequency by at least one step. This gives you more stability, less overheating and cooling time, and fewer P-states to switch between.
  4. Raise the minimum frequency to make the system more responsive.
  5. Optionally, disable deep C-states for idle via kernel boot options, as in step 1.

ayoungethan commented 5 years ago

Isn't it easier and more flexible to just set the min frequency to provide the desired DSP capacity in user space?

That way you don't need to change the governor from the generally-optimal powersave (analogous to acpi ondemand). You can just increase min. frequency for QoS.

Also, do you know of any desktop usage scenarios that actually benefit from decreasing maximum available frequency?

Increasing min. frequency reduces p-state switching and provides more stability to low latency throughput, but still allows the computer to scale based on non-low latency workloads, which could potentially protect against audio glitches.
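
For reference, raising the minimum frequency from user space amounts to writing a value in kHz to each CPU's scaling_min_freq file in the cpufreq sysfs tree. The sketch below is a minimal illustration and not how the extension does it. The function name and the cpu_root parameter are assumptions (the parameter exists so the function can be tried against a scratch directory), and writing to the real sysfs tree requires root.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")


def set_min_freq_khz(min_khz, cpu_root=CPU_ROOT):
    """Write a clamped minimum frequency for every CPU; return files written."""
    written = []
    for cpufreq in sorted(Path(cpu_root).glob("cpu[0-9]*/cpufreq")):
        # Clamp between the hardware minimum and the current maximum limit.
        hw_min = int((cpufreq / "cpuinfo_min_freq").read_text())
        cur_max = int((cpufreq / "scaling_max_freq").read_text())
        target = max(hw_min, min(min_khz, cur_max))
        target_file = cpufreq / "scaling_min_freq"
        target_file.write_text(f"{target}\n")
        written.append(target_file)
    return written
```

Run as root, `set_min_freq_khz(2_000_000)` would request a 2 GHz floor on every core while leaving the governor and the maximum untouched.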

konkor commented 5 years ago

  1. Increasing the minimum frequency increases overall system responsiveness.
  2. The powersave governor of the intel_pstate driver or of the acpi driver? Generally, it depends on your tasks and system configuration.
  3. Decreasing the maximum frequency makes the system more stable and more predictable. For example, I use a fixed powersave governor when I'm playing some games. Likewise, I disable TB and reduce the max frequency to avoid overheating and thermal throttling during long compilations such as GNOME Shell, WebKit, etc.

It all depends on what you want to achieve and what you have available to do it.

konkor commented 5 years ago

An interesting article for you: https://www.pcworld.com/article/3173618/kaby-lake-is-unleashed-with-kernel-410.html#getting_around_the_problem Its last part, "Getting around the problem", explains how to fix it, and they recommend just one thing: disabling the intel_pstate driver (intel_pstate=disable) in the bootloader. I also found links to other articles there (https://unix.stackexchange.com/questions/513330/does-linux-kernel-support-intel-speed-shift)

ayoungethan commented 5 years ago

I mean the intel_pstate powersave policy.

I understand that, in theory, lowering the top frequencies can "avoid overheating and thermal throttling" and "increase and make more predictable system stability", but I have not seen any impact in the real world. I use Pianoteq to gauge performance, as it is very CPU-dependent and has very nice CPU DSP load and performance monitoring tools. I find no overall performance difference, for example, between setting my CPU to min 2 GHz up to TB (about 65 on the performance index) and 2 GHz with no scaling (also about 65). The CPU still delivers the performance of the lowest available frequency.

Likewise, I observe that intel_pstate automatically adjusts the maximum frequency for thermal reasons during longer tasks, so I get no benefit from switching between, for example, the "performance" and "powersave" governors during a lengthy encode. The CPU already determines its maximum sustainable thermal envelope and will even override a user's min setting if it causes too much heat. On my processor (i5 8265u in a system76 darter pro) this is about 2.3 GHz at about 20 °C ambient temperature, or about 1.5x TDP (1.6 GHz "guaranteed").

So in practice, while boosting the min CPU frequency improves low-latency throughput for audio processing, I don't see any drawback to letting the CPU calculate the max. There is even a potential benefit: it leaves performance capacity available if a non-audio component has a transient increase in load, which the system can respond to fast enough because that increase in capacity doesn't need to meet low-latency timeframes. This can potentially prevent CPU overloads by retaining headroom. The only drawback I see is if the user believes the additional dynamic headroom can be reliably used for low-latency throughput, which it can't.

I have seen those references and appreciate this discussion, thank you!