intel / thermal_daemon

Thermal daemon for IA
GNU General Public License v2.0

Help with a straightforward configuration file #424

Closed by rdiez 1 month ago

rdiez commented 10 months ago

I have 2 noisy computers:

I thought that lowering fan noise would be a rather common requirement, but I found it quite hard to do under Linux. I came up with the following configuration file, which I would like to reuse as much as possible.

I still have some open questions, marked with "TODO" notes; just search for TODO below. Could someone shed light on some of these points, or suggest improvements? Many thanks in advance.

<?xml version="1.0"?>
<ThermalConfiguration>
  <Platform>
    <Name>My generic x86 thermal configuration</Name>

    <!-- An asterisk ('*') matches any platform. -->
    <ProductName>*</ProductName>

    <!-- QUIET will only use passive cooling devices. For a CPU, that would be dynamic voltage and frequency scaling (DVFS).
         PERFORMANCE will only select active devices like fans. -->
    <Preference>QUIET</Preference>

    <ThermalZones>
      <ThermalZone>
        <Type>cpu</Type>
        <TripPoints>
          <TripPoint>

            <!-- You need to specify a <SensorType>. Otherwise, you will get this error in the thermald log:
                   [ERR]XML zone: invalid sensor type []
                 Look for the available sensor types in the zone dump for the 'cpu' zone, in the thermald log. -->
            <SensorType>x86_pkg_temp</SensorType>

            <!-- Temperature at which to take action in millicelsius. -->
            <Temperature>70000</Temperature>

            <!-- max/passive/active:
                 - passive: This trip point will enable passive cooling.
                 - active:  This trip point will enable active cooling.
                 - max: aggressively throttle to avoid reaching the given temperature.
                 TODO: Ask what happens if you specify 'active' but the cooling devices below are all passive.
                 TODO: Ask how the <Preference> above affects this setting, for example,
                       if you specify QUIET above but 'active' here. Does the whole 'TripPoint' get ignored then?
            -->
            <type>passive</type>

            <!-- SEQUENTIAL | PARALLEL. Controls how many cooling devices can be activated
                 when the trip point temperature is exceeded.
                 SEQUENTIAL means exhausting the first cooling device before trying the next. -->
            <ControlType>SEQUENTIAL</ControlType>

            <!-- You need to specify what cooling devices to use. Otherwise, this trip point will have no effect.

                 TODO: Ask why thermald cannot just default to using all known cooling devices.
                       There is a default order in which cooling devices are activated, which can be overridden
                       with the configuration file thermal-cpu-cdev-order.xml, so all cooling devices
                       are known one way or another.

                 Look for the available cooling device (cdev) types in the zone dump for the 'cpu' zone, in the thermald log.
                 We could also use other cooling devices like 'rapl_controller', 'intel_pstate' or 'cpufreq'.
                 TODO: I do not know yet which ones yield better results, or are preferable under certain circumstances.

                 I did some tests on my laptop with an Intel Core i5-8265U.
                 Cooling device 'intel_powerclamp' alone cannot keep the CPU cool enough: when the CPU is running
                 very hot, it overshoots the temperature limit by up to 20 degrees Celsius.
                   Apparently, it does not reduce the CPU frequency enough.
                 Cooling device 'Processor' reduces the CPU frequency further. It still overshoots the target temperature,
                   but not by as much.
                   Using 'intel_powerclamp' and 'Processor' together does not work much better than 'Processor' alone.
                 TODO: Ask how to speed up thermald's reaction time, so that the temperatures do not overshoot so much.
                       On my test systems, thermald is apparently using polling mode with a 4-second interval (according to the log output).
                       Reducing the interval might make it react faster, for example by starting thermald with
                       the 'poll-interval' option set to 1 (written here without its leading dashes, because
                       a double hyphen is not allowed inside an XML comment).
                       But that does not seem to have any effect, probably because the temperature sensors
                       are delivering measurements asynchronously anyway.

                 Parameter 'influence' is relative: the higher the value, the more influence. If the highest influence is 100,
                 then we can think of the values as percentages.

                 I do not know yet what the 'SamplingPeriod' parameter means, or what unit it uses.
                 Whether the sampling period was set to 1 or 5 did not make much difference in my tests,
                 probably because the sensors in my laptop delivered measurements asynchronously.
                 Even with a 'SamplingPeriod' of 5, thermald apparently adjusted the levels every 4 seconds.
                 I asked a question about it here:
                   Question about <SamplingPeriod>
                   https://github.com/intel/thermal_daemon/issues/418

                 I do not know yet what <index> does, or what values it is allowed to have.
                 I asked a question about it here:
                   Question about <index>
                   https://github.com/intel/thermal_daemon/issues/419
            -->

            <CoolingDevice>
              <index>1</index>
              <type>intel_powerclamp</type>
              <influence>100</influence>
              <SamplingPeriod>5</SamplingPeriod>
            </CoolingDevice>

            <CoolingDevice>
              <index>2</index>
              <type>Processor</type>
              <influence>80</influence>
              <SamplingPeriod>5</SamplingPeriod>
            </CoolingDevice>

          </TripPoint>
        </TripPoints>
      </ThermalZone>
    </ThermalZones>
  </Platform>
</ThermalConfiguration>
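
The comments in the file name two requirements for a trip point: it needs a <SensorType>, and it needs at least one <CoolingDevice>. The following is a minimal sketch of a checker for just those two points. It is not part of thermald and does not reproduce thermald's own validation; the function name lint_trip_points is made up for illustration. Note that Python's XML parser is strict and raises ParseError on any well-formedness problem, such as a double hyphen inside a comment.

```python
import xml.etree.ElementTree as ET


def lint_trip_points(xml_text):
    """Return warnings for trip points lacking a sensor type or cooling devices.

    Hypothetical helper: it only checks the two requirements mentioned in the
    configuration comments, nothing else.
    """
    root = ET.fromstring(xml_text)
    warnings = []
    for zone in root.iter("ThermalZone"):
        zone_type = (zone.findtext("Type") or "").strip()
        for n, trip in enumerate(zone.iter("TripPoint"), start=1):
            where = f"zone '{zone_type}', trip point {n}"
            # An empty or absent <SensorType> is what the log error refers to.
            if not (trip.findtext("SensorType") or "").strip():
                warnings.append(f"{where}: missing <SensorType>")
            # Without any <CoolingDevice>, the trip point has nothing to act on.
            if trip.find("CoolingDevice") is None:
                warnings.append(f"{where}: no <CoolingDevice> listed")
    return warnings
```

To use it, read the configuration file into a string, pass it to lint_trip_points, and print each returned warning; an empty list means both requirements are met for every trip point.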
spandruvada commented 8 months ago

You can try example 1 from man thermal_conf.xml. Lower the CPU trip temperature there to reduce fan noise.

spandruvada commented 7 months ago

Do you have any sysfs entry that user space can use to control the fan? If so, you can write a config file; otherwise you cannot.
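
One way to look for such an entry is to list the cooling devices that the kernel exposes under /sys/class/thermal. The sketch below assumes the standard Linux thermal sysfs layout (cooling_device*/type, cur_state, max_state); the function name is made up for illustration, and which type string a fan reports depends on the platform and driver, so the output has to be read by hand.

```python
from pathlib import Path


def _read_attr(device_dir, attr):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        return (device_dir / attr).read_text().strip()
    except OSError:
        return None


def list_cooling_devices(base="/sys/class/thermal"):
    """Return (name, type, cur_state, max_state) for each cooling device found."""
    devices = []
    for dev in sorted(Path(base).glob("cooling_device*")):
        devices.append((
            dev.name,
            _read_attr(dev, "type"),
            _read_attr(dev, "cur_state"),
            _read_attr(dev, "max_state"),
        ))
    return devices


# Print whatever the running kernel exposes; prints nothing if there are no entries.
for name, kind, cur, top in list_cooling_devices():
    print(f"{name}: type={kind} cur_state={cur} max_state={top}")
```

If none of the listed devices corresponds to a fan, there is probably nothing for a configuration file to drive, which matches the point above.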