redhat-performance / tuned

Tuning Profile Delivery Mechanism for Linux
GNU General Public License v2.0

hpc_compute profile failing on Red Hat Enterprise Linux CoreOS 410.84.202202012119-0 #412

Open ArangoGutierrez opened 2 years ago

ArangoGutierrez commented 2 years ago

On OpenShift, when applying

apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: openshift-node-hpc-compute
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - data: |
      [main]
      summary=Custom OpenShift node profile for HPC compute workloads
      include=openshift-node,hpc-compute
    name: openshift-node-hpc-compute

  recommend:
  - machineConfigLabels:
      machineconfiguration.openshift.io/role: "compute"
    priority: 20
    profile: openshift-node-hpc-compute

to a cluster with the following config:

I get the following error:

2022-02-04 15:45:37,756 ERROR    tuned.plugins.plugin_sysctl: Failed to read sysctl parameter 'vm.hugepages_treat_as_movable', the parameter does not exist
2022-02-04 15:45:37,756 ERROR    tuned.plugins.plugin_sysctl: sysctl option vm.hugepages_treat_as_movable will not be set, failed to read the original value.
2022-02-04 15:45:37,757 INFO     tuned.plugins.plugin_sysctl: reapplying system sysctl
2022-02-04 15:45:37,816 INFO     tuned.daemon.daemon: static tuning from profile 'openshift-node-hpc-compute' applied
I0204 15:45:37.825787    2323 controller.go:427] written "/etc/tuned/recommend.d/50-openshift.conf" to set TuneD profile openshift-node-hpc-compute
I0204 15:45:37.827545    2323 controller.go:901] updated Profile ip-10-0-129-72.ec2.internal stalld=<nil>, bootcmdline: 
I0204 15:45:38.001391    2323 controller.go:712] re-applying profile (openshift-node-hpc-compute) as the previous application ended with error(s)
I0204 15:45:38.001444    2323 controller.go:573] reloading tuned...
I0204 15:45:38.001456    2323 controller.go:576] sending HUP to PID 3489
E0204 15:45:38.001512    2323 controller.go:775] unable to sync(daemon/) requeued (0)
2022-02-04 15:45:38,001 INFO     tuned.daemon.daemon: stopping tuning
2022-02-04 15:45:38,024 INFO     tuned.daemon.daemon: terminating TuneD, rolling back all changes
2022-02-04 15:45:38,084 INFO     tuned.daemon.daemon: Running in automatic mode, checking what profile is recommended for your configuration.
2022-02-04 15:45:38,085 INFO     tuned.daemon.daemon: Using 'openshift-node-hpc-compute' profile
2022-02-04 15:45:38,086 INFO     tuned.profiles.loader: loading profile: openshift-node-hpc-compute
E0204 15:45:38.102694    2323 controller.go:775] unable to sync(daemon/) requeued (1)
2022-02-04 15:45:38,122 INFO     tuned.daemon.daemon: starting tuning
2022-02-04 15:45:38,127 INFO     tuned.plugins.base: instance cpu: assigning devices cpu4, cpu2, cpu6, cpu5, cpu0, cpu1, cpu7, cpu3
2022-02-04 15:45:38,128 INFO     tuned.plugins.plugin_cpu: We are running on an x86 GenuineIntel platform
2022-02-04 15:45:38,130 WARNING  tuned.plugins.plugin_cpu: your CPU doesn't support MSR_IA32_ENERGY_PERF_BIAS, ignoring CPU energy performance bias
2022-02-04 15:45:38,132 INFO     tuned.plugins.base: instance disk: assigning devices xvda
2022-02-04 15:45:38,134 INFO     tuned.plugins.base: instance net: assigning devices ens3
2022-02-04 15:45:38,137 INFO     tuned.plugins.plugin_cpu: setting new cpu latency 0
2022-02-04 15:45:38,140 ERROR    tuned.plugins.plugin_sysctl: Failed to read sysctl parameter 'vm.hugepages_treat_as_movable', the parameter does not exist
2022-02-04 15:45:38,140 ERROR    tuned.plugins.plugin_sysctl: sysctl option vm.hugepages_treat_as_movable will not be set, failed to read the original value.
2022-02-04 15:45:38,141 INFO     tuned.plugins.plugin_sysctl: reapplying system sysctl
2022-02-04 15:45:38,209 INFO     tuned.daemon.daemon: static tuning from profile 'openshift-node-hpc-compute' applied
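
The log repeats the same two ERROR lines on every re-apply, so it can help to reduce it to the list of sysctl names TuneD could not read. The following is a minimal sketch, not part of TuneD: the function name `missing_sysctls` is hypothetical, and the regular expression assumes the message wording stays exactly as quoted in the log above.

```python
import re

# Assumption: matches the plugin_sysctl wording exactly as it appears in the log above.
MISSING_SYSCTL = re.compile(
    r"ERROR\s+tuned\.plugins\.plugin_sysctl: "
    r"Failed to read sysctl parameter '([^']+)', the parameter does not exist"
)


def missing_sysctls(log_text):
    """Return the distinct sysctl names reported as nonexistent, in first-seen order."""
    seen = []
    for line in log_text.splitlines():
        match = MISSING_SYSCTL.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen
```

Fed the log above, this should report only `vm.hugepages_treat_as_movable`, even though the error is logged once per apply attempt.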

jmencak commented 2 years ago

So this is not the "smallest" reproducer, but thank you for the report! Also, the correct name of the profile is hpc-compute.

@nealepetrillo, as you contributed this profile, do you have any updates to it for current kernels? In particular, vm.hugepages_treat_as_movable is no longer present in modern kernels.

yarda commented 2 years ago

It seems the vm.hugepages_treat_as_movable knob has been dropped: https://lore.kernel.org/lkml/20171003072619.8654-1-mhocko@kernel.org/t/ It's not present in RHEL-8 and up.

I think we should drop it from the profile or we could set it conditionally (without the error).

The error:

2022-02-04 15:45:37,756 ERROR    tuned.plugins.plugin_sysctl: Failed to read sysctl parameter 'vm.hugepages_treat_as_movable', the parameter does not exist
2022-02-04 15:45:37,756 ERROR    tuned.plugins.plugin_sysctl: sysctl option vm.hugepages_treat_as_movable will not be set, failed to read the original value.

doesn't prevent the profile from loading; the only effect is that the vm.hugepages_treat_as_movable sysctl knob is not set.
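
To illustrate the "set it conditionally" idea, here is a minimal sketch that keeps only the sysctl settings the running kernel actually exposes. It is not how TuneD's sysctl plugin is implemented; the helper names `sysctl_path` and `existing_sysctls` are hypothetical, and the mapping assumes plain dotted names that translate directly to paths under `/proc/sys`.

```python
import os


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted sysctl name to its file under root (dots become path separators)."""
    # Assumption: no path component itself contains a dot (e.g. VLAN interface names).
    return os.path.join(root, *name.split("."))


def existing_sysctls(settings, root="/proc/sys"):
    """Return only the settings whose sysctl file exists under root."""
    return {
        name: value
        for name, value in settings.items()
        if os.path.exists(sysctl_path(name, root))
    }
```

On a kernel without the knob, passing a mapping that includes `vm.hugepages_treat_as_movable` would simply return the mapping without that key, so nothing is attempted and no error is logged. The `root` parameter exists only so the check can be exercised against a scratch directory.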

nealepetrillo commented 2 years ago

Howdy! I do not have updates for this profile yet. I'm working on getting a new maintainer to take a look in the next day or two. I'll update here when that happens. In the meantime, just removing that line from the profile should "solve" the error.
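
As a sketch of the "remove that line" workaround, the snippet below strips one option from the text of a profile file. The function name `drop_option` is hypothetical, and it assumes the simple `key=value` layout of a tuned.conf; it does not touch files on its own, so writing a customized copy of the profile is left to the caller.

```python
def drop_option(profile_text, option):
    """Return profile_text without any 'option=value' line for the given option."""
    kept = []
    for line in profile_text.splitlines():
        # Only lines containing '=' are option lines; headers and comments pass through.
        key = line.split("=", 1)[0].strip()
        if "=" in line and key == option:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"
```

For example, `drop_option(text, "vm.hugepages_treat_as_movable")` leaves section headers, comments, and every other option unchanged.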