This patch adds further optimization/tuning choices for kernel builds by exposing more micro-architecture options under:
Processor type and features --->
Processor family --->
The kernel uses its own set of CFLAGS, KCFLAGS. For example, see:
As pointed out by codemac in this topic, one can simply export the desired values for KCFLAGS and KCPPFLAGS before calling make to achieve the same result, see here.
export KCFLAGS=' -march=znver3 -mtune=znver3'
export KCPPFLAGS=' -march=znver3 -mtune=znver3'
make all
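
The exports above repeat the same `-march`/`-mtune` pair twice. A minimal sketch of generating that pair from a single micro-architecture name, assuming a POSIX shell; `kflags_for` is a hypothetical helper name and is not part of the patch:

```shell
#!/bin/sh
# kflags_for is a hypothetical helper, not part of the patch.
# It prints the -march/-mtune pair for one micro-architecture name.
kflags_for() {
    printf '%s' "-march=$1 -mtune=$1"
}

# Example: target Zen 3, matching the exports shown above.
KCFLAGS="$(kflags_for znver3)"
KCPPFLAGS="$(kflags_for znver3)"
export KCFLAGS KCPPFLAGS

echo "KCFLAGS=$KCFLAGS"
# Then build as usual from the kernel source tree:
# make all
```

Any value from the `-march=` column of the table can be passed in place of `znver3`, provided the installed compiler meets the listed minimum version.
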
CPU Family | -march= | Min GCC Ver | Min Clang Ver |
---|---|---|---|
Native optimizations autodetected by GCC | native | 4.2 | 3.8 |
Generic 64-bit level v2 | x86-64-v2 | 11.1 | 12.0 |
Generic 64-bit level v3 | x86-64-v3 | 11.1 | 12.0 |
Generic 64-bit level v4 | x86-64-v4 | 11.1 | 12.0 |
AMD Improved K8-family | k8-sse3 | 9.3 | 9.0 |
AMD K10-family | amdfam10 | 9.3 | 9.0 |
AMD Family 10h (Barcelona) | barcelona | 9.3 | 9.0 |
AMD Family 14h (Bobcat) | btver1 | 9.3 | 9.0 |
AMD Family 16h (Jaguar) | btver2 | 9.3 | 9.0 |
AMD Family 15h (Bulldozer) | bdver1 | 9.3 | 9.0 |
AMD Family 15h (Piledriver) | bdver2 | 9.3 | 9.0 |
AMD Family 15h (Steamroller) | bdver3 | 9.3 | 9.0 |
AMD Family 15h (Excavator) | bdver4 | 9.3 | 9.0 |
AMD Family 17h (Zen) | znver1 | 9.3 | 9.0 |
AMD Family 17h (Zen 2) | znver2 | 9.3 | 9.0 |
AMD Family 19h (Zen 3) | znver3 | 10.3 | 12.0 |
AMD Family 19h (Zen 4) | znver4 | 13.0 | 17.0 |
AMD Family 1Ah (Zen 5) | znver5 | 14.1 | ??? |
Intel Bonnell family Atom | bonnell | 9.3 | 9.0 |
Intel Silvermont family Atom | silvermont | 9.3 | 9.0 |
Intel Goldmont family Atom (Apollo Lake and Denverton) | goldmont | 9.3 | 9.0 |
Intel Goldmont Plus family Atom (Gemini Lake) | goldmont-plus | 9.3 | 9.0 |
Intel 1st Gen Core i3/i5/i7-family (Nehalem) | nehalem | 9.3 | 9.0 |
Intel 1.5 Gen Core i3/i5/i7-family (Westmere) | westmere | 9.3 | 9.0 |
Intel 2nd Gen Core i3/i5/i7-family (Sandybridge) | sandybridge | 9.3 | 9.0 |
Intel 3rd Gen Core i3/i5/i7-family (Ivybridge) | ivybridge | 9.3 | 9.0 |
Intel 4th Gen Core i3/i5/i7-family (Haswell) | haswell | 9.3 | 9.0 |
Intel 5th Gen Core i3/i5/i7-family (Broadwell) | broadwell | 9.3 | 9.0 |
Intel 6th Gen Core i3/i5/i7-family (Skylake) | skylake | 9.3 | 9.0 |
Intel 6th Gen Core i7/i9-family (Skylake X) | skylake-avx512 | 9.3 | 9.0 |
Intel 8th Gen Core i3/i5/i7-family (Cannon Lake) | cannonlake | 9.3 | 9.0 |
Intel 10th Gen Core i7/i9-family (Ice Lake) | icelake-client | 9.3 | 9.0 |
Intel Xeon (Cascade Lake) | cascadelake | 10.2 | 10.0 |
Intel Xeon (Cooper Lake) | cooperlake | 10.2 | 10.0 |
Intel 3rd Gen 10nm++ i3/i5/i7/i9-family (Tiger Lake) | tigerlake | 10.2 | 10.0 |
Intel 4th Gen 10nm++ Xeon (Sapphire Rapids) | sapphirerapids | 11.1 | 12.0 |
Intel 11th Gen i3/i5/i7/i9-family (Rocket Lake) | rocketlake | 11.1 | 12.0 |
Intel 12th Gen i3/i5/i7/i9-family (Alder Lake) | alderlake | 11.1 | 12.0 |
Intel 13th Gen i3/i5/i7/i9-family (Raptor Lake) | raptorlake | 13.0 | 15.0.5 |
Intel 5th Gen 10nm++ Xeon (Emerald Rapids) | emeraldrapids | 13.0 | ??? |
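
The minimum-version columns can be checked against the installed compiler before picking an option. The following is a sketch only, assuming a POSIX shell with awk; `version_ge` is a hypothetical helper, and the installed version is hard-coded here for illustration (in practice it could come from `gcc -dumpfullversion`, available in GCC 7 and later):

```shell
#!/bin/sh
# version_ge is a hypothetical helper, not part of the patch.
# Returns 0 (true) when version $1 >= version $2, comparing
# dot-separated numeric fields left to right; missing fields count as 0.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Assumed value for illustration; replace with: "$(gcc -dumpfullversion)"
installed="12.2.0"
required="10.3"   # minimum GCC for znver3, taken from the table above

if version_ge "$installed" "$required"; then
    echo "znver3: supported by GCC $installed"
else
    echo "znver3: needs GCC >= $required"
fi
```
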
Three different machines were tested, each running a generic x86-64 kernel and an otherwise identical kernel built with the optimized gcc options, using a make-based benchmark as the endpoint.
There are small but real speed increases from running with this patch, as judged by the make endpoint. The increases are on par with the speed increase that the upstream-sanctioned core2 option gives users, so not including additional options seems somewhat arbitrary to me.
Below are the differences in median values:
CPU | Difference in median value |
---|---|
core2 | +87.5 ms |
sandybridge | +79.7 ms |
ivybridge | +257.2 ms |
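
For anyone repeating the comparison, a difference in medians can be computed from two lists of per-run timings. This is a sketch under assumptions, not the procedure used for the numbers above: `median` is a hypothetical helper, and the timings below are made-up placeholder values in milliseconds.

```shell
#!/bin/sh
# median is a hypothetical helper: reads one number per line on stdin
# and prints the median (mean of the two middle values for even counts).
median() {
    sort -n | awk '{ v[NR] = $1 } END {
        if (NR == 0) exit 1
        if (NR % 2) print v[(NR + 1) / 2]
        else print (v[NR / 2] + v[NR / 2 + 1]) / 2
    }'
}

# Placeholder timings (ms), NOT measured data.
generic_median="$(printf '%s\n' 1010 1000 1020 | median)"
optimized_median="$(printf '%s\n' 990 1005 980 | median)"

# Difference in median values between the two kernels.
awk -v g="$generic_median" -v o="$optimized_median" \
    'BEGIN { printf "difference in median: %.1f ms\n", g - o }'
```
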
Support for older versions of the Linux kernel and of GCC can be found in the outdated_versions directory.