New stress-ng version: most cpu-method test results are lower than with the old stress-ng version

dnisqa commented 9 months ago

Yocto Stress-ng Build Configuration:
BB_VERSION = "2.6.0"
BUILD_SYS = "x86_64-linux"
NATIVELSBSTRING = "universal"
TARGET_SYS = "x86_64-poky-linux"
MACHINE = "genericx86-64"
DISTRO = "poky"
DISTRO_VERSION = "4.3.3"
TUNE_FEATURES = "m64 core2"
TARGET_FPU = ""
meta
meta-poky
meta-yocto-bsp = "nanbield:d3b27346c3a4a7ef7ec517e9d339d22bda74349d"
meta-intel = "nanbield:8d633bd01e20e31c0dae58cf3cd41eddb2f712c7"

Stress-ng version of Yocto-4.3.3:
Default version: 0.16.05
Manual compare version: 0.13.12

Test PC arch info:
CPU: Intel Atom C3508
Memory size: 32G
Memory speed: 1866

Stress-ng test cpu performance command:
for m in cdouble crc16 fft int32float int64float \
ipv4checksum matrixprod parity queens sqrt union; do echo -e $m; \
stress-ng --cpu 0 --cpu-method $m --metrics-brief -t 60; done

Test result: Most CPU test results with stress-ng 0.16.05 are lower than with stress-ng 0.13.12 on the same CPU and test PC.

cpu_stress_result_stress-ng0.13.12_kernel6.1.30_yocto4.3.3_atom3508.txt
cpu_stress_result_stress-ng0.16.05_kernel6.1.30_yocto4.3.3_atom3508.txt

ColinIanKing commented 9 months ago

This is expected. The manual states at the beginning:

"... stress-ng can also measure test throughput rates; this can be useful to observe performance changes across different operating system releases or types of hardware. However, it has never been intended to be used as a precise benchmark test suite, so do NOT use it in this manner."

and at the end it states:

"The bogo operations metrics may change with each release because of bug fixes to the code, new features, compiler optimisations, changes in support libraries or system call performance."

Generally bogo-ops ratings are useful when comparing the same version of stress-ng on different systems or different versions of libraries or compilers or kernels. Comparing results from different versions of stress-ng is not advisable.
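One way to apply that rule in a results script is to refuse any comparison whose two runs came from different stress-ng versions. The sketch below is only an illustration: the `Run` record, the `compare` helper and all bogo-ops figures are hypothetical, not stress-ng output.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """One set of results; every field here is a hypothetical example."""
    stress_ng_version: str
    system: str
    bogo_ops_per_sec: dict  # cpu method name -> bogo-ops/s


def compare(baseline: Run, other: Run) -> dict:
    """Return other/baseline ratios per method, same stress-ng version only."""
    if baseline.stress_ng_version != other.stress_ng_version:
        raise ValueError(
            "bogo-ops are not comparable across stress-ng versions: "
            f"{baseline.stress_ng_version} vs {other.stress_ng_version}"
        )
    common = baseline.bogo_ops_per_sec.keys() & other.bogo_ops_per_sec.keys()
    return {
        m: other.bogo_ops_per_sec[m] / baseline.bogo_ops_per_sec[m]
        for m in sorted(common)
    }


# Made-up numbers: same stress-ng version, two different kernels.
kernel_a = Run("0.16.05", "kernel A", {"crc16": 1000.0, "fft": 400.0})
kernel_b = Run("0.16.05", "kernel B", {"crc16": 1100.0, "fft": 380.0})
ratios = compare(kernel_a, kernel_b)
print(ratios)
```

A ratio above 1 means the second run scored higher for that method. Passing two runs with different `stress_ng_version` values raises `ValueError` instead of producing a misleading ratio.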

Stress-ng is primarily used for inducing stress on a system. The bogo-ops feature is not useful when using different releases of stress-ng.

In this case I believe the bogo-ops for some stressors were re-calibrated to get a better mix of integer/floating point/vector/bitwise instructions. Older releases were biased towards some forms of compute more than others, and this skewed the overall bogo-ops results in favour of one type of instruction mix over another.
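A minimal sketch of that skew, and of how per-method scaling removes it, is shown below. The raw rates are invented for illustration, and the scaling rule (scale each method so that the reference machine scores the same rate for every method) is an assumption about the general idea, not the actual factors or code used in stress-ng.

```python
# Hypothetical raw loop rates (iterations/s) on a reference machine.
REFERENCE_RAW = {"crc16": 50_000.0, "fft": 2_000.0, "queens": 500.0}
TARGET_RATE = 1_000.0  # assumed common bogo-ops/s on the reference machine


def scale_factors(reference_raw, target=TARGET_RATE):
    """Per-method factor that maps the reference raw rate onto the target."""
    return {m: target / rate for m, rate in reference_raw.items()}


def normalize(raw, factors):
    """Apply the per-method factors to raw rates measured on any machine."""
    return {m: raw[m] * factors[m] for m in raw}


def share(rates):
    """Fraction of the summed rate contributed by each method."""
    total = sum(rates.values())
    return {m: v / total for m, v in rates.items()}


factors = scale_factors(REFERENCE_RAW)

# Made-up raw rates measured on some other machine.
measured_raw = {"crc16": 25_000.0, "fft": 1_500.0, "queens": 400.0}
normalized = normalize(measured_raw, factors)

print("raw share:       ", share(measured_raw))
print("normalized share:", share(normalized))
print("normalized rates:", normalized)
```

With these invented figures the fastest method dominates the raw total, while after scaling the three methods contribute comparable amounts. The scaled figure for a fast method is also far smaller than its raw count, which is the kind of change that makes per-method numbers from releases before and after such a re-calibration incomparable.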

Also, the code has had bug fixes, some workarounds for overly optimized code, processor architecture optimizations, added verification checks and feature changes.

For example, the following commit re-calibrated the bogo-ops:

commit 72da15f77dd89e6b34c97e39a5d8e7181727a5c2
Author: Colin Ian King <colin.i.king@gmail.com>
Date: Wed May 11 23:25:52 2022 +0000

stress-cpu: normalized method cpu bogo ops for more consistent metrics

Scale bogo-ops rates for all cpu methods based on a reference Intel
i5-8350U processor. This provides less skewing and swamping of bogo-ops
by faster running methods.  This can be disabled and fall back to
the non-normalized method by using a new --cpu-old-metrics option.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
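Going by the commit message above, re-running the original test loop with `--cpu-old-metrics` should report the non-normalized figures. The sketch below only builds and (if `stress-ng` is on the PATH) runs those command lines. `build_cmd` and `run_all` are hypothetical helper names, and whether the old-metrics numbers line up with 0.13.12 results is untested here, since other changes also went in between the releases.

```python
import shutil
import subprocess

# The cpu methods from the test loop in the original report.
METHODS = ["cdouble", "crc16", "fft", "int32float", "int64float",
           "ipv4checksum", "matrixprod", "parity", "queens", "sqrt", "union"]


def build_cmd(method, old_metrics=False, seconds=60):
    """Build the stress-ng command line for one cpu method."""
    cmd = ["stress-ng", "--cpu", "0", "--cpu-method", method,
           "--metrics-brief", "-t", str(seconds)]
    if old_metrics:
        # Option named in the commit message; falls back to the
        # non-normalized bogo-ops accounting.
        cmd.append("--cpu-old-metrics")
    return cmd


def run_all(old_metrics=False, seconds=60):
    """Run every method in turn, printing the method name first."""
    if shutil.which("stress-ng") is None:
        raise RuntimeError("stress-ng not found on PATH")
    for method in METHODS:
        print(method)
        subprocess.run(build_cmd(method, old_metrics, seconds), check=True)


print(" ".join(build_cmd("crc16", old_metrics=True)))
# run_all(old_metrics=True)  # uncomment to run; about 11 minutes at 60 s each
```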

And the following were optimization changes:

commit 5e42939ee7cbc6c58f0bfaf616358d2648e55c7d
Author: Colin Ian King <colin.i.king@gmail.com>
Date: Fri Nov 10 18:02:41 2023 +0000

stress-cpu: replace 64 bit mwc and 7 shifts with 2 x 32 bit mwc and 6 shifts

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>

commit 1843248bf2701538605a387e50e8ef9ad13baeb2
Author: Colin Ian King <colin.i.king@gmail.com>
Date: Tue Feb 21 16:11:36 2023 +0000

stress-cpu: add one more loop unroll

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>

commit cb7fdbddbbb1435da730ea5ccf02b04c7859d55c
Author: Colin Ian King <colin.i.king@gmail.com>
Date: Tue Feb 21 11:36:13 2023 +0000

stress-cpu: add some loop unrolling for some performance improvements

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>

commit aabe43152bbb46a3561d933a65e52cf123b7baae
Author: Colin Ian King <colin.i.king@gmail.com>
Date: Sat Jan 7 22:22:41 2023 +0000

stress-cpu: use a clang builtin for reversing bits

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>

commit bd846c4756712b68f4a45040c049d048226ba665
Author: Colin Ian King <colin.i.king@gmail.com>
Date: Thu Nov 24 12:56:25 2022 +0000

stress-cpu/vm: use builtin popcount where available

Where supported use builtin popcount for bit counting, compiles down
to popcnt on x86.

And the following added more CPU math features:

commit bad6224e8ec6758f52087355e853e35f1ec8ef70
Author: Colin Ian King <colin.i.king@gmail.com>
Date: Tue Oct 24 16:04:03 2023 +0100

stress-*: use _Float128 or __float128 depending on what is available

And the following worked around compiler over-optimizations:

commit 8aa30d56d4d561a2a114894206c12f3156f5b307
Author: Colin Ian King <colin.i.king@gmail.com>
Date: Wed Mar 8 13:19:01 2023 +0000

stress-cpu: collatz: force compiler to generate collatz computation

commit 90a0cc8ca9ddc0a9f275ec0355fd0b34386a0480
Author: Colin Ian King <colin.i.king@gmail.com>
Date: Wed Mar 8 12:45:10 2023 +0000

stress-cpu: add some randomness to force code generation of computation

The optimizer does a good job on some compilers to compute the final
omega value rather than generating code to compute omega at run time.
Force code generation by adding a very small amount of randomness to
the initial omega start value.

And the following improved sanity checking:

commit 02bae17af7826b55f2c371290de845a82e0f51f0
Author: Anton Eliasson <antone@axis.com>
Date: Mon Aug 10 15:00:47 2020 +0200

stress-cpu: Add verification to rand48

And the following were bug fixes:

commit 58e66fc9946a6e1c31d6916feee7dd13565c9552
Author: Colin Ian King <colin.i.king@gmail.com>
Date: Tue Jan 10 14:32:13 2023 +0000

core-mwc: add stress_mwc*modn() functions for modulo'd range

There are numerous occasions in stress-ng where mwc results are being
modulo'd with a value to constrain the range. This however leads to
modulo bias and hence a non-uniform random number distribution. Fix
this by introducing modn helpers that perform the modulo without the
bias.  We scale the desired modulo to large size that fits into the
mwc type size and pick random values until they fit into the range
of 0..scaled_max - 1. Then we can take the modulo without any bias.

See https://research.kudelskisecurity.com/2020/07/28/the-definitive-guide-to-modulo-bias-and-how-to-avoid-it/

Kudos to Guilherme Janczak for spotting and reporting this issue.

Closes https://github.com/ColinIanKing/stress-ng/issues/252
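The approach described in that commit message (scale the modulo up to the largest multiple that fits, reject raw values outside 0..scaled_max - 1, then take the modulo) can be sketched as follows. This is an illustration only, not the stress-ng code: it assumes an 8-bit generator so the bias can be counted exhaustively, uses Python's `random.getrandbits` as a stand-in for mwc, and `biased_counts`, `scaled_max` and `modn` are hypothetical names.

```python
import random
from collections import Counter

BITS = 8            # pretend the generator returns 8-bit values
LIMIT = 1 << BITS   # 256 possible raw values


def biased_counts(n):
    """How often each result occurs if every raw value is reduced with % n."""
    return Counter(r % n for r in range(LIMIT))


def scaled_max(n):
    """Largest multiple of n that fits in the raw value range."""
    return LIMIT - (LIMIT % n)


def modn(n, rand=lambda: random.getrandbits(BITS)):
    """Return a value in 0..n-1 without modulo bias (rejection sampling)."""
    if not 0 < n <= LIMIT:
        raise ValueError("n must be in 1..LIMIT")
    smax = scaled_max(n)
    while True:
        r = rand()
        if r < smax:       # accept only raw values in 0..scaled_max - 1
            return r % n   # each result now has equally many raw sources


print(sorted(biased_counts(6).items()))
print(scaled_max(6))
print(modn(6))
```

For n = 6, 256 = 42 * 6 + 4, so a plain `% 6` maps 43 raw values onto each of the results 0..3 but only 42 onto 4 and 5. Rejecting the raw values 252..255 leaves exactly 42 sources for every result.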

...

dnisqa commented 9 months ago

I understand that different versions of stress-ng cannot be compared, but this is the problem we are currently facing: with the upgrade of the system, kernel, GCC and other packages, stress-ng 0.12.x can no longer be compiled as of Yocto 4.3.3 (the build fails with errors), so we are forced to upgrade to 0.13.x or 0.16.x.

We had already noticed a reduction in bogo-ops values when going from stress-ng 0.12.x to 0.13.x, but we recently found that from 0.13.x to 0.16.x the bogo-ops results drop very significantly on the same hardware.

0.12.x -> 0.13.x: the bogo-ops results with 0.13.x dropped by about 20%.
0.13.x -> 0.16.x (on the same Yocto 4.3.3, kernel 6.1.30, gcc13): for many of the tested methods, the 0.16.x bogo-ops values are only about 10% of the 0.13.x values.

Does this mean stress-ng is not suitable for CPU performance testing? Thanks

ColinIanKing commented 7 months ago

stress-ng is suitable for stressing a system. Bogo-ops metrics are not comparable between versions, but this is documented, and I think it is reasonable given that stress-ng is not intended to be used primarily for benchmarking.

ColinIanKing / stress-ng, issue #363