Add support to minimize noise from system

smarr commented 4 years ago

ReBench should steer the user more aggressively to a setup that is more likely to avoid measuring noise. Much of the currently observed noise seems to be a mistake on my part.

After Infinity was shut down, I did not migrate and actually lost the scripts setting no_turbo and the performance settings on the current benchmark machine. An attempt to recreate the script is below. Another issue is that --without-nice crept into the benchmark settings, probably because I didn't have sudo on the machine initially.

So, to make things less dependent on my remembering things, a PR should:

- [x] be more aggressive about the use of nice, i.e., document the consequences of not using it (more noise, more invocations needed), try whether nice can be used even when the flag is given, and output a warning when it's available
- [x] check @charig's branch whether we can lift any changes here: https://github.com/smarr/ReBench/compare/master...reactorlabs:PowerManagement
  - [ ] set min/max frequency (/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq, /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq, /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies) (not used, because it does not seem to force the frequency on the system I am using)
- [x] look at https://github.com/intel/CommsPowerManagement and see whether we can use/borrow things
  - a tool to manipulate the various details in the sys file system
- [x] check @vext01's Krun for whether simple things are worth borrowing (cc @ltratt), e.g.:
  - [x] kernel sample rate is reduced to 1 second intervals
  - [x] cset is used
  - [ ] set a core's register to disable turbo: https://github.com/softdevteam/krun/blob/master/krun/platform.py#L1253 (not yet done, because I hope it's redundant with the no_turbo config)
- [x] report problematic settings at the beginning and end of a run so they are visible to the user, and as a reminder to myself that something is going wrong
- [x] remove the flag for not using nice and combine it into a setting which needs to be set in the configuration to disable attempts to reduce noise
- [x] for artifacts, we need a "reviewer-friendly" mode, which gives reviewers information about the expected quality of the data, but does not make them think things are broken
- [x] the changes made should be stored as environmental details, and reported to ReBenchDB

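#!/bin/bash
# Needs root: writes to /sys and changes the cpufreq governor.
# Print each command as it is executed.
set -x

echo "Disable Turboboost"
echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

echo "Current cpu governor settings"
# Covers cpu0..cpu23; adjust the range to the machine.
for i in {0..23}
do
  cat "/sys/devices/system/cpu/cpu${i}/cpufreq/scaling_governor"
done

# Switch every core to the performance governor.
for i in {0..23}
do
  #cpufreq-set -c $i -g powersave
  cpufreq-set -c "$i" -g performance
  # powersave is the original setting
done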
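
The "report problematic settings" item above could be approached with a read-only check along the following lines. This is only a sketch, not what ReBench does: `report_noise_settings` and its optional directory argument are hypothetical names introduced here, and it looks only at the two settings the script above touches (no_turbo and the scaling governor). It globs over the cpu directories instead of hardcoding the core count.

```shell
#!/bin/bash
# Hypothetical helper: warn about settings that are likely to add noise.
# The optional argument overrides the sysfs CPU directory (useful for testing).
report_noise_settings() {
  local cpu_root="${1:-/sys/devices/system/cpu}"
  local problems=0
  local no_turbo="$cpu_root/intel_pstate/no_turbo"

  if [ -r "$no_turbo" ]; then
    # no_turbo = 1 means turbo boost is disabled, which is what we want.
    if [ "$(cat "$no_turbo")" != "1" ]; then
      echo "WARNING: turbo boost is enabled (no_turbo is not 1)"
      problems=$((problems + 1))
    fi
  else
    echo "NOTE: $no_turbo is not readable, cannot check turbo boost"
  fi

  local gov_file gov
  for gov_file in "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor; do
    # Skips the unexpanded pattern when no core exposes cpufreq.
    [ -r "$gov_file" ] || continue
    gov="$(cat "$gov_file")"
    if [ "$gov" != "performance" ]; then
      echo "WARNING: $gov_file is '$gov', expected 'performance'"
      problems=$((problems + 1))
    fi
  done

  # Succeeds only if no problem was found.
  [ "$problems" -eq 0 ]
}
```

Calling `report_noise_settings` before and after a run, and printing its output both times, would make a setting that was lost (for example after a reboot) visible without needing root.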

TODOs

Further Notes

useful tool to monitor CPU: https://github.com/cyring/CoreFreq

ltratt commented 4 years ago

I feel compelled to say that Krun is mostly @vext01's work -- he deserves most of the credit!

smarr commented 4 years ago

Thanks, fixed :)

vext01 commented 4 years ago

Hi Stefan,

This all sounds good.

About cset -- in Krun this is used only when pinning is enabled. By default it is not.

The original plan was to pin the VM to the set of tickless cores, but for reasons that I can no longer remember, I couldn't get it to do the right thing.

One thing I do remember is that isolcpus is broken: https://bugzilla.kernel.org/show_bug.cgi?id=116701

So if you do want to try and get pinning working, I'd ignore that, and use a "cset shield" instead.

If you get it working, I'd be interested to hear your findings :)

Cheers!
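
For reference, a "cset shield" setup as suggested above might look roughly like the following. This is a dry-run sketch: `shield_plan` is a hypothetical helper that only prints the commands, and the CPU range and benchmark command in the usage line are made-up examples. The cset commands themselves need root when actually run.

```shell
#!/bin/bash
# Hypothetical helper: print (not run) the cset commands for shielding
# a set of CPUs and running one command inside the shield.
# Usage: shield_plan CPUSPEC COMMAND [ARGS...]
shield_plan() {
  local cpus="$1"
  shift
  # Reserve the CPUs; --kthread=on also tries to move movable kernel threads away.
  echo "cset shield --cpu=$cpus --kthread=on"
  # Run the command inside the shield; arguments after -- go to the command.
  echo "cset shield --exec $1 -- ${*:2}"
  # Remove the shield again afterwards.
  echo "cset shield --reset"
}
```

For example, `shield_plan 1-3 ./bench --iterations 10` prints the three commands for shielding cores 1 to 3. Tasks that cset cannot move stay where they are, so the shield is not a guarantee of an empty core.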

ltratt commented 4 years ago

The problem we observed with pinning is that some VMs (HotSpot I think?) can start N threads where N is the number of physical cores. If you restrict such VMs to N-1 cores you get very weird performance.

vext01 commented 4 years ago

That was it! Thanks for jolting my memory.

smarr commented 4 years ago

Yeah, right. Pinning for VMs doesn't generally work. Though, I thought that clearing out an area of cores and moving tasks to core 0 might be beneficial. Not sure it's worth the hassle though, especially since cset shield reports some 50 unmovable tasks on the benchmark machine I am using. It worked wonders on the Tile64Pro, but that was very different in many ways.

I have been running some small experiments over the last few days. The biggest impact came from adding the missing nice -n-20 in combination with pinning the benchmark thread to a core, essentially letting the thread call pthread_setaffinity_np on itself right before it runs the benchmark. This is the biggest win I could measure so far.

It wasn't an experiment where all things were tested independently, but flipping no_turbo and the performance setting didn't have a measurable impact on the noise, while nice and thread affinity did. Though, I did not test thread affinity without no_turbo and performance, which likely doesn't work as nicely.
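
The nice part of this can be sketched in shell as a wrapper that tries nice -n -20 and falls back with a warning. `run_with_nice` is a hypothetical helper, not ReBench's implementation, and it assumes a `nice` that prints the current niceness when called without a command (GNU coreutils does). The thread-level pinning via pthread_setaffinity_np has to happen inside the benchmarked program and is not covered here.

```shell
#!/bin/bash
# Hypothetical helper: run a command at niceness -20 if that is permitted,
# otherwise warn on stderr and run it unchanged.
run_with_nice() {
  # Without a command, nice prints the current niceness, so this probes
  # whether the -20 adjustment actually takes effect. Checking the exit
  # status alone is not enough: GNU nice only warns when it lacks
  # permission and then runs the command anyway.
  if [ "$(nice -n -20 nice 2>/dev/null)" = "-20" ]; then
    nice -n -20 "$@"
  else
    echo "WARNING: cannot set niceness to -20 (missing privileges?); expect more noise" >&2
    "$@"
  fi
}
```

Something like `run_with_nice ./benchmark` then behaves the same with and without sudo, but makes the missing priority visible instead of silently measuring with more noise.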

And, this is all really just me getting old. It's sad to rediscover the things I already knew literally 10 years ago: https://github.com/smarr/RoarVM/blame/2f93789b6ecd16533eb6f78d448373af9fae68fe/vm/src/platform/posix_os_interface.cpp#L57

🤦

smarr / ReBench

Add support to minimize noise from system #140