containers / virtcontainers

A Go package for building hardware virtualized container runtimes
Apache License 2.0

qemu: change CPU topology to support up to 240 CPUs #591

Closed: devimc closed this 6 years ago

devimc commented 6 years ago

The maximum number of CPUs per VM recommended by KVM is 240. To support this many CPUs, the CPU topology must change to 1 CPU per socket; otherwise the VM will fail and print the following error message:

qemu: max_cpus is too large. APIC ID of last CPU is N
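For illustration, here is a minimal Go sketch of the layout this change moves to: one core and one thread per socket, with sockets and maxcpus at the 240 limit. The smpArgument helper and the maxVCPUs constant are hypothetical names, not part of the virtcontainers API; the sketch only shows the shape of the -smp argument the new topology produces.

```go
package main

import "fmt"

// maxVCPUs is the KVM-recommended ceiling mentioned in this PR.
const maxVCPUs = 240

// smpArgument builds a QEMU -smp value for the "1 CPU per socket" layout:
// cores=1, threads=1, and sockets equal to maxcpus. Keeping cores and
// threads at 1 keeps the APIC IDs contiguous, so 240 vCPUs fit without
// triggering the "max_cpus is too large" error quoted above.
func smpArgument(bootCPUs uint32) (string, error) {
	if bootCPUs == 0 || bootCPUs > maxVCPUs {
		return "", fmt.Errorf("vCPU count %d out of range (1-%d)", bootCPUs, maxVCPUs)
	}
	return fmt.Sprintf("%d,cores=1,threads=1,sockets=%d,maxcpus=%d",
		bootCPUs, maxVCPUs, maxVCPUs), nil
}

func main() {
	arg, _ := smpArgument(4)
	fmt.Println("-smp", arg) // -smp 4,cores=1,threads=1,sockets=240,maxcpus=240
}
```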

fixes #597

Signed-off-by: Julio Montes julio.montes@intel.com

devimc commented 6 years ago

/cc @grahamwhaley @mcastelino @sameo

sameo commented 6 years ago

@grahamwhaley Can we confirm that this does not create other performance regressions?

grahamwhaley commented 6 years ago

Yep. I'm working on an eval of QEMU thread/socket/core variants for the metrics, which is also related to #512.

mcastelino commented 6 years ago

@grahamwhaley we need to run the other I/O tests, particularly storage I/O. I do not expect any degradation in any other metric.

grahamwhaley commented 6 years ago

@mcastelino Sure. The key issue with the eval is that those are very 'noisy' tests - the results have quite a large variance, so it's difficult to see whether the change has any effect. I'm working on making the results and/or the evaluation of the tests more stable, but it's going to take time.

I believe you said before you had a specific test where you saw visible degradation when we changed the topology - if you can remember any details then I'll happily take them :-)

grahamwhaley commented 6 years ago

I've run our metrics on this, and done some hand tweaking to reduce the variability of a couple of the metrics so I can get a clearer view (work I will PR into the repo soon).

I'm not seeing any major network gains in my testing, or any other major gains. What I see is:

I think, therefore, that at this point I'm going to NACK this unless/until we can pinpoint the use cases that gain.

I'm going to see if I can run some parallel network tests. I'll note that our current multi-stream tests look like they run under iperf3, which is a single-threaded program, so maybe that is not the best test to use (note that iperf2 is multi-threaded).

Also, @mcastelino, note that my testing was on a single local machine, so the network transport tests here are purely a software test (pure Linux software network stacks between the host and the container). In that scenario, AFAICT, we are basically CPU bound (as this is just pure software), so maybe the real gains show up with a real hardware inter-machine connection. wdyt?

devimc commented 6 years ago

commit message updated

grahamwhaley commented 6 years ago

@devimc Can I also check: we need to change the topology to enable CPU hotplug as well, since we cannot hotplug CPUs with the topology as it stands today (without this PR) - is that correct?

Given we need this PR to move on with hotplug, as long as we accept there may be a minor drop in metrics measurements, then:

lgtm
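To illustrate the hotplug point above: once every vCPU occupies its own socket, hot-adding a CPU over QMP is just a device_add for the next socket-id, with core-id and thread-id fixed at 0. The sketch below hand-builds the QMP message for clarity; the "host-x86_64-cpu" driver name is an assumption (the real driver is reported by query-hotpluggable-cpus on the running VM), and this is not how virtcontainers itself talks to QMP.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// qmpCommand mirrors the generic QMP {"execute": ..., "arguments": ...} envelope.
type qmpCommand struct {
	Execute   string                 `json:"execute"`
	Arguments map[string]interface{} `json:"arguments,omitempty"`
}

// hotplugCPU builds a device_add command for the vCPU slot at socketID.
// With the 1-core/1-thread-per-socket topology, core-id and thread-id are
// always 0, so socket-id alone identifies the slot.
func hotplugCPU(socketID int) ([]byte, error) {
	return json.Marshal(qmpCommand{
		Execute: "device_add",
		Arguments: map[string]interface{}{
			"driver":    "host-x86_64-cpu", // assumed; query-hotpluggable-cpus reports the real type
			"id":        fmt.Sprintf("cpu-%d", socketID),
			"socket-id": socketID,
			"core-id":   0,
			"thread-id": 0,
		},
	})
}

func main() {
	msg, err := hotplugCPU(5)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(msg))
}
```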

sboeuf commented 6 years ago

LGTM

Approved with PullApprove

sboeuf commented 6 years ago

@sameo please merge it if you're fine with this!

sameo commented 6 years ago

It seems that there are no I/O regressions. LGTM

Approved with PullApprove

grahamwhaley commented 6 years ago

Just to note, as I happened to read this in the last hour, at: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html-single/virtualization_tuning_and_optimization_guide/index in section 3.3.3 (cpu topology) I found:

Although your environment may dictate other requirements, selecting any desired number of sockets, but with only a single core and a single thread usually gives the best performance results.

and... no more details than that. Maybe one day we will get to the bottom of what effects topology might have on performance.