rapido-linux / rapido

Quickly test Linux kernel changes
GNU Lesser General Public License v2.1

RFC: overcommit memory by default #234

Open ddiss opened 9 months ago

ddiss commented 9 months ago

I was under the impression that qemu overcommitted memory by default, until I ran into good ol':

qemu-system-x86_64: cannot set up guest memory 'pc.ram': Cannot allocate memory

Some strace-ing later, against a VM with 50G of memory assigned, it turned out that qemu attempts to mmap() the entire guest memory area via:

mmap(0x7ef693e00000, 53687091200, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)

Guest memory overcommitting should be as simple as ensuring that MAP_NORESERVE is present in the above flags. Working backwards from qemu_ram_is_noreserve(), the qemu command line parameters to ensure MAP_NORESERVE appear to be:

                -object memory-backend-ram,id=pc.ram,size=${mem},reserve=off \
                -machine memory-backend=pc.ram \

...instead of the simple -m $mem parameter that we currently use.
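To make the difference between the two forms concrete, here is a minimal sketch of a helper that prints either set of arguments for a given size. The function name qemu_mem_args and its "overcommit" switch are hypothetical and not part of rapido; the qemu parameters themselves are the ones quoted above.

```sh
# Hypothetical helper (not part of rapido): print qemu memory arguments.
#   $1: guest memory size in a form qemu accepts, e.g. "50G"
#   $2: "overcommit" to request reserve=off; anything else gives plain -m
qemu_mem_args() {
	if [ "$2" = "overcommit" ]; then
		# memory backend without host-side reservation (MAP_NORESERVE)
		printf '%s\n' \
			"-object memory-backend-ram,id=pc.ram,size=$1,reserve=off" \
			"-machine memory-backend=pc.ram"
	else
		# the simple form currently in use
		printf '%s\n' "-m $1"
	fi
}

qemu_mem_args 50G overcommit
qemu_mem_args 50G
```

Note that the size has to be known at the point where the -object argument is built, which is exactly what makes a QEMU_EXTRA_ARGS-only approach awkward.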

I'm not sure of the best way to let rapido use memory overcommit. There are a few options:

  1. replace the _rt_qemu_resources_get() -m $mem snippet with the above -object memory-backend-ram...reserve=off parameters.
  2. leave things as-is and let users manually tweak things via QEMU_EXTRA_ARGS.
  3. let cut scripts decide whether VM memory should be overcommitted or not via a new helper / cpio flag-file _rt_mem_overcommit_resources_set, with (1) logic enabled if the overcommit flag is detected.
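Option (3) could look roughly like the sketch below: the boot-time code checks for a flag file and only then switches to the memory-backend form. Everything rapido-specific here is an assumption made for illustration: the function name mem_args_from_flags, the resource directory argument, and the flag-file name mem_overcommit are all hypothetical.

```sh
# Sketch of option (3), using hypothetical names throughout.
#   $1: directory holding per-image resource/flag files (assumed layout)
#   $2: guest memory size requested by the cut script, e.g. "2G"
# Runs in a subshell, so its variables do not leak into the caller.
mem_args_from_flags() (
	rsc_dir=$1
	mem=$2
	if [ -e "$rsc_dir/mem_overcommit" ]; then
		# flag file present: option (1) logic, no host-side reservation
		echo "-object memory-backend-ram,id=pc.ram,size=${mem},reserve=off" \
			"-machine memory-backend=pc.ram"
	else
		# flag file absent: unchanged behaviour
		echo "-m ${mem}"
	fi
)

# Example use:
#   touch /path/to/rsc/mem_overcommit
#   mem_args_from_flags /path/to/rsc 2G
```

The point of the flag file is that the decision stays with the cut script, while the size is still substituted in one place at boot time.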

If overcommit were possible via a simple -overcommit-memory parameter (alongside -m $mem), then I'd definitely opt for (2), but the fact that size=${mem} needs to be present with the -object memory-backend-ram parameter means that VM memory resource requirements specified in the cut script via _rt_mem_resources_set can't easily be considered at boot time. I really, really, really don't want rapido to go down the path of acting as a map between local configs and qemu cli parameters similar to libvirt. I'm concerned that option (1) leads us further in that direction.

pevik commented 9 months ago

Could you please prepare a PR with option 1 or 3? We could test it for a while before it gets merged.

Werkov commented 9 months ago

mmap(0x7ef693e00000, 53687091200, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)

But MAP_POPULATE is not set. Hm, ~MAP_NORESERVE (i.e. a mapping without MAP_NORESERVE) actually depends strongly on available swap space. That leads me to a 4th option:

  4. leave things as-is and let the host admin tweak overcommit params (e.g. /proc/sys/vm/overcommit_ratio if you don't want to add swap for this qemu).
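Before tweaking anything on the host, it may help to look at the current settings. The sketch below only reads values; the function name show_overcommit and its optional proc-root argument are made up for illustration. As far as the kernel's overcommit accounting documentation describes it, overcommit_ratio feeds into CommitLimit, and that limit is only enforced when overcommit_memory is 2, so treat the output as a starting point rather than a diagnosis.

```sh
# Hypothetical helper: print the host's overcommit settings and the
# related /proc/meminfo counters.
#   $1: proc root, defaults to /proc (parameterised so the sketch can be
#       tried against a copy of the files)
show_overcommit() (
	proc=${1:-/proc}
	mode=$(cat "$proc/sys/vm/overcommit_memory")
	ratio=$(cat "$proc/sys/vm/overcommit_ratio")
	# mode: 0 = heuristic, 1 = always overcommit, 2 = strict accounting
	echo "overcommit_memory=$mode overcommit_ratio=$ratio"
	grep -E '^(MemTotal|SwapTotal|CommitLimit|Committed_AS):' "$proc/meminfo"
)

# Only run against the live system where the files are readable.
if [ -r /proc/sys/vm/overcommit_memory ]; then
	show_overcommit
fi
```

Comparing the size of the failing mmap() against MemTotal plus SwapTotal should show whether the 50G request is simply larger than what the host could ever back.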