microsoft / openvmm

Home of OpenVMM and OpenHCL.
http://openvmm.dev/
MIT License

cvm: Correctly set the amount of DMA memory for NVME and MANA #387

Open jaredwhitedev opened 3 days ago

jaredwhitedev commented 3 days ago

The code determining the amount of DMA memory for NVME on CVMs:

    // TODO: determine actual memory usage by NVME/MANA. hardcode as 10MB
    let device_dma = 10 * 1024 * 1024;

This value is insufficient for local storage in our lab testing (TDX VM, 64 VPs, 4 CCs with 16 QPs each). While running storage workloads, we observe errors like the following:

[  198.545471] nvme_driver::driver: ERROR  failed to create io queue, falling back cpu=0x35 fallback_cpu=0x33 error=failed to create io queue pair 4: failed to allocate pages for queue requests: failed to allocate shared mem: unable to allocate shared pool size 128 with tag vfio dma

We also see fewer interrupts than we expect.

Increasing this value until these errors no longer appeared led to significantly increased IOPS and reduced CPU usage.

jaredwhitedev commented 3 days ago

Looks like the original code was committed by @mingweishih

yupavlen-ms commented 3 days ago

@chris-oo FYI

This value is only used to allocate the shared pool. It looks like this is what is being used in their tests?

chris-oo commented 3 days ago

The TODO here is to figure out how much we actually need per device, and reserve appropriately.
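One way to picture the per-device reservation described above is a rough sizing helper. This is only a sketch under stated assumptions, not code from the OpenVMM tree: `DeviceDmaProfile` and `estimate_dma_bytes` are hypothetical names, the 4 KiB page size is assumed, and `pages_per_queue_pair = 128` is a guess that reads the "size 128" in the error above as a page count per queue pair. The lab configuration is treated as 4 devices with 16 queue pairs each.

```rust
/// Hypothetical description of one class of DMA-capable device.
/// This type is illustrative only and does not exist in OpenVMM.
#[derive(Clone, Copy)]
struct DeviceDmaProfile {
    /// Number of device instances of this class.
    device_count: u64,
    /// I/O queue pairs created per device.
    queue_pairs_per_device: u64,
    /// ASSUMPTION: shared-memory pages needed per queue pair.
    pages_per_queue_pair: u64,
}

/// ASSUMPTION: 4 KiB pages.
const PAGE_SIZE: u64 = 4096;

/// Sum the estimated DMA bytes across all device classes.
fn estimate_dma_bytes(profiles: &[DeviceDmaProfile]) -> u64 {
    profiles
        .iter()
        .map(|p| {
            p.device_count * p.queue_pairs_per_device * p.pages_per_queue_pair * PAGE_SIZE
        })
        .sum()
}

fn main() {
    // Values loosely modeled on the lab setup in this issue; all are assumptions.
    let nvme = DeviceDmaProfile {
        device_count: 4,
        queue_pairs_per_device: 16,
        pages_per_queue_pair: 128,
    };

    // The currently hardcoded reservation from the snippet in this issue.
    let hardcoded: u64 = 10 * 1024 * 1024;
    let estimated = estimate_dma_bytes(&[nvme]);

    println!("hardcoded: {hardcoded} bytes, estimated: {estimated} bytes");
    if estimated > hardcoded {
        println!("estimate exceeds the hardcoded reservation");
    }
}
```

Under these assumed numbers the estimate is 4 × 16 × 128 × 4096 = 33,554,432 bytes (32 MiB), which is more than the hardcoded 10 MiB. The real per-queue-pair requirement would have to come from the NVMe and MANA drivers themselves.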