virtio-win / kvm-guest-drivers-windows

Windows paravirtualized drivers for QEMU\KVM
https://www.linux-kvm.org/page/WindowsGuestDrivers
BSD 3-Clause "New" or "Revised" License

NetKVM: Allocate Rx buffers appropriately with RSC support in the driver #861

Open annie-li opened 1 year ago

annie-li commented 1 year ago

Is your feature request related to a problem? Please describe.
The current NetKVM driver allocates Rx buffers based on whether the driver supports RSC, not on whether RSC is actually enabled on the virtio network adapter or TSO is enabled on the host side. For example, for a NetKVM driver built for NDIS 6.3 and above, buffers totalling roughly 64K are allocated on the Rx side. However, if TSO is disabled on the host side (guest_tso4/6=off, i.e. without VIRTIO_NET_F_GUEST_TSO4/6) or RSC is disabled on the virtio network adapter in the Windows guest, the host only sends MTU-sized packets to the guest. In that case, allocating 64K buffers is a waste.
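For reference, the host-side case described above can be produced with something like the following QEMU device options (guest_tso4/guest_tso6 are the standard virtio-net properties; the rest of the command line is omitted here):

-device virtio-net-pci,netdev=net0,guest_tso4=off,guest_tso6=off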

Describe the solution you'd like
ParaNdis_InitializeContext initializes nMaxDataSizeHwRx to roughly 64K, as follows:

....
#if PARANDIS_SUPPORT_RSC
pContext->MaxPacketSize.nMaxDataSizeHwRx = MAX_HW_RX_PACKET_SIZE;
pContext->MaxPacketSize.nMaxFullSizeOsRx = MAX_OS_RX_PACKET_SIZE;
#else
...

In the CreateRxDescriptorOnInit function, Rx buffers are allocated based on the nMaxDataSizeHwRx value initialized above:

...
    ULONG ulNumPages = m_Context->MaxPacketSize.nMaxDataSizeHwRx / PAGE_SIZE + 2;
...
while (ulNumPages > 0)
{
    ...
    // allocate buffers
    ...
}
...
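To make the waste concrete, here is a rough, illustrative estimate; the 4 KiB page size and the 256-descriptor queue depth are assumptions, not values taken from the driver, and only the "/ PAGE_SIZE + 2" rounding mirrors the excerpt above:

#include <cstdio>

int main()
{
    const unsigned long kPageSize   = 4096;     // assumed x86/x64 page size
    const unsigned long kQueueDepth = 256;      // assumed Rx descriptors per queue
    const unsigned long kRscRxSize  = 65536;    // ~64K buffer when sized for RSC
    const unsigned long kMtuRxSize  = 1514 + 4; // standard Ethernet frame + VLAN tag

    // Mirrors the "/ PAGE_SIZE + 2" rounding from CreateRxDescriptorOnInit
    unsigned long pagesRsc = kRscRxSize / kPageSize + 2;    // 18 pages (~72 KiB)
    unsigned long pagesMtu = kMtuRxSize / kPageSize + 2;    // 2 pages  (~8 KiB)

    printf("RSC-sized Rx buffers: %lu KiB per queue\n",
           pagesRsc * kPageSize * kQueueDepth / 1024);      // ~18432 KiB (18 MiB)
    printf("MTU-sized Rx buffers: %lu KiB per queue\n",
           pagesMtu * kPageSize * kQueueDepth / 1024);      // ~2048 KiB (2 MiB)
    return 0;
}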

In the InitializeRSCState function, the TSO and RSC values are negotiated and the VIRTIO_NET_F_GUEST_TSO4 feature is acked accordingly. I therefore suggest setting pContext->MaxPacketSize.nMaxDataSizeHwRx in InitializeRSCState after the TSO/RSC negotiation has been done, i.e. setting nMaxDataSizeHwRx to the MTU size when VIRTIO_NET_F_GUEST_TSO4 is not acked. For example, if VIRTIO_NET_F_GUEST_TSO4 is not acked, add the following:

pContext->MaxPacketSize.nMaxDataSizeHwRx = pContext->MaxPacketSize.nMaxFullSizeOS + ETH_PRIORITY_HEADER_SIZE; 
pContext->MaxPacketSize.nMaxFullSizeOsRx = pContext->MaxPacketSize.nMaxFullSizeOS;
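A minimal sketch of where that assignment could sit inside InitializeRSCState, assuming a placeholder bGuestTsoAcked for the result of the existing VIRTIO_NET_F_GUEST_TSO4/6 negotiation (the name is illustrative, not taken from the driver):

// Sketch only: bGuestTsoAcked stands in for whatever InitializeRSCState
// already knows about the acked VIRTIO_NET_F_GUEST_TSO4/6 features.
if (!bGuestTsoAcked)
{
    // The host will only send MTU-sized packets, so size Rx buffers for a
    // single frame instead of ~64K.
    pContext->MaxPacketSize.nMaxDataSizeHwRx =
        pContext->MaxPacketSize.nMaxFullSizeOS + ETH_PRIORITY_HEADER_SIZE;
    pContext->MaxPacketSize.nMaxFullSizeOsRx =
        pContext->MaxPacketSize.nMaxFullSizeOS;
}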

Additional context
This change can help reduce total memory consumption, especially with multiple queues or CPUs.

YanVugenfirer commented 1 year ago

Hi @annie-li ,

Thank you very much for opening the discussion.

I want to add some points:

  1. Being able to handle RSC is a performance feature. Without RSC, RX performance will suffer, so having multi-queue on one hand but no RSC on the other is a combination that will not reach its full performance potential.
  2. Even with RSC and multi-queue, we can control memory consumption by limiting the queue size on the host side (see the sketch below). This may make sense when RSC is enabled, as we assume that RX packets will be aggregated.
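For illustration, "limiting queue size on the host side" could look something like the following QEMU device options; rx_queue_size is the standard virtio-net-pci property (256 is its minimum and default in recent QEMU), and the surrounding options are placeholders:

-device virtio-net-pci,netdev=net0,mq=on,rx_queue_size=256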

@ybendito Do you have something to add?

Best regards, Yan.