robmikh / Win32CaptureSample

A simple sample using the Windows.Graphics.Capture APIs in a Win32 application.
MIT License

DDA capture slow framerate on Windows 10 vs Windows 8.1 VMs #73

Open weary-adventurer opened 1 month ago

weary-adventurer commented 1 month ago

Hi, I've made a simple program that uses the DXGI Desktop Duplication API (DDA) to test capture performance. It's just a while(1) loop that calls AcquireNextFrame, then CopyResource, Map and Unmap to simulate reading the framebuffer. The source is available here: https://gist.github.com/weary-adventurer/241eef5a6205403422a3573ce50f8672

I'm running it on QEMU VMs without GPUs and I'm getting very different performance on Windows 8.1 and Windows 10 VMs.

On Windows 8.1, I'm getting about 64 fps, which matches the refresh rate of the QEMU virtual display (64 Hz):

win81_deskdupl

The capture takes minimal time and happens at the same consistent rate as vsync:

win81_gpuview

But on Windows 10, running the same program on the same host, I'm getting only 20 to 30 fps while constantly redrawing the screen (dragging windows, playing an animation or video):

win10_deskdupl

It seems like vsyncs happen less often and are inconsistent, often being skipped or taking longer:

win10_gpuview
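The irregular vsync spacing can also be quantified without GPUView, given a list of frame timestamps. The sketch below is a hypothetical helper (the `IntervalStats` struct and `analyze_intervals` function are illustrative names, not part of the capture program): it rounds each frame-to-frame gap to a whole number of refresh periods and counts every period beyond the first as a missed refresh.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary of frame-to-frame gaps.
struct IntervalStats {
    std::size_t intervals = 0;  // number of gaps examined
    std::size_t missed = 0;     // refresh periods that passed without a new frame
    double mean_ms = 0.0;       // average gap
    double max_ms = 0.0;        // longest gap
};

// timestamps_ms: times of consecutive frames in milliseconds, ascending.
// period_ms: nominal refresh period (15.625 ms for a 64 Hz display).
IntervalStats analyze_intervals(const std::vector<double>& timestamps_ms,
                                double period_ms) {
    IntervalStats s;
    if (timestamps_ms.size() < 2 || period_ms <= 0.0) return s;
    double total = 0.0;
    for (std::size_t i = 1; i < timestamps_ms.size(); ++i) {
        const double gap = timestamps_ms[i] - timestamps_ms[i - 1];
        total += gap;
        if (gap > s.max_ms) s.max_ms = gap;
        // Round the gap to whole refresh periods; anything above one period
        // means at least one refresh went by without a new frame.
        const long periods = std::lround(gap / period_ms);
        if (periods > 1) s.missed += static_cast<std::size_t>(periods - 1);
        ++s.intervals;
    }
    s.mean_ms = total / static_cast<double>(s.intervals);
    return s;
}
```

On a healthy 64 Hz session, `missed` should stay near zero and `mean_ms` near 15.6; a mean around 31 ms or more would correspond to the 20 to 30 fps seen here.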

I've also tried Windows.Graphics.Capture with similar results.

The only clue I have is this: if I set the DWMFRAMEINTERVAL DWORD value under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Terminal Server\WinStations (from https://learn.microsoft.com/en-us/troubleshoot/windows-server/remote/frame-rate-limited-to-30-fps), enable RDP, and then connect to an RDP session, the capture program runs at more than 64 fps!

Is there any way to get DDA to capture at a faster rate in a regular session?

weary-adventurer commented 1 month ago

Another clue: when I call DwmGetCompositionTimingInfo on Windows 10, I get these numbers:

rateRefresh = 64/1 = 64.00
rateCompose = 64/1 = 64.00
qpcRefreshPeriod = 1562500 = 15.62ms

So DWM supposedly also thinks it's running at 64 Hz, but that doesn't appear to be true.

weary-adventurer commented 1 month ago

Windows 10 DDA capture inside an RDP session with the DWMFRAMEINTERVAL registry hack shows consistent vsyncs, and the dwm.exe bars don't stretch up to the vsync lines:

win10_rdp_gpuview

Compared with Windows 10 DDA capture inside a regular session:

image

robmikh commented 1 month ago

I'll need to take a closer look, but my immediate reaction is that this sounds like either a QEMU issue or something lower level than the system compositor. We can't influence the vsync in a way that would make it inconsistent like that. It's even more telling that you can make it behave using RDP, as RDP uses an indirect display driver and likely circumvents the QEMU display driver.

Thanks for the report!

weary-adventurer commented 1 month ago

It also happens with basic VGA (basicdisplay.sys), QXL drivers (qxldod.sys) and no display at all (-vga none), but only starting with Windows 10.

On Windows 8 and 8.1 it gets a consistent 64 fps with QXL drivers, as seen in the first screenshot.

weary-adventurer commented 1 month ago

I've just profiled the same Windows 10 VM with standard VGA (-vga std or -device vga) with this setup:

The test ran the DDA capture program, a GPUView log and UFO Test in a browser.

image image

As usual, connecting to an RDP session and using the DWMFRAMEINTERVAL hack guarantees more than 64 fps:

image

In a normal session, UFO Test measures 64 fps at 64 Hz, but the capture shows only about 30 fps. It seems like maybe the browser is able to render at 64 fps but DWM only renders at 30?

GPUView shows a similar picture but with a few differences. Vsync is still a bit inconsistent, but frames don't arrive on every vsync and are sometimes delayed. There are also a few abnormal peaks:

win10_vga_peaks

Peak zoomed in:

image

After the peaks there are some delays, after which I see these chunks (edited image):

win10_vga_avg

weary-adventurer commented 1 month ago

Another profile from the same Windows 10 VM, this time with VirtIO VGA (-device virtio-vga) and the viogpudo.sys driver (https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/viogpu/viogpudo):

image

Active signal resolution -1 x -1?

image

The RDP session with the DWMFRAMEINTERVAL hack gives 64+ fps, as usual.

The regular session gets about 20-30 fps (same results).

GPUView looks slightly different: there are no more peaks in the DDA capture program, but instead some peaks in dwm.exe:

win10_virtiovga_gpuview

Zoomed in:

win10_virtiovga_2