robmikh / Win32CaptureSample

A simple sample using the Windows.Graphics.Capture APIs in a Win32 application.
MIT License

Performance of the Windows.Graphics.Capture API #45

Closed LordTrololo closed 2 years ago

LordTrololo commented 2 years ago

I would like to ask what performance you get when using the Windows.Graphics.Capture API.

Here is how I measure the FPS coming into OnFrameArrived.

void SimpleCapture::OnFrameArrived(winrt::Direct3D11CaptureFramePool const& sender, winrt::IInspectable const&)
{
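    // Needs <atomic> and <chrono>; add std::atomic<std::chrono::steady_clock::time_point> lastFrameTime to SimpleCapture.h.
    // steady_clock is monotonic, so wall-clock adjustments cannot distort the interval.
    auto const currentTime = std::chrono::steady_clock::now();

    // exchange() stores the new timestamp and returns the previous one in a single atomic step.
    auto const previousTime = lastFrameTime.exchange(currentTime);
    auto durationMilisecondsSinceLastFrame = std::chrono::duration<double, std::milli>(currentTime - previousTime).count();

    // Floating-point division avoids truncation, and the guard avoids dividing by zero.
    auto passFrameFPS = durationMilisecondsSinceLastFrame > 0.0 ? 1000.0 / durationMilisecondsSinceLastFrame : 0.0;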
...
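
A single frame-to-frame interval is noisy, so it can help to average over several frames. The following is a minimal, portable sketch of that idea, not code from the sample: `FrameRateMeter` and its `window` parameter are hypothetical names, and the class is not thread-safe, so it assumes all calls come from one thread or are otherwise serialized.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>

// Hypothetical helper, not part of Win32CaptureSample: averages the frame
// rate over the most recent `window` frame intervals.
class FrameRateMeter
{
public:
    using Clock = std::chrono::steady_clock;

    explicit FrameRateMeter(std::size_t window) : m_window(window)
    {
        assert(window > 0);
    }

    // Record one frame arrival. Returns the average FPS over the stored
    // intervals, or 0.0 until at least two frames have been seen.
    double OnFrame(Clock::time_point now = Clock::now())
    {
        m_arrivals.push_back(now);
        if (m_arrivals.size() > m_window + 1)
        {
            m_arrivals.pop_front(); // keep window + 1 timestamps = window intervals
        }
        if (m_arrivals.size() < 2)
        {
            return 0.0;
        }
        std::chrono::duration<double> const elapsed = m_arrivals.back() - m_arrivals.front();
        if (elapsed.count() <= 0.0)
        {
            return 0.0; // avoid dividing by zero on identical timestamps
        }
        return static_cast<double>(m_arrivals.size() - 1) / elapsed.count();
    }

private:
    std::size_t m_window;
    std::deque<Clock::time_point> m_arrivals;
};
```

Calling `meter.OnFrame()` once per `OnFrameArrived` invocation would then yield a smoothed rate. The explicit `now` parameter exists only so the class can be exercised with synthetic timestamps.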

Here are my results:

1) With this sample, capturing Notepad++, passFrameFPS is around 30 FPS.

2) In another project (C++ Win32), capturing the monitor, I also get around 30 FPS.

I would like to know what performance other people get. Also, is there a way to increase the rate at which OnFrameArrived is invoked?

Thanks

robmikh commented 2 years ago

Generally it depends on the refresh rate of your primary monitor and how often the DWM decides to render. If we detect no changes, we may decide not to render a new frame.

In Windows 11, this got more aggressive. What version/build of Windows are you trying this on?

I have a test that presents a swap chain as fast as it can and measures the render rate and capture rate. Generally the capture rate is locked to the refresh rate of the primary monitor (in my case, 144 Hz). If you're curious you can try it yourself:

https://github.com/robmikh/captureadhoctest

CaptureAdHocTest.exe fullscreen-rate -fw
<press ESC after a few seconds>

LordTrololo commented 2 years ago

Here are the results I get when using CaptureAdHocTest.exe fullscreen-rate -fw

Average rendered frame time: 0.191875ms
Number of rendered frames: 56329
Average capture frame time: 40.531015ms
Number of capture frames: 266
Test result: PASSED

If we take averageFPS = 1000 / averageCaptureFrameTime, then I get around 25 FPS in your test (1000 / 40.531015 ≈ 24.7). This is consistent with the roughly 30 FPS I get using my own test.
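
The conversion above can be written out as a small sketch. `FpsFromFrameTimeMs` is a hypothetical helper name, and the two constants are simply the averages reported by the test run in this comment.

```cpp
#include <cassert>

// Convert an average frame interval in milliseconds to frames per second.
// Returns 0.0 for a zero interval instead of dividing by zero.
double FpsFromFrameTimeMs(double frameTimeMs)
{
    assert(frameTimeMs >= 0.0); // a negative interval indicates a measurement bug
    return frameTimeMs > 0.0 ? 1000.0 / frameTimeMs : 0.0;
}

// Average capture frame time reported above: 40.531015 ms.
const double kCaptureFps = FpsFromFrameTimeMs(40.531015); // about 24.7
// Average rendered frame time reported above: 0.191875 ms.
const double kRenderFps = FpsFromFrameTimeMs(0.191875); // about 5212
```

The gap between the two rates shows that the test window rendered far faster than frames were being captured.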

I have an i7, an NVIDIA RTX 2060 GPU, and Windows 10 Pro, OS Build 19044.1586.

What do you get when you run CaptureAdHocTest.exe fullscreen-rate -fw?

P.S. While I run your test I only see a red screen with a golden border. That is normal, I presume?

robmikh commented 2 years ago

On a similar build with an RTX 2070 SUPER I get (over roughly 10 seconds):

Average rendered frame time: 3.666113ms
Number of rendered frames: 2180
Average capture frame time: 6.964674ms
Number of capture frames: 1147

Are you using a laptop by chance? Laptops will often have their panel connected to the integrated GPU, so if you capture using the dedicated card we're forced to transfer the bits over system memory. This will have a performance penalty. It's recommended that you use the same adapter the DWM is using. EDIT: There are some laptops that behave differently under certain conditions.

Could you share a DxDiag from your machine? If you don't want to share all of that data, I'm mainly interested in the model of computer and your monitor configuration (resolution, refresh rate, internal/external).

And yes, what you're seeing visually with the test is expected. I'm just clearing to red and presenting.

LordTrololo commented 2 years ago

Currently I am working remotely (I am connected to my PC via Remote Desktop Connection), so maybe that's an issue as well: [image]

I have to check it again on Tuesday when I am back in the office. But now at least I have your benchmark to make a rough comparison, thanks for that.

robmikh commented 2 years ago

Ah! RDP is probably the issue, I get similar numbers over RDP. I think the IDD that's used is capped at 30 fps.

robmikh commented 2 years ago

I'm pretty certain RDP is the culprit. I just updated CaptureAdHocTest to add a command that prints information about the connected monitors. On my local system, I get:

0 - ROG PG278Q - 144 Hz
1 - DELL U2715H - 59 Hz
2 - DELL U2715H - 59 Hz

When I use this machine to RDP into another machine, the remote machine reports this:

0 -  - 32 Hz
1 -  - 32 Hz
2 -  - 32 Hz

You can try this yourself with:

CaptureAdHocTest.exe monitor-info

I'm going to close this issue for now. If you find that you still see this locally, feel free to reopen this issue.

LordTrololo commented 2 years ago

Just one final question.

Is it possible to influence the frame rate at which the capturer delivers frames? Can this value be set, or is it always the monitor refresh rate?

robmikh commented 2 years ago

Currently the only way to affect the capture rate is to manipulate the lifetime of the Direct3D11CaptureFrame objects. If the DWM is starved of frames, we won't draw a new frame.

What's the scenario? Are you trying to throttle or make the compositor render more often?

LordTrololo commented 2 years ago

We use the capturer to feed a WebRTC connection with new frames. It would be beneficial if we could, for example, decrease the FPS from 60 to 30 (because we want to sacrifice speed for quality/bandwidth). Right now we achieve this by simply skipping some frames in the feeding process.

Ideally, it would be nice to set up the capturer with a target frame rate that it then uses.
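
The frame-skipping workaround described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code used in that project: `FrameThrottle` and `ShouldForward` are hypothetical names, and nothing here is part of Windows.Graphics.Capture or WebRTC. It drops any frame that arrives before the next deadline derived from the target rate.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: decides which captured frames to forward so the
// forwarded stream approaches a target frame rate.
class FrameThrottle
{
public:
    using Clock = std::chrono::steady_clock;

    explicit FrameThrottle(double targetFps)
    {
        assert(targetFps > 0.0);
        m_interval = std::chrono::duration_cast<Clock::duration>(
            std::chrono::duration<double>(1.0 / targetFps));
    }

    // Returns true if the frame arriving at `now` should be forwarded.
    bool ShouldForward(Clock::time_point now = Clock::now())
    {
        if (m_hasNext && now < m_next)
        {
            return false; // too soon: drop this frame
        }
        // Advance from the previous deadline so small delays do not
        // accumulate; resynchronize to `now` on first use or after a stall.
        if (!m_hasNext || now - m_next > m_interval)
        {
            m_next = now;
        }
        m_next += m_interval;
        m_hasNext = true;
        return true;
    }

private:
    Clock::duration m_interval{};
    Clock::time_point m_next{};
    bool m_hasNext = false;
};
```

With a 60 FPS source and `FrameThrottle throttle(30.0)`, roughly every second frame is forwarded. Because the comparison is strict, a frame that arrives slightly before its deadline is dropped, so a real pipeline would probably want a small tolerance to cope with arrival jitter.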

robmikh commented 2 years ago

Good to know! Thanks for the feedback. I can't promise anything at this time, but it's something we're looking at.