chromiumembedded / cef

Chromium Embedded Framework (CEF). A simple framework for embedding Chromium-based browsers in other applications.
https://bitbucket.org/chromiumembedded/cef/

The new OSR implementation has an (artificial) limit to the frame rate #1368

Closed magreenblatt closed 1 year ago

magreenblatt commented 10 years ago

Original report by Anonymous.


Original issue 1368 created by fa3rhan on 2014-09-04T21:39:21.000Z:

What steps will reproduce the problem?

  1. Use OSR with CefBrowserSettings.windowless_frame_rate = 60, or run cefclient with '--off-screen-rendering-enabled --off-screen-frame-rate=60'.
  2. Load a website with a perpetual CSS animation or WebGL content.

What is the expected output? What do you see instead?
It should output 60 fps; instead it outputs (at most) 30.

What version of the product are you using? On what operating system?
CEF branch 2062 on Linux

Please provide any additional information below.
Chromium does render at 60 fps (as shown by running WebGL benchmarks), but CropScaleReadbackAndCleanMailbox in render_widget_host_view_osr.cc does not finish in time, so every other frame gets dropped and OSR uses exactly half of the Chromium frames.
When adding another browser that renders at the same time, the frame rate goes down to 15 fps each; with 3 browsers it's 10 each.
Note that this has nothing to do with the CefBrowserSettings.windowless_frame_rate limitation and is not a performance problem; it happens with tiny browser windows all the same.

Here is what I think is the problem:
Chromium runs with vsync by default, which means the GL thread gets blocked when a buffer swap is requested. When CEF queues the download of the frame buffer contents into system memory, the GL thread is already blocked, so the download is done only after the GL thread has finished waiting for vsync, and CEF receives the copy-finished callback too late for the next frame to be used.
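
To make the hypothesis concrete, here is a toy model of it (not Chromium code). It assumes a 60 Hz vsync, a single GL thread shared by all browsers, and that each readback only completes on the vsync tick after the swap that queued it, so the frame produced on that tick is dropped. The function name `SimulateReadback` and the round-robin scheduling are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <vector>

// Toy model: returns how many frames each browser receives over `ticks`
// vsync ticks when every delivered frame costs two ticks of the shared
// GL thread (one for the swap, one for the delayed readback).
std::vector<int> SimulateReadback(int browsers, int ticks) {
  assert(browsers > 0 && ticks >= 0);
  std::vector<int> delivered(browsers, 0);
  int next = 0;          // Browser whose readback is served next (round robin).
  bool gl_busy = false;  // True while a readback is queued behind the vsync wait.
  for (int t = 0; t < ticks; ++t) {
    if (gl_busy) {
      // The queued readback finishes now; the frame produced on this tick
      // arrives too late to be used and is dropped.
      ++delivered[next];
      next = (next + 1) % browsers;
      gl_busy = false;
    } else {
      // A frame is swapped and its readback is queued.
      gl_busy = true;
    }
  }
  return delivered;
}
```

Under these assumptions, 60 ticks give 30 frames to a single browser, 15 each to two browsers and 10 each to three, which matches the numbers reported above. It is only a sketch of the suspected mechanism, not a measurement.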

magreenblatt commented 10 years ago

Original comment by Anonymous.


Comment 1. originally posted by markus.lanner on 2014-09-05T11:03:28.000Z:

known issue
https://code.google.com/p/chromiumembedded/issues/detail?id=1006

magreenblatt commented 10 years ago

Original comment by Anonymous.


Comment 2. originally posted by fa3rhan on 2014-09-05T11:09:20.000Z:

Using a GL/D3D texture/surface provided by the client is a completely different approach and has nothing to do with the vsync problem described in this issue.

magreenblatt commented 10 years ago

Comment 3. originally posted by magreenblatt on 2014-09-05T16:53:37.000Z:

It's possible to set the vsync interval on a per-compositor basis using |compositor_->vsync_manager()->SetAuthoritativeVSyncInterval(...)| in the CefRenderWidgetHostViewOSR constructor. OnSwapCompositorFrame should be called at or near the vsync interval. This currently results in an async call to DelegatedFrameHost::RequestCopyOfOutput which is rate limited based on off-screen-frame-rate.

It would be nice if we could eliminate the async call to RequestCopyOfOutput.
We could then just set the vsync interval based on off-screen-frame-rate.
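
As a rough sketch of what "set the vsync interval based on off-screen-frame-rate" could mean, the interval is simply one second divided by the frame rate. The helper name `VSyncIntervalForFrameRate` is hypothetical and uses std::chrono rather than Chromium's own time types; how the value is then passed to the compositor is not shown.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: vsync interval corresponding to a target frame rate.
// Integer division truncates toward zero, so 60 fps maps to 16666 us.
std::chrono::microseconds VSyncIntervalForFrameRate(int frame_rate) {
  assert(frame_rate > 0);
  return std::chrono::microseconds(1000000 / frame_rate);
}
```

For example, `VSyncIntervalForFrameRate(30).count()` evaluates to 33333.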

magreenblatt commented 10 years ago

Comment 4. originally posted by magreenblatt on 2014-09-11T19:47:32.000Z:

I've started a conversation about it here: https://groups.google.com/a/chromium.org/forum/#!topic/graphics-dev/02jptXZMtTM

magreenblatt commented 10 years ago

Comment 5. originally posted by magreenblatt on 2014-10-23T17:27:22.000Z:

magreenblatt commented 9 years ago

Comment 6. originally posted by magreenblatt on 2014-12-16T21:52:01.000Z:

The attached patch against trunk revision 1959 adds support for the `--enable-begin-frame-scheduling` command-line flag which clamps the frame rate in all processes to the off-screen-frame-rate value. The below statistics were gathered on a 4 core Ubuntu 14 LTS 64-bit VM by running `ps -C cefclient -o %cpu,%mem,cmd` after about 2 minutes. %CPU is the CPU time used divided by the time the process has been running (cputime/realtime ratio), expressed as a percentage.

Performance without enable-begin-frame-scheduling (using vsync=30fps in the browser process and vsync=60fps in all subprocesses):

$ cefclient --off-screen-rendering-enabled --url=http://mrdoob.com/lab/javascript/requestanimationframe/

%CPU %MEM CMD
14.9 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --off-screen-rendering-enabled -
0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
24.3 0.8 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-process --channel=520
0.0 0.1 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-broker
24.8 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --enable-deferre

Performance with enable-begin-frame-scheduling and various frame rate values:

$ cefclient --off-screen-rendering-enabled --url=http://mrdoob.com/lab/javascript/requestanimationframe/ --enable-begin-frame-scheduling --off-screen-frame-rate=X

X=60:
%CPU %MEM CMD
20.5 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --off-screen-rendering-enabled -
0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
29.1 0.8 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-process --channel=510
0.0 0.1 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-broker
22.0 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --enable-begin-f

X=30:
%CPU %MEM CMD
14.7 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --off-screen-rendering-enabled -
0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
20.8 0.8 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-process --channel=511
0.0 0.1 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-broker
15.6 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --enable-begin-f

X=15:
%CPU %MEM CMD
9.7 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --off-screen-rendering-enabled -
0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
13.6 0.8 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-process --channel=512
0.0 0.1 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=gpu-broker
10.0 0.4 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --enable-begin-f

Inspection of trace output shows that CefRenderWidgetHostViewOSR::SendBeginFrame, CefRenderWidgetHostViewOSR::OnFrameCaptureSuccess and CefRenderWidgetHostViewOSR::OnSwapCompositorFrame are called at the specified frame rate frequency while CopyOutputRequest and Compositor::Draw are called at 2x the frequency (expected, since the output is requested an additional time per frame via RequestCopyOfOutput).

In this example (using requestAnimationFrame) CPU usage is highly correlated with the frame rate. Renderer process CPU usage with more complex content will show lower correlation with the frame rate.
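
One way to check the correlation claim is to sum the browser, GPU and renderer %CPU for each frame rate and compute a Pearson coefficient. The sketch below uses only the X=60/30/15 figures quoted above; the type and function names (`Sample`, `TotalCpu`, `FpsCpuCorrelation`, `ReportedSamples`) are made up for illustration, and three data points are of course too few for anything more than a sanity check.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One measurement: frame rate plus %CPU of the three busy processes.
struct Sample {
  double fps;
  double browser_cpu;
  double gpu_cpu;
  double renderer_cpu;
};

// Combined %CPU across browser, GPU and renderer processes.
double TotalCpu(const Sample& s) {
  return s.browser_cpu + s.gpu_cpu + s.renderer_cpu;
}

// Pearson correlation between frame rate and combined %CPU.
double FpsCpuCorrelation(const std::vector<Sample>& samples) {
  assert(samples.size() >= 2);
  double mean_x = 0.0, mean_y = 0.0;
  for (const Sample& s : samples) {
    mean_x += s.fps;
    mean_y += TotalCpu(s);
  }
  mean_x /= samples.size();
  mean_y /= samples.size();
  double sxy = 0.0, sxx = 0.0, syy = 0.0;
  for (const Sample& s : samples) {
    const double dx = s.fps - mean_x;
    const double dy = TotalCpu(s) - mean_y;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  return sxy / std::sqrt(sxx * syy);
}

// Figures copied from the begin-frame-scheduling runs above (GPU enabled).
std::vector<Sample> ReportedSamples() {
  return {{60.0, 20.5, 29.1, 22.0},
          {30.0, 14.7, 20.8, 15.6},
          {15.0, 9.7, 13.6, 10.0}};
}
```

Calling `FpsCpuCorrelation(ReportedSamples())` yields a coefficient close to 1 for these three runs.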

magreenblatt commented 9 years ago

Original comment by Anonymous.


Comment 7. originally posted by fa3rhan on 2014-12-18T20:07:51.000Z:

I've tried the patch and can confirm the CPU/frame rate correlation.

However, visually --off-screen-frame-rate=60 and --off-screen-frame-rate=30 still look exactly the same (as opposed to loading it in Chrome, where the fps difference is like night and day).

magreenblatt commented 9 years ago

Comment 8. originally posted by magreenblatt on 2014-12-19T12:33:37.000Z:

@ comment 7.: In my testing with the current design (using CopyOutputRequest) the Compositor::Draw call cannot complete at much above 60fps, so OnPaint is only being called at about 30fps. Specifying a higher frame rate with this design results in higher CPU usage for no gain (because the additional frames are dropped).
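
The arithmetic behind this can be written down as a small model: if Compositor::Draw tops out at roughly 60 calls per second and each delivered frame needs two draws (the regular one plus the extra one triggered by RequestCopyOfOutput), OnPaint cannot exceed about 30 fps. The function below is only a sketch of that reasoning; its name and parameters are assumptions, and the 60 Hz ceiling is the approximate figure from this comment rather than a hard limit.

```cpp
#include <algorithm>
#include <cassert>

// Simplified model: the OnPaint rate is the requested rate, capped by how
// many complete frames the compositor can draw per second.
double EffectiveOnPaintRate(double requested_fps,
                            double max_draw_rate,
                            int draws_per_frame) {
  assert(requested_fps > 0.0 && max_draw_rate > 0.0 && draws_per_frame > 0);
  return std::min(requested_fps, max_draw_rate / draws_per_frame);
}
```

With `draws_per_frame = 2`, requesting 60 or 30 fps both give 30, which would explain why the two settings look identical. With a single draw per frame the same model allows the full 60.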

magreenblatt commented 9 years ago

Comment 9. originally posted by magreenblatt on 2014-12-19T14:02:24.000Z:

Looking at performance now with GPU disabled. This avoids expensive readback from the GPU in exchange for losing some features (3D CSS is supported but not WebGL).

Windowed rendering performance with current trunk (no changes) with GPU disabled (all processes at 60fps):

$ cefclient --url=http://mrdoob.com/lab/javascript/requestanimationframe/ --disable-gpu --disable-gpu-compositing
%CPU %MEM CMD
10.8 0.8 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --url=http://mrdoob.com/lab/javascr
0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
17.4 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --disable-gpu-compo

Performance with current trunk (no changes) with GPU disabled (browser process at 30fps, other processes at 60fps):

$ cefclient --url=http://mrdoob.com/lab/javascript/requestanimationframe/ --off-screen-rendering-enabled --disable-gpu --disable-gpu-compositing
%CPU %MEM CMD
26.2 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --url=http://mrdoob.com/lab/javascr
0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
15.5 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --disable-gpu-compo

Performance with 1916 branch (old software-only OSR implementation, no 3D CSS or WebGL support):

$ cefclient --url=http://mrdoob.com/lab/javascript/requestanimationframe/ --off-screen-rendering-enabled
%CPU %MEM CMD
10.1 0.8 /home/marshall/Downloads/cef_binary_3.1916.1931_linux64_client/Release/cefclient --url=http://mrdoob.co
0.0 0.1 /home/marshall/Downloads/cef_binary_3.1916.1931_linux64_client/Release/cefclient --url=http://mrdoob.co
0.0 0.3 /home/marshall/Downloads/cef_binary_3.1916.1931_linux64_client/Release/cefclient --type=zygote --lang=e
0.0 0.0 /home/marshall/Downloads/cef_binary_3.1916.1931_linux64_client/Release/cefclient --type=zygote --lang=e
6.6 0.4 /home/marshall/Downloads/cef_binary_3.1916.1931_linux64_client/Release/cefclient --type=renderer --disa

Attached is a new patch (currently tested on Windows and Linux only) that uses a custom SoftwareOutputDevice implementation which supports direct rendering to bitmap from OnSwapCompositorFrame and consequently avoids the extra Compositor::Draw when GPU is disabled (it's no longer necessary to call RequestCopyOfOutput).
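
The idea of direct rendering can be illustrated with a stand-alone sketch. The class below is a hypothetical stand-in, not Chromium's SoftwareOutputDevice interface and not the code in the patch: the compositor draws into a bitmap owned by the device, and finishing the paint hands that same bitmap to the client, so no second draw or copy request is needed. All names here (`DirectBitmapOutputDevice`, `BeginPaint`, `EndPaint`, `PaintCallback`) are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical output device that renders straight into a client-visible
// bitmap instead of requiring a separate copy of the compositor output.
class DirectBitmapOutputDevice {
 public:
  using PaintCallback = std::function<
      void(const std::vector<std::uint32_t>& pixels, int width, int height)>;

  DirectBitmapOutputDevice(int width, int height, PaintCallback on_paint)
      : width_(width),
        height_(height),
        pixels_(static_cast<std::size_t>(width) * height, 0),
        on_paint_(std::move(on_paint)) {
    assert(width > 0 && height > 0);
  }

  // The compositor draws directly into this buffer.
  std::vector<std::uint32_t>& BeginPaint() { return pixels_; }

  // Called once per swapped frame; passes the finished bitmap to the client.
  void EndPaint() {
    if (on_paint_) {
      on_paint_(pixels_, width_, height_);
    }
  }

 private:
  int width_;
  int height_;
  std::vector<std::uint32_t> pixels_;
  PaintCallback on_paint_;
};
```

In this sketch the client callback plays the role of OnPaint: one draw into the buffer, one `EndPaint()`, one callback per frame.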

Performance with new patch with GPU disabled (browser process at 30fps, other processes at 60fps):

$ cefclient --url=http://mrdoob.com/lab/javascript/requestanimationframe/ --off-screen-rendering-enabled --disable-gpu --disable-gpu-compositing
%CPU %MEM CMD
13.0 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --url=http://mrdoob.com/lab/javascr
0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
17.0 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --disable-gpu-compo

Performance with new patch with GPU disabled and begin frame scheduling enabled with various frame rates (all processes at Xfps):

$ cefclient --off-screen-rendering-enabled --url=http://mrdoob.com/lab/javascript/requestanimationframe/ --disable-gpu --disable-gpu-compositing --enable-begin-frame-scheduling --off-screen-frame-rate=X

X=60
%CPU %MEM CMD
15.1 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --url=http://mrdoob.com/lab/javascr
0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
17.4 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --disable-gpu-compo

X=30
%CPU %MEM CMD
9.4 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --url=http://mrdoob.com/lab/javascr
0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
11.2 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --disable-gpu-compo

X=15
%CPU %MEM CMD
5.5 0.9 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --url=http://mrdoob.com/lab/javascr
0.0 0.3 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
0.0 0.0 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=zygote --lang=en-US
6.2 0.5 /home/marshall/code/chromium_git/chromium/src/out/Release/cefclient --type=renderer --disable-gpu-compo

Conclusions:

magreenblatt commented 9 years ago

Comment 10. originally posted by magreenblatt on 2014-12-20T19:57:18.000Z:

@ comment 9.: Attached is an updated patch that's also tested to work with Retina displays on OS X.

magreenblatt commented 9 years ago

Comment 11. originally posted by magreenblatt on 2014-12-20T20:19:11.000Z:

Some OSR unit tests are failing with the custom SoftwareOutputDevice implementation (run `cef_unittests --gtest_filter=OSRTest.* --disable-gpu --disable-gpu-compositing`). Need to fix those before merging this patch.

magreenblatt commented 9 years ago

Comment 12. originally posted by magreenblatt on 2014-12-29T17:24:36.000Z:

@ comment 11.: Need to test all of the following combinations:

$ cef_unittests --gtest_filter=OSRTest.*
$ cef_unittests --gtest_filter=OSRTest.* --enable-begin-frame-scheduling
$ cef_unittests --gtest_filter=OSRTest.* --disable-gpu --disable-gpu-compositing
$ cef_unittests --gtest_filter=OSRTest.* --enable-begin-frame-scheduling --disable-gpu --disable-gpu-compositing
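
The four command lines above are the cross product of two switches (begin frame scheduling on/off, GPU on/off). A small generator like the following can keep that matrix in sync if more switches are added later; the function name `OsrTestCommandLines` is hypothetical, and the flags are exactly the ones listed above.

```cpp
#include <initializer_list>
#include <string>
#include <vector>

// Builds the OSR unit test command lines for every combination of
// begin frame scheduling (on/off) and GPU (enabled/disabled).
std::vector<std::string> OsrTestCommandLines() {
  const std::string base = "cef_unittests --gtest_filter=OSRTest.*";
  std::vector<std::string> commands;
  for (bool disable_gpu : {false, true}) {
    for (bool begin_frame : {false, true}) {
      std::string cmd = base;
      if (begin_frame) {
        cmd += " --enable-begin-frame-scheduling";
      }
      if (disable_gpu) {
        cmd += " --disable-gpu --disable-gpu-compositing";
      }
      commands.push_back(cmd);
    }
  }
  return commands;
}
```

The generated strings come out in the same order as the list above.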

magreenblatt commented 9 years ago

Comment 13. originally posted by magreenblatt on 2015-01-01T16:53:37.000Z:

Trunk revision 1960 adds support for begin frame scheduling and direct rendering when GPU compositing is disabled.
