cefsharp / CefSharp

.NET (WPF and Windows Forms) bindings for the Chromium Embedded Framework
http://cefsharp.github.io/

GPU and WPF #654

Closed bjarteskogoy closed 8 years ago

bjarteskogoy commented 9 years ago

Would the solution this article describes make GPU rendering possible for WPF? http://www.codeproject.com/Articles/28526/Introduction-to-D-DImage

bjarteskogoy commented 9 years ago

Or this? http://www.magpcss.org/ceforum/viewtopic.php?f=8&t=11635

amaitland commented 9 years ago

What exactly are you trying to achieve? As of 2062 OSR rendering has GPU acceleration where possible (there's a number of factors like graphics card features that come into play)

If you need extra speed then try 2171, it's noticeably faster

bjarteskogoy commented 9 years ago

I didn't know OSR has GPU acceleration now. I will distribute the appropriate DLLs for that. What I was actually wondering was whether it would be possible to render directly via direct(something) which could be embedded in WPF without airspace issues. If that could give "windowed" mode instead of offscreen with WPF, maybe it would give performance/memory benefits?

amaitland commented 9 years ago

May as well just use the WindowsFormsHost with the WinForms control.

bjarteskogoy commented 9 years ago

But that gives airspace issues.

amaitland commented 9 years ago

> But that gives airspace issues.

That was supposed to be fixed in WPF 4.5; I guess it never eventuated.

Anyway, to support the full set of WPF features, such as transparency, I'm not sure anything else is possible.

You could possibly take the buffer passed from OnPaint and look to render it directly somehow. https://github.com/cefsharp/CefSharp/blob/master/CefSharp.Core/Internals/RenderClientAdapter.h#L86

I think the current implementation could be cleaned up a bit, and possibly tweaked to improve performance.

amaitland commented 9 years ago

I think changing to http://msdn.microsoft.com/en-us/library/system.windows.media.imaging.writeablebitmap%28v=vs.110%29.aspx would probably give quite a boost in performance whilst implementing the same fundamentals

amaitland commented 9 years ago

At the same time implementing dirtyRects would help as well.

bjarteskogoy commented 9 years ago

Where would one start on this?

amaitland commented 9 years ago

BitmapInfo is the class which bridges the C# - C++ divide, representing the bitmap received from CEF. If you look at its usages you can get an idea of how it fits together.

https://github.com/cefsharp/CefSharp/blob/master/CefSharp/Internals/BitmapInfo.cs

Also have a look here https://github.com/cefsharp/CefSharp/blob/master/CefSharp.Wpf/ChromiumWebBrowser.cs#L534

You'll see that the content for the control is an image. Basically you'd set the Source to be a WriteableBitmap, then write the data provided by CEF in the OnPaint method. Currently a new InteropBitmap is created and assigned to the image Source every time OnPaint is called. Using the WriteableBitmap, you could then take the dirtyRects info from CEF and only update the relevant areas, which in theory should be faster again.
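
To make the dirty-rect idea concrete, here is a minimal sketch (not CefSharp code) of copying only the region covered by one dirty rectangle from a source frame into a destination buffer, such as a bitmap's back buffer. The names `DirtyRect`, `CopyDirtyRect` and `kBytesPerPixel` are hypothetical, and the 32-bit BGRA layout is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical rectangle type, in pixels (x/y/width/height).
struct DirtyRect {
    int x;
    int y;
    int width;
    int height;
};

// Assumption: 32-bit BGRA pixels.
constexpr int kBytesPerPixel = 4;

// Copies the pixels inside `rect` from `src` to `dst`.
// Both buffers hold a frame of frameWidth x frameHeight pixels, and each
// has its own stride (bytes per row), which may include padding.
void CopyDirtyRect(const std::uint8_t* src, int srcStride,
                   std::uint8_t* dst, int dstStride,
                   int frameWidth, int frameHeight,
                   DirtyRect rect)
{
    assert(srcStride >= frameWidth * kBytesPerPixel);
    assert(dstStride >= frameWidth * kBytesPerPixel);

    // Clamp the rectangle to the frame so we never read or write out of bounds.
    const int left = std::max(rect.x, 0);
    const int top = std::max(rect.y, 0);
    const int right = std::min(rect.x + rect.width, frameWidth);
    const int bottom = std::min(rect.y + rect.height, frameHeight);
    if (left >= right || top >= bottom) {
        return; // Nothing visible to copy.
    }

    const std::size_t rowBytes =
        static_cast<std::size_t>(right - left) * kBytesPerPixel;
    const std::size_t xOffset =
        static_cast<std::size_t>(left) * kBytesPerPixel;

    // Copy one row of the rectangle at a time.
    for (int y = top; y < bottom; ++y) {
        const std::uint8_t* srcRow =
            src + static_cast<std::size_t>(y) * srcStride + xOffset;
        std::uint8_t* dstRow =
            dst + static_cast<std::size_t>(y) * dstStride + xOffset;
        std::memcpy(dstRow, srcRow, rowBytes);
    }
}
```

Whether this beats a single full-frame copy depends on how large the dirty regions usually are, so it is worth measuring rather than assuming.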

bjarteskogoy commented 9 years ago

Useful info maybe: http://www.charlespetzold.com/blog/2012/08/WriteableBitmap-Pixel-Arrays-in-CSharp-and-CPlusPlus.html

amaitland commented 9 years ago

> Interestingly, the C++ program only has better performance than the C# program when compiled in the Release configuration, and then it had demonstrably better performance. Here are the frame rates:

Probably have to read the article in full, though it does hint that performing some of the operations in C++ may have some performance benefits :+1:

amaitland commented 9 years ago

Maybe worth looking at http://writeablebitmapex.codeplex.com/

bjarteskogoy commented 9 years ago

Maybe let IRenderBrowser just expose a WriteableBitmap-type property and let RenderClientAdapter do the job?

amaitland commented 9 years ago

Something like that. It may well be worth abstracting out the rendering, so multiple implementations can be supported in parallel.
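
A rough sketch of what that kind of abstraction could look like: a small interface that the paint callback talks to, with interchangeable implementations behind it. None of these names (`IFrameRenderer`, `BufferCopyRenderer`, `BrowserHost`) exist in CefSharp; they are hypothetical, and the tightly packed BGRA frame is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical rendering interface. A bitmap-based or D3D-based
// implementation could be plugged in without the host changing.
class IFrameRenderer {
public:
    virtual ~IFrameRenderer() = default;
    // Receives a tightly packed 32-bit BGRA frame (assumed layout).
    virtual void Render(const std::uint8_t* buffer, int width, int height) = 0;
};

// Software path: keeps a CPU-side copy of the most recent frame.
class BufferCopyRenderer : public IFrameRenderer {
public:
    void Render(const std::uint8_t* buffer, int width, int height) override {
        const std::size_t bytes =
            static_cast<std::size_t>(width) * static_cast<std::size_t>(height) * 4;
        pixels_.assign(buffer, buffer + bytes);
    }

    const std::vector<std::uint8_t>& Pixels() const { return pixels_; }

private:
    std::vector<std::uint8_t> pixels_;
};

// Hypothetical host: owns whichever renderer was chosen at start-up and
// forwards every paint notification to it.
class BrowserHost {
public:
    explicit BrowserHost(std::unique_ptr<IFrameRenderer> renderer)
        : renderer_(std::move(renderer)) {
        assert(renderer_ != nullptr);
    }

    void OnPaint(const std::uint8_t* buffer, int width, int height) {
        renderer_->Render(buffer, width, height);
    }

private:
    std::unique_ptr<IFrameRenderer> renderer_;
};
```

With this shape, comparing two rendering strategies comes down to constructing the host with a different `IFrameRenderer`.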

amaitland commented 9 years ago

Stepping through the current code, it's actually more efficient than I originally thought: it reuses the same InteropBitmap by copying the buffer directly into its file handle. I don't believe any of that would be necessary with WriteableBitmap.

bjarteskogoy commented 9 years ago

I have started on something here: https://github.com/bjarteskogoy/CefSharp/commit/b311da92196971f8619d09449396b8a189ecb08f

My C++ isn't all that, so I have no clue how to do what's needed in https://github.com/bjarteskogoy/CefSharp/commit/b311da92196971f8619d09449396b8a189ecb08f#diff-abed174f3903eb62b98999938e4f67d0R162

bjarteskogoy commented 9 years ago

> 1. Support for delivering the final composited result to a GL/D3D texture/surface provided by the client in order to minimize copies and CPU load (https://code.google.com/p/chromiumembedded/issues/detail?id=1006#c7)

Could this render directly to a D3DImage, when it's implemented?

amaitland commented 9 years ago

> Could this render directly to a D3DImage, when it's implemented?

Don't see why not, need someone who knows D3D. The big question is when will it be implemented? My guess is we'd be waiting a while. If we do go down this path then I'd vote for a pluggable rendering architecture.

amaitland commented 9 years ago

> My C++ isn't all that, so I have no clue how to do what's needed in bjarteskogoy@b311da9#diff-abed174f3903eb62b98999938e4f67d0R162

http://msdn.microsoft.com/en-us/library/system.windows.media.imaging.writeablebitmap.backbuffer%28v=vs.110%29.aspx

Maybe copy the buffer directly into WriteableBitmap.BackBuffer to start with?

amaitland commented 9 years ago

Hmm, I did some rough changes and the code will end up being much cleaner, but I'm not sure we'll actually get any performance gain.

I did have another look at D3DImage, and it may be possible after all. So I'll look into it a little further and let you know what I find.

amaitland commented 9 years ago

Possibly a useful link http://jmorrill.hjtcentral.com/Home/tabid/428/EntryId/437/Direct3D-10-11-Direct2D-in-WPF.aspx

amaitland commented 9 years ago

I think we'd need to use something like SharpDX to implement this now. So I'd say shelve this and look at it further when upstream support eventuates.

bjarteskogoy commented 9 years ago

Are you suggesting dropping the WriteableBitmap too for now?

amaitland commented 9 years ago

Having done some research into WriteableBitmap vs InteropBitmap, I'm not convinced it's worth the effort. Are you still using .NET 4.0? Looks like upgrading to .NET 4.5 would give a boost in performance.

https://connect.microsoft.com/visualstudio/feedback/details/585875/interopbitmap-is-way-less-performant-in-net-4-0-vs-net-3-5

It's still probably worth cleaning up the implementation we do have to see if we can squeeze out a few extra cycles.

bjarteskogoy commented 9 years ago

OK. We're still on .NET 4. I'm not sure when we can drop Windows XP/2003 (a marketing discussion, obviously). If the following statement is true, would it reduce the memory usage if we implement the WriteableBitmap solution and do the work in C++? "However, there is also a significant difference in how a C++ program updates a WriteableBitmap. The C++ program does not need to create a local byte array for the pixels and then transfer this array of pixels to the bitmap through a Stream." (http://www.charlespetzold.com/blog/2012/08/WriteableBitmap-Pixel-Arrays-in-CSharp-and-CPlusPlus.html)

bjarteskogoy commented 9 years ago

By the way, if the InteropBitmap performance issue is related to the installed runtime version and not the project target version, we could always install .NET 4.5 as a prerequisite instead of .NET 4 on supported Windows versions.

amaitland commented 9 years ago

> If the following statement is true, would it reduce the memory usage if we implement the writeablebitmap solution and do the stuff in c++?

It's entirely possible, seems like there's so many different opinions about which is faster and for which scenario. It's probably a matter of fully implementing it to compare the two. Only thing that people can agree on is that D3D is the fastest.

amaitland commented 9 years ago

> By the way, if the InteropBitmap performance issue is related to the installed runtime version and not the project target version, we could always install .NET 4.5 as a prerequisite instead of .NET 4 on supported Windows versions.

I'd say for the small cost of trying it out on a machine to see what sort of performance increase it makes, that's gotta be the cheapest option time wise.

amaitland commented 9 years ago

> The C++ program does not need to create a local byte array for the pixels and then transfer this array of pixels to the bitmap through a Stream

With InteropBitmap it's currently copying directly into its BackBuffer, so I imagine it's a similar sort of operation speed-wise.

bjarteskogoy commented 9 years ago

Ok. So the only obvious difference would be the dirtyrect support?

bjarteskogoy commented 9 years ago

InteropBitmap on .Net 4.5 has support for invalidating rectangles too, http://msdn.microsoft.com/en-us/library/hh141010(v=vs.110).aspx.

amaitland commented 9 years ago

> InteropBitmap on .Net 4.5 has support for invalidating rectangles too, http://msdn.microsoft.com/en-us/library/hh141010(v=vs.110).aspx.

Nice to know :+1: It would mean upgrading to .NET 4.5, and I'm unsure when that will happen.

Did you get a chance to test on a machine with .Net 4.5 installed?

amaitland commented 9 years ago

For the fun of it I've implemented WriteableBitmap, and performance-wise it's probably a little slower, even with dirty rect support, which was quite surprising.

If you're interested then check out https://github.com/amaitland/CefSharp/commits/enhancement/writeablebitmap

trevorlinton commented 9 years ago

Hi All,

I've been playing around with IWebBrowser (the C OLE MSHTML object) for Internet Explorer and how I might be able to get it to avoid the airspace issues in WPF (while being hardware accelerated).

I've recently started playing with CEFSharp and saw you were bitblt'ing with the interop image. I was wondering if you've thought of using a D3DImage object and bouncing the HDC. Surprisingly, in Windows 7+, if you bounce an HDC to a D3DImage, the underlying surfaces can recognize that you're actually making a DirectX->DirectX call, and so long as there isn't any translation of surface types, it takes place in video memory (huzzah!).

You can see a (once again, playground) example here: https://gist.github.com/trevorlinton/7abac1c5f044be153833 The results were fairly promising although not perfect. I'm curious if anyone on this thread has had any experience and if I should try my luck at implementing this on CEFSharp.

Here's the gist: https://gist.github.com/trevorlinton/7abac1c5f044be153833

trevorlinton commented 9 years ago

A little embarrassed at the code, I decided to reduce it to the most important parts:

  1. Initialize a D3D device, get its surface DC
  2. Get the DC of the hwnd (or if it's offscreen, even better, just a memory DC; hardware accelerated off of a DirectX surface is optimal here).

Then, on any invalidation:

  // WPF Managed C++ code
  void SomeInvalidationEvent(Object^ sender, EventArgs^ e) {
    if (d3dimg->IsFrontBufferAvailable)
    {
      // TryLock began to fail persistently (timeouts of 2 and 22 were tried),
      // and once it locked it seemed unable to re-lock.
      // Lock and TryLock both count toward the lock depth, so only Lock is
      // called here and it is paired with exactly one Unlock.
      d3dimg->Lock();
      RenderD3D();
      d3dimg->AddDirtyRect(DirtyRect);
      d3dimg->Unlock();
    }
  }

  void RenderD3D() {
    // Note, this is C++ but interops for C# are available.
    // d3ddev is an initialized DirectX device set as the back buffer. If there's
    // an existing DirectX back buffer for CEF, that would make this all the simpler,
    // and this entire function probably isn't even necessary.
    // hdcFrom is the HDC coming off of CEF somewhere (from an hwnd? from internals?).
    // Sketch only: in Direct3D 9, GetDC/ReleaseDC are methods of the surface
    // (IDirect3DSurface9) rather than of the device.
    if (SUCCEEDED(d3ddev->BeginScene()))
    {
      HDC surfaceDC;
      if (SUCCEEDED(d3ddev->GetDC(&surfaceDC)))
      {
        BitBlt(surfaceDC, 0, 0, DpiWidth, DpiHeight, hdcFrom, 0, 0, SRCCOPY);
        d3ddev->ReleaseDC(surfaceDC);
      }
      d3ddev->EndScene();
    }
  }

amaitland commented 9 years ago

> I'm curious if anyone on this thread has had any experience and if I should try my luck at implementing this on CEFSharp.

I think the general plan was to wait for CEF issue 1006 to be resolved (new link as they moved to bitbucket https://bitbucket.org/chromiumembedded/cef/issue/1006)

Are there additional libs required for the D3D side of things?

amaitland commented 9 years ago

If you can knock something up then feel free to submit a PR :+1:

trevorlinton commented 9 years ago

@amaitland I don't believe so. If I'm reading the issue tracker correctly, they plan on allowing the client to provide the D3D device (which is great, and easy from WPF without any new libs). However, my gut says they're using (or will be using) only DX11, which will require some interop to convert the pointer structure, but I'm not sure at the get-go whether that requires linking against the DX lib or if it's a shared library call. I suppose I'll keep an eye on this. I might as well proceed with the work on CEFSharp's side and, for the moment, tie it to the shared memory area that CEF is writing to. It won't be accelerated, but it could very likely be faster than memory copies in WPF. I'll give it a go and let you know the results.

amaitland commented 9 years ago

> I'll give it a go and let you know the results

Cool, hopefully the code is relatively easy to follow now, there's a couple of parts that I think still could use a rewrite. Anyways, if you have any questions then let me know.

amaitland commented 9 years ago

@trevorlinton Any followup questions? I know very little about DX, so I'm curious to see what's involved.

trevorlinton commented 9 years ago

@amaitland Yes, I have it implemented, but I have a few bugs to work out, and then I'll submit a pull request.

A larger issue is that the bitblt method currently used seems to rely on events (not bound to rendering) or timers instead of redrawing directly from the rendering event (I'm unsure if I made a mistake or if it's built that way; it might have to do with redirecting to the UI thread). It's making it difficult to get quantitative data on how much faster (or whether it's faster) it is than the bitblt method used. Just from eyeballing it, everything seems faster, and I know the memory use is lower, so that's a plus.
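
One possible way to get comparable numbers for both approaches is to timestamp every completed paint and average over a sliding window. The class below is a hypothetical sketch (`FrameRateCounter` is not part of CefSharp or CEF); it takes timestamps in seconds from any monotonic clock, for example `std::chrono::steady_clock`.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical helper: records when each frame was painted and reports
// the average frame rate over the most recent `windowSeconds`.
class FrameRateCounter {
public:
    explicit FrameRateCounter(double windowSeconds) : window_(windowSeconds) {
        assert(windowSeconds > 0.0);
    }

    // Call once per completed paint with a monotonic timestamp in seconds.
    void OnFrame(double nowSeconds) {
        stamps_.push_back(nowSeconds);
        // Drop timestamps that have fallen out of the sliding window.
        while (!stamps_.empty() && nowSeconds - stamps_.front() > window_) {
            stamps_.pop_front();
        }
    }

    // Average frames per second across the retained timestamps.
    // Returns 0 until at least two frames have been recorded.
    double FramesPerSecond() const {
        if (stamps_.size() < 2) {
            return 0.0;
        }
        const double span = stamps_.back() - stamps_.front();
        if (span <= 0.0) {
            return 0.0;
        }
        return static_cast<double>(stamps_.size() - 1) / span;
    }

private:
    double window_;
    std::deque<double> stamps_;
};
```

Calling `OnFrame` from the same point in both code paths (right after the frame has been handed to WPF) keeps the comparison like for like.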

amaitland commented 9 years ago

If you could base your PR on the 2272 branch that would be ideal :+1:

I'm unsure which timer you're referring to exactly, the tooltip one? If you submit a PR now you can always push more commits before it's merged. Might make it easier to discuss inline?

bjarteskogoy commented 9 years ago

Any news here @trevorlinton ?

amaitland commented 8 years ago

Doesn't seem like anything is happening with this, so closing.

Anyone who wishes to discuss this further can jump on Gitter Chat to discuss ideas or tell us about the new PR you're working on :wink: