WebPlatformForEmbedded / WPEWebKit

WPE WebKit port (downstream)

[wpe-2.28][wpe-2.38] Excessive compositing when `-webkit-mask-image` used #1110

Closed Scony closed 1 year ago

Scony commented 1 year ago

In Disney+'s HTML application, which runs high-profile animations, an interesting scenario has been spotted. A simplified version is captured in this small demo: https://scony.github.io/stb-lab/expensive-compositing-3/index.html The issue is that the -webkit-mask-image property forces full-div compositing. In the worst case, that compositing is forced to change textures, e.g. due to size changes triggered by an animation (as in the demo above).

The result is as follows:

(...)
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 2704320
(...)
BitmapTexturePool::acquireTexture(1264x425), create: 1
LayerTreeHost::layerFlushTimerFired()
BitmapTexturePool::acquireTexture(1264x427), create: 1
LayerTreeHost::layerFlushTimerFired()
BitmapTexturePool::acquireTexture(1264x429), create: 1
LayerTreeHost::layerFlushTimerFired()
BitmapTexturePool::acquireTexture(1264x432), create: 1
(...)
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 38235488
(...)
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 62564960
(...)
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 93622656
(...)
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 106520840
(...)
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 106520840
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 106520840
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 105219848
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 69157800
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 44828328
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 13770632
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 1847936
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 947968
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 947968
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 947968
BitmapTexturePool::releaseUnusedTexturesTimerFired(), storedPixelsNumber: 947968

Log: expensive-compositing-3.log

As one can see, the above scenario leads to an explosion in the number of textures held in the BitmapTexturePool. If the textures are large, this in turn becomes a problem for GPU memory.
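To put the storedPixelsNumber values from the log in perspective, they can be converted to bytes. This is a rough sketch: the 4 bytes per pixel (e.g. an RGBA8888 format) is an assumption, not something the log states, and `pixels_to_mib` is a hypothetical helper.

```python
# Assumption: each stored pixel takes 4 bytes (e.g. RGBA8888).
# The log itself only reports pixel counts, not bytes.
BYTES_PER_PIXEL = 4


def pixels_to_mib(pixels, bytes_per_pixel=BYTES_PER_PIXEL):
    """Convert a pixel count into mebibytes of texture memory."""
    return pixels * bytes_per_pixel / (1024 * 1024)


# Values taken from the log above.
baseline = 947968      # pool size once the animation has settled
peak = 106520840       # highest storedPixelsNumber observed

print(f"baseline: {pixels_to_mib(baseline):.1f} MiB")
print(f"peak:     {pixels_to_mib(peak):.1f} MiB")
```

Under that assumption the peak corresponds to roughly 400 MiB held in the pool, against under 4 MiB once the animation has settled.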

As a simple workaround, the mechanism introduced in https://github.com/WebPlatformForEmbedded/WPEWebKit/pull/1108 can be used, but in general it should be re-assessed whether compositing should be forced in scenarios like the one above.

CC @emutavchi @magomez

magomez commented 1 year ago

The problem here is not the -webkit-mask-image property, but the banner_animation applied to the banner div.

Every time the banner div size changes, its content needs to be redrawn, which means allocating new tiles to cover the whole content, rendering to them, and then moving them to the compositor, where those tiles are uploaded to BitmapTextures. The BitmapTextures used for the previous tiles are freed and go to the pool. In the log you see a lot of textures with sizes 512x429, then 512x432, then 512x435, etc. (or 240x429, 240x432, 240x435, etc.). Those are the tiles used for the banner div. As the size of the element increases, the old tiles become too small with each step of the animation, so they can't be reused from the pool and remain there until they are freed.

Then there's the banner_content element. Due to the mask image, that one is backed by a single buffer with the size of the element. But as its size is the size of its container, which is banner, its size grows with the animation as well. As in the previous case, this means allocating a new buffer for each step of the animation, rendering the content, moving that content to a BitmapTexture, and releasing the previous BitmapTexture. As in the previous case, the released texture is too small to be reused for the same element, and there are no other components that require a texture of that size, so it stays in the pool until it's released. In the log, the textures used are the ones with sizes 1264x429, 1264x432, 1264x435, etc.
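The reuse failure described above can be illustrated with a toy model. This is only a sketch, not WebKit's BitmapTexturePool: the `Texture`, `TexturePool` and `simulate` names are hypothetical, the "reuse any free texture that is at least as large" rule is an assumed simplification, and the 60 fps frame rate and 2-pixel growth per frame are made-up inputs. Only the 3-second release delay and the 1264-pixel width come from this thread.

```python
from dataclasses import dataclass


@dataclass
class Texture:
    width: int
    height: int
    released_at: float = 0.0

    @property
    def pixels(self):
        return self.width * self.height


class TexturePool:
    """Toy pool (hypothetical): free textures linger until a delay expires."""

    def __init__(self, release_delay=3.0):
        self.release_delay = release_delay
        self.free = []      # textures currently not in use
        self.created = 0    # number of brand new allocations

    def acquire(self, width, height):
        # Assumed policy: reuse a free texture only if it is big enough.
        for index, texture in enumerate(self.free):
            if texture.width >= width and texture.height >= height:
                return self.free.pop(index)
        self.created += 1
        return Texture(width, height)

    def release(self, texture, now):
        texture.released_at = now
        self.free.append(texture)

    def release_unused(self, now):
        # Drop textures that have been idle for at least release_delay.
        self.free = [t for t in self.free
                     if now - t.released_at < self.release_delay]

    def stored_pixels(self):
        return sum(t.pixels for t in self.free)


def simulate(heights, width=1264, fps=60):
    """Allocate one layer-sized texture per frame, releasing the old one."""
    pool = TexturePool()
    current = None
    for frame, height in enumerate(heights):
        now = frame / fps
        texture = pool.acquire(width, height)
        if current is not None:
            pool.release(current, now)
        current = texture
    return pool


growing = simulate(range(425, 545, 2))   # element grows on each of 60 frames
constant = simulate([425] * 60)          # same size on every frame

print("growing: ", growing.created, "created,", len(growing.free), "pooled")
print("constant:", constant.created, "created,", len(constant.free), "pooled")
```

In this model the growing run creates a fresh texture on every frame and leaves all the previous ones sitting in the pool, while the constant-size run settles on two textures that alternate.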

The excessive memory consumption doesn't come from the composition. It comes from the cache of BitmapTextures that we have, which in this case is storing a lot of textures that are not used anymore. Because there's no limit on the pool size, and the textures take 3 seconds to be released, a lot of them accumulate in the pool while the animation keeps producing new ones. So the proper fix is limiting the texture pool somehow: either freeing textures faster or limiting the number of textures it can hold. The fix you added in https://github.com/WebPlatformForEmbedded/WPEWebKit/pull/1108 should be enough to mitigate this. But this doesn't require any change in how we perform the composition for animations.
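As an illustration of the "limit the amount it can hold" option, here is a minimal sketch of a pool with a pixel budget that evicts the oldest free textures first. It is not the change made in the pull request above, whose contents are not shown in this thread; `BoundedTexturePool`, `max_stored_pixels` and the chosen budget are all hypothetical.

```python
from collections import deque


class BoundedTexturePool:
    """Hypothetical pool that never stores more than a fixed pixel budget."""

    def __init__(self, max_stored_pixels):
        self.max_stored_pixels = max_stored_pixels
        self.free = deque()       # (width, height) entries, oldest first
        self.stored_pixels = 0

    def release(self, width, height):
        # Return a texture to the pool, then evict the oldest entries
        # until the pool fits its budget again.
        self.free.append((width, height))
        self.stored_pixels += width * height
        while self.stored_pixels > self.max_stored_pixels and self.free:
            old_width, old_height = self.free.popleft()
            self.stored_pixels -= old_width * old_height


# Assumed budget: four textures of 1264x720 pixels.
pool = BoundedTexturePool(max_stored_pixels=4 * 1264 * 720)

# A growing animation releases a slightly taller texture on every step.
for step in range(100):
    pool.release(1264, 425 + 2 * step)

print(len(pool.free), "textures pooled,", pool.stored_pixels, "pixels stored")
```

With a budget like this the pool stays bounded no matter how many differently sized textures the animation releases, at the cost of dropping textures that might otherwise have been reused.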

Scony commented 1 year ago

Well, technically I agree with everything you've written, but if compositing were disabled in the above case, we wouldn't even reach the point where we're allocating tons of BitmapTextures. Isn't it possible/reasonable to disable compositing in cases (like the one above) where we're allocating and invalidating textures that are pretty close to full-screen size?

magomez commented 1 year ago

Compositing can't be disabled. I guess what you mean is not giving the banner div and its children their own layer with a backingStore. In other words, when an animation requires constantly resizing and re-rendering the content of a component, don't give that component its own layer, which means rendering its contents to the parent's buffer (in the example, the root layer's backingStore). In this case the parent's backingStore is big enough to contain the full size of the element, so we wouldn't have to create and destroy tiles with different sizes; they would be reused from the pool (the root layer size is constant).

But then, for each step of the animation, we need to re-render the content of the banner div, the banner_content div and all the other components that are also rendering to the root layer. In the example there aren't many, but in a normal page there can be lots. And this rendering happens with cairo, on the CPU, and it's slow. The advantage of compositing is precisely that you only need to redraw the content of a single layer, and not all of the elements in the page that contribute to a given pixel.

Then, moving beyond the example, let's generalize a bit. In the example the parent layer of the banner div is the root layer. But what if it's not? What if its size is smaller than the animated element's? Or maybe it's bigger, but as the animated element grows, it becomes too small? Then we need to start reallocating and re-rendering the parent layer, getting into exactly the same problem we had, but now we are re-rendering the contents of all the elements that contribute to that parent layer instead of just the animated one. If you keep going up the hierarchy until you reach the root layer, then you have a layer that may be big enough for the animated element (or may not), but you will be rendering all the elements of the page that contribute to the damaged pixels every frame.

And that rendering done with cairo is the bottleneck of the whole process. The more you can avoid it, the better the performance will be.
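The trade-off can be made concrete with a rough cost model that counts the pixels the CPU has to repaint per animation frame. Everything here is a simplifying assumption: `Rect` and the two cost functions are hypothetical, the element rectangles are invented, and real painting cost depends on far more than pixel counts.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: int
    y: int
    w: int
    h: int

    def area(self):
        return self.w * self.h

    def intersect(self, other):
        """Return the overlapping rectangle, or an empty Rect if none."""
        x0, y0 = max(self.x, other.x), max(self.y, other.y)
        x1 = min(self.x + self.w, other.x + other.w)
        y1 = min(self.y + self.h, other.y + other.h)
        if x1 <= x0 or y1 <= y0:
            return Rect(0, 0, 0, 0)
        return Rect(x0, y0, x1 - x0, y1 - y0)


def repaint_cost_own_layer(animated):
    # With its own layer, only the animated element is repainted.
    return animated.area()


def repaint_cost_shared_layer(animated, others):
    # Without its own layer, every element that paints into the damaged
    # area of the shared buffer has to be repainted there as well.
    damage = animated
    return animated.area() + sum(o.intersect(damage).area() for o in others)


banner = Rect(0, 0, 1264, 425)
others = [
    Rect(0, 0, 1280, 720),      # hypothetical page background
    Rect(100, 100, 200, 200),   # hypothetical element under the banner
    Rect(0, 500, 300, 100),     # hypothetical element outside the damage
]

print("own layer:   ", repaint_cost_own_layer(banner), "pixels per frame")
print("shared layer:", repaint_cost_shared_layer(banner, others), "pixels per frame")
```

Even with only two overlapping elements the shared-layer cost in this model is about double, and it grows with every additional element that contributes to the damaged area.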

Summarizing: compositing is an optimization to reduce the amount of CPU rendering that needs to be done, because that's by far the most expensive task required to show the content. It comes at the expense of using more memory to store the intermediate buffers, and we know that. Anything that reduces composition to save memory has a big impact on performance, so that's not an option for us, especially in this case, where the problem is not that we're consuming too much memory for the composition but that we're caching too much, and the fix is just limiting the cache.

magomez commented 1 year ago

There's a better explanation of how this works and why at https://wpewebkit.org/blog/03-wpe-graphics-architecture.html in case you're interested.

Scony commented 1 year ago

Thanks, sounds like https://github.com/WebPlatformForEmbedded/WPEWebKit/pull/1108 should be enough then.