servo / webrender

A GPU-based renderer for the web
https://doc.servo.org/webrender/
Mozilla Public License 2.0

Rotated clip masks in MotionMark #2084

Open kvark opened 6 years ago

kvark commented 6 years ago

MotionMark's first test features hundreds of clipped rectangles, each rotated differently (also see #2083). Each one produces a separate clip mask to be rendered, which is inefficient.

It would be best to recognize the transformed property here and generate a single mask in local space that is shared between all the instances. This would require the transformed shaders to be aware of the space the mask is provided in.

[image: 01-rotated-mask]

glennw commented 6 years ago

Interesting idea! I wonder if the anti-aliasing would be affected in a bad way though by doing this?

glennw commented 6 years ago

Another thing to consider for this test case - we don't currently share these clip masks at all, even when they are identical. I wonder whether quantizing the parameters to device pixel amounts and sharing the masks would drop the number of masks quite significantly here?
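A minimal sketch of the sharing idea, assuming a hypothetical `ClipParams` description of a clip and a hypothetical `MaskCache`; none of these names are WebRender's actual types. Each parameter is snapped to a whole device pixel (and the angle to an assumed one-degree step) to form a hashable key, so clips that differ by less than one step resolve to the same mask. It does not model the subpixel offset of the mask origin, and it says nothing about whether the quantization is acceptable for anti-aliasing.

```rust
use std::collections::HashMap;

/// Hypothetical description of a rotated rounded-rect clip.
/// Illustrative only; not a WebRender type.
#[derive(Clone, Copy, Debug)]
pub struct ClipParams {
    pub width: f32,
    pub height: f32,
    pub radius: f32,
    pub angle_radians: f32,
}

/// Cache key: every parameter snapped to a discrete step so it can be hashed.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct MaskKey {
    width_px: i32,
    height_px: i32,
    radius_px: i32,
    angle_steps: i32,
}

/// Assumed angular resolution (one step per degree). A real choice would
/// depend on the mask size and on how much AA error is tolerable.
const ANGLE_STEPS_PER_TURN: i32 = 360;

impl MaskKey {
    pub fn from_params(p: &ClipParams, device_pixel_ratio: f32) -> Self {
        let turn = std::f32::consts::TAU;
        // Wrap the angle into [0, turn) before quantizing it.
        let wrapped = p.angle_radians.rem_euclid(turn);
        let steps = (wrapped / turn * ANGLE_STEPS_PER_TURN as f32).round() as i32;
        MaskKey {
            width_px: (p.width * device_pixel_ratio).round() as i32,
            height_px: (p.height * device_pixel_ratio).round() as i32,
            radius_px: (p.radius * device_pixel_ratio).round() as i32,
            // A value that rounds up to a full turn is the same as zero.
            angle_steps: steps % ANGLE_STEPS_PER_TURN,
        }
    }
}

/// Maps quantized keys to mask ids, allocating a new id only on a miss.
pub struct MaskCache {
    masks: HashMap<MaskKey, usize>,
}

impl MaskCache {
    pub fn new() -> Self {
        MaskCache { masks: HashMap::new() }
    }

    pub fn get_or_insert(&mut self, p: &ClipParams, device_pixel_ratio: f32) -> usize {
        let next_id = self.masks.len();
        *self
            .masks
            .entry(MaskKey::from_params(p, device_pixel_ratio))
            .or_insert(next_id)
    }

    /// Number of distinct masks that would actually need rendering.
    pub fn mask_count(&self) -> usize {
        self.masks.len()
    }
}
```

Comparing `mask_count()` against the number of clips submitted in a frame would give a rough answer to the question above for a given test case.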

pcwalton commented 6 years ago

Another possibility would be to recognize the special case of a solid color with a rounded clip and change the clip into a plain old border display item.

Ideally, to make this optimization better for real-world use cases (i.e. not just tailored to this benchmark) it might be nice to do it after primitive segmentation. This would allow it to, for example, eliminate the clips from large rounded flat colored buttons, which are common in flat design.

glennw commented 6 years ago

That's an interesting idea - but there's an even better solution I think - extend the mask shader (used above) to support a vertex color. Then, in this case we just draw a colored mask directly onto the surface and skip intermediate surfaces altogether.

kvark commented 6 years ago

After talking to @glennw I admit we can't have local clip masks (because of AA), and the last proposal (using clip_rectangle for rendering into the color framebuffer) seems like a reasonable hack :)

pcwalton commented 6 years ago

Clips cause similar problems in the bouncing gradient circles:

[image: problem]

glennw commented 6 years ago

@pcwalton I think (yet to be confirmed) that the test case above is also drawing a heap of redundant rectangular clip masks, which might explain part of the problem too.

pcwalton commented 6 years ago

Note that if the clip mask is a circle it can be reused regardless of rotation, because it's, well, a circle. Optimizing this might help that benchmark (but might not help real-world sites).
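As a rough sketch of that observation, assuming a hypothetical `RoundedRectClip` type (not WebRender's) whose radii are already clamped to the box: a clip is a full circle when its bounds are square and every corner radius equals half the side. For such clips the rotation can be dropped before deciding whether two masks are the same. This only covers pure rotation; skew or non-uniform scale would turn the circle into an ellipse.

```rust
/// Hypothetical rounded-rectangle clip. Illustrative only; not a WebRender type.
#[derive(Clone, Copy, Debug)]
pub struct RoundedRectClip {
    pub width: f32,
    pub height: f32,
    /// Corner radii as (horizontal, vertical) pairs: TL, TR, BR, BL.
    /// Assumed to be already clamped so they fit inside the box.
    pub radii: [(f32, f32); 4],
}

/// Assumed tolerance for float comparisons, in pixels.
const EPSILON: f32 = 1e-3;

impl RoundedRectClip {
    /// True when the clip is a full circle: square bounds and every
    /// corner radius equal to half the side length.
    pub fn is_circle(&self) -> bool {
        if (self.width - self.height).abs() > EPSILON {
            return false;
        }
        let half = self.width * 0.5;
        self.radii
            .iter()
            .all(|&(rx, ry)| (rx - half).abs() <= EPSILON && (ry - half).abs() <= EPSILON)
    }

    /// Rotation that matters when comparing masks: a circle looks the
    /// same at any angle, so its rotation collapses to zero.
    pub fn effective_rotation(&self, angle_radians: f32) -> f32 {
        if self.is_circle() {
            0.0
        } else {
            angle_radians
        }
    }
}
```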

Also, I checked to see whether the problem was the ellipse shader by commenting it out. This improved performance by 50% or so. So while optimizing the ellipse shader would certainly help, the test is still far too slow for the ellipse border shader to fully explain the issue. It might be needless rectangular clip mask generation, as you suggested.

glennw commented 6 years ago

I'll take a look at this test case using the fix mentioned here https://github.com/servo/webrender/issues/1648#issuecomment-346511425 and investigate from there.

glennw commented 6 years ago

OK, this test looks fairly reasonable with the removal of the redundant rectangular clip masks. Could still do with improvement but it's not bad now.

[image: circles]

glennw commented 6 years ago

The GPU usage sits at around 4-8 ms for me when just drawing the required clips.

pcwalton commented 6 years ago

I just retested. With 1000 circles I now get ~30 FPS in Servo+WR+slow style hack and 37 FPS in Chrome. Most of the GPU time is spent in clips, unsurprisingly. CPU time outweighs GPU time.

Ways we could make up the remaining difference:

  1. Only draw fully circular clips of the same size once. (Hacky and not very general, but will be an enormous win.)
  2. Have a fast path for circular clips.
  3. Tessellate clips.
  4. Fix whatever is causing the slow CPU usage.

I would guess that doing any one of these will make up the difference on its own.

glennw commented 6 years ago

@pcwalton Is that with https://github.com/servo/webrender/pull/2104 applied (it's not in Servo yet)? It might not make any difference here, but it may be quite significant. What CPU/GPU times are you seeing in WR?

pcwalton commented 6 years ago

@glennw Yes, it's with #2104 applied.

[image: Screenshot]

glennw commented 6 years ago

@pcwalton Cool, those numbers look reasonable-ish for that test. I'll do some tests on having a fast clip path for uniform radii, that's probably the next easy win there.
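To illustrate why uniform radii allow a cheaper path, here is a CPU-side sketch of the math a fragment shader might evaluate. It is not WebRender's shader code, and `uniform_radius`, `rounded_rect_distance` and `coverage` are hypothetical names. When all four corners share one circular radius, a single signed-distance expression covers the whole shape, with no per-corner ellipse evaluation.

```rust
/// Assumed tolerance for float comparisons, in pixels.
const RADIUS_EPSILON: f32 = 1e-3;

/// Returns the shared radius when all four corners are circular and equal,
/// or `None` when the general (per-corner ellipse) path would be needed.
pub fn uniform_radius(radii: &[(f32, f32); 4]) -> Option<f32> {
    let r = radii[0].0;
    let uniform = radii
        .iter()
        .all(|&(rx, ry)| (rx - r).abs() <= RADIUS_EPSILON && (ry - r).abs() <= RADIUS_EPSILON);
    if uniform {
        Some(r)
    } else {
        None
    }
}

/// Signed distance from a point (relative to the rect center) to the edge
/// of a rounded rect with the given half extents and one uniform radius.
/// Negative inside, zero on the edge, positive outside.
pub fn rounded_rect_distance(px: f32, py: f32, half_w: f32, half_h: f32, radius: f32) -> f32 {
    // Fold into one quadrant and measure against the inner (shrunken) box.
    let qx = px.abs() - (half_w - radius);
    let qy = py.abs() - (half_h - radius);
    let outside = (qx.max(0.0).powi(2) + qy.max(0.0).powi(2)).sqrt();
    let inside = qx.max(qy).min(0.0);
    outside + inside - radius
}

/// Converts a signed distance in pixels into mask coverage using an
/// assumed one-pixel-wide anti-aliasing ramp centered on the edge.
pub fn coverage(distance: f32) -> f32 {
    (0.5 - distance).clamp(0.0, 1.0)
}
```

A caller would try `uniform_radius` first and fall back to the general clip path when it returns `None`.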