google / forma

An efficient vector-graphics renderer
Apache License 2.0

Is "Update only the tiles that change (currently CPU-only)" really desired for the GPU? #33

Open danielzgtg opened 1 year ago

danielzgtg commented 1 year ago

The README lists "Update only the tiles that change (currently CPU-only)". That suggests that it is being considered for the GPU and that such a port is planned. I question this because it goes against my entire mental model of GPU performance.

Supporting my doubt is the convention for modern rendering software to just repaint everything. Every 3D game redraws the whole world onto a new frame, throwing away the previous one. Desktop environments and windowing systems redraw the whole screen to support modern effects and decorations. With their introduction of compositing, their equivalent of "update only the tiles that change" called "damage" is gone, as every application gets its own buffer and modern software does not need to consider this anymore.

I read how modern GPUs are actually faster when data is not reused for partial repaints. One form of reuse slowness is drawing something and then discarding it; our "update only the tiles that change" would be equivalent to discarding most of the screen. Asahi Linux's GPU blog states that it is expensive to read framebuffer data back for reuse compared to rerendering and overwriting it. Then there is the problem of trying to render the next frame on the GPU right after the last one is finished without waiting for the CPU-GPU round-trip synchronization time. These all support the idea that GPUs like data pushed through without the latency of data dependencies, and that reuse only introduces dependencies that impair parallel processing and slow the GPU down.

What might save this idea is battery consumption. Firefox goes through the trouble of using private APIs on macOS so that it can reuse unchanged frame data. The GIFs in the README, however, are about gaming, so I don't know how much battery matters there compared to Firefox's text use case.

So I would like people to teach me: what are your considerations, viewpoints, and experiences regarding GPU framebuffer reuse?

dragostis commented 1 year ago

Thank you for the thoughtful writeup.

That suggests that it is being considered

I'm considering experimenting with this in the future, but we should first focus on making the GPU renderer run well on all important platforms.

"update only the tiles that change" called "damage"

Maybe it's a good idea to just call this damage. I expected the other name to be more beginner-friendly, but it seems the opposite is true.

I read how modern GPUs are actually faster when data is not reused for partial repaints

My intuition is that this would be true in our case as well, even though we're painting in compute shaders. Figuring out, while painting, whether or not a tile can be skipped translates to a decent number of jump instructions. These are expensive on CPUs and especially expensive on GPUs.

it is expensive to read framebuffer data back for reuse

forma does not actually need to do this in order to support damage regions. It keeps track of the previous framebuffer in a compact buffer. See the CPU implementation.
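To make the "compact buffer" idea concrete, here is a minimal sketch, not forma's actual code. It assumes each tile can be summarized by a 64-bit fingerprint of whatever covers it; `TileFingerprint`, `fingerprint`, and `DamageTracker` are hypothetical names invented for this illustration. The point is that only the small per-tile summary of the previous frame is kept, so no pixels are ever read back from the framebuffer.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical per-tile summary: a hash of the tile's drawing commands.
/// This is an assumption for illustration, not forma's data structure.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TileFingerprint(u64);

/// Hashes anything that describes a tile's content (commands, layer ids, ...).
fn fingerprint<T: Hash>(tile_commands: &[T]) -> TileFingerprint {
    let mut hasher = DefaultHasher::new();
    tile_commands.hash(&mut hasher);
    TileFingerprint(hasher.finish())
}

/// Stores one fingerprint per tile from the previous frame.
struct DamageTracker {
    previous: Vec<Option<TileFingerprint>>,
}

impl DamageTracker {
    fn new(tile_count: usize) -> Self {
        // `None` means "never painted", so the first frame damages every tile.
        Self { previous: vec![None; tile_count] }
    }

    /// Returns the indices of tiles whose fingerprint changed since the
    /// previous call and remembers the new fingerprints for the next frame.
    fn damaged_tiles(&mut self, current: &[TileFingerprint]) -> Vec<usize> {
        assert_eq!(current.len(), self.previous.len());
        let mut damaged = Vec::new();
        for (i, (&cur, prev)) in current.iter().zip(self.previous.iter_mut()).enumerate() {
            if *prev != Some(cur) {
                damaged.push(i);
                *prev = Some(cur);
            }
        }
        damaged
    }
}
```

Calling `damaged_tiles` twice with the same fingerprints returns every tile the first time and an empty list the second time. A hash can in principle collide, so a real implementation might compare the tile descriptions directly instead.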

without waiting for the CPU-GPU round-trip synchronization time

This is also something we're trying to do: forma has the explicit goal of being able to dispatch the entire GPU work in one go, so it doesn't need to re-synchronize with the CPU mid-frame.

What might save this idea is battery consumption.

Good intuition! This is exactly what I had in mind here: while trying to skip the tile might prove a bit slower than usual rendering, avoiding the extra blit of a large framebuffer might prove to be a great win in terms of energy usage.

A feature like this would ideally be benchmarked for both speed and battery efficiency, and the end user should be able to decide whether or not to enable damage regions. (This is also possible on the CPU.)
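A rough sketch of what such an opt-in switch and a speed measurement could look like follows. Everything here is hypothetical: `DamageMode`, `tiles_to_paint`, and `time_frames` are names made up for this example and are not part of forma's API. It only measures wall-clock time; battery efficiency would need platform-specific power counters that the standard library does not provide.

```rust
use std::time::{Duration, Instant};

/// Hypothetical user-facing switch, not an existing forma option.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DamageMode {
    /// Repaint every tile each frame.
    RepaintAll,
    /// Repaint only the tiles reported as damaged.
    DamagedOnly,
}

/// Chooses which tile indices to paint this frame for the selected mode.
fn tiles_to_paint(mode: DamageMode, tile_count: usize, damaged: &[usize]) -> Vec<usize> {
    match mode {
        DamageMode::RepaintAll => (0..tile_count).collect(),
        DamageMode::DamagedOnly => damaged.to_vec(),
    }
}

/// Runs `render_frame` `frames` times and returns the total wall-clock time.
fn time_frames<F: FnMut()>(frames: u32, mut render_frame: F) -> Duration {
    let start = Instant::now();
    for _ in 0..frames {
        render_frame();
    }
    start.elapsed()
}
```

Timing the same scene once per mode with `time_frames` gives the speed half of the comparison, and the user can then pick whichever `DamageMode` suits their workload.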