WebRender is currently getting support for the Pixel Local Storage GL extension. The idea is to take advantage of the very fast on-chip tile storage on mobile (and future Intel) devices. The Vulkan way of doing this would be with sub-passes.
A WebRender frame consists of a number of passes, each with multiple layers of work. For the driver to figure out the best tile memory allocation and lifetime, it needs to know about all the dependent passes. So ideally, we'd just provide each slice of each participating texture as a framebuffer attachment, and the rendering to each slice would be turned into a separate sub-pass. We may very well end up with a hundred-ish attachments/sub-passes, hence the "uber-pass" name.
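To make the mapping concrete, here is a minimal sketch of how the per-slice work could be flattened into one uber-pass description. It is only an illustration: `SliceId`, `SubPass`, and `UberPass` are hypothetical names, not WebRender or gfx-rs types, and the sketch assumes each piece of work writes to exactly one slice.

```rust
use std::collections::HashMap;

/// Hypothetical identifier for one slice (layer) of a render target texture.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct SliceId {
    texture: u32,
    layer: u32,
}

/// One sub-pass renders into exactly one attachment (assumption of this sketch).
#[derive(Debug)]
struct SubPass {
    /// Index into `UberPass::attachments`.
    color_attachment: usize,
}

/// Hypothetical description of the whole frame as a single render pass.
#[derive(Debug, Default)]
struct UberPass {
    attachments: Vec<SliceId>,
    subpasses: Vec<SubPass>,
}

impl UberPass {
    /// `targets` lists the output slice of every piece of work, in submission order.
    /// Each distinct slice becomes one attachment; each entry becomes one sub-pass.
    fn build(targets: &[SliceId]) -> Self {
        let mut index_of: HashMap<SliceId, usize> = HashMap::new();
        let mut pass = UberPass::default();
        for &slice in targets {
            let next = pass.attachments.len();
            // Reuse the attachment index if this slice was already registered.
            let index = *index_of.entry(slice).or_insert(next);
            if index == next {
                pass.attachments.push(slice);
            }
            pass.subpasses.push(SubPass { color_attachment: index });
        }
        pass
    }
}
```

With a hundred-ish slices per frame, `attachments` and `subpasses` would each hold on the order of a hundred entries, which is the scale the "uber-pass" name refers to.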
All of the intermediate render targets just need to be turned into transient attachments, since their lifetime is completely contained in the uber-pass. What is important for WebRender is to properly communicate the dependencies: we'll need to specify a flag to gfx-rs when an input attachment is sampled at precisely the same coordinate as the output. That would allow the driver to keep the intermediate value in the tile cache. Marking those dependencies, as well as re-arranging the work to create more of them, could be done incrementally after the main uber-pass is introduced.
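The dependency marking could be modelled roughly as below. This is a sketch under assumptions: `ReadKind`, `Dependency`, and `can_stay_on_tile` are hypothetical names for illustration, not the gfx-rs API. In Vulkan terms, the same-coordinate case corresponds to reading an input attachment under a sub-pass dependency marked with `VK_DEPENDENCY_BY_REGION_BIT`.

```rust
/// How a sub-pass reads an attachment produced by an earlier sub-pass.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ReadKind {
    /// Read at exactly the coordinate being written: the value can stay in tile memory.
    SamePixel,
    /// Read at arbitrary coordinates (e.g. a blur kernel): needs the full image.
    Arbitrary,
}

/// Hypothetical edge between two sub-passes of the uber-pass.
#[derive(Clone, Copy, Debug)]
struct Dependency {
    producer: usize,
    consumer: usize,
    attachment: usize,
    read: ReadKind,
}

/// An attachment is a candidate for staying on-tile only if every read of it
/// is a same-pixel read. An attachment that is never read trivially qualifies.
fn can_stay_on_tile(attachment: usize, deps: &[Dependency]) -> bool {
    deps.iter()
        .filter(|d| d.attachment == attachment)
        .all(|d| d.read == ReadKind::SamePixel)
}
```

Starting with every dependency marked `Arbitrary` would be the conservative default; individual edges could then be switched to `SamePixel` incrementally as the work is re-arranged.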
There don't appear to be any limits in Vulkan/gfx-rs preventing us from doing this, and the driver can always fall back to flushing each individual sub-pass's results to VRAM, as it would on current desktop GPUs anyway.