Render Graphs - Working with Render Passes, Images, Framebuffers and Graphics Pipelines

mitchmindtree commented 5 years ago

I've been doing a lot of thinking on #224, which has led to a lot of thinking on #214, how to make subpasses work and how to simplify all these different components and the ways they interoperate. It's been tricky to keep track of all the subtle interactions and to come up with APIs that actually simplify all cases and not just some subset. I think this is because when thinking about individual use cases, we end up thinking about one small subset of what is really a large and potentially complex graph.

For example, when thinking of the simple vk_triangle.rs example, we have a very simple linear flow where we start with some geometry, plug it into a graphics pipeline and render it to the window's swapchain image. If we want to add MSAA, then we need to insert an intermediary multisampled image that the graphics pipeline can render to, which then gets resolved to the swapchain image. If we wanted to save the resolved image, we would need to add a second StorageImage output along with a new subpass that describes this step. If we picture these steps as building a graph, they might look like this:

vk_triangle.rs

geometry ---[graphics pipeline]---> swapchain image

vk_triangle.rs with MSAA

geometry ---[graphics pipeline]---> multisampled image ---[resolve]---> swapchain image

vk_triangle.rs with MSAA and saving to CPU accessible image

geometry ---[graphics pipeline]---> multisampled image ---[resolve]---> swapchain image
                                                       \
                                                        \
                                                         -[resolve]---> storage image

If we think of the geometry and images as Nodes and the processes that write or transform the data as Edges, the above might look like this:

Node ---[Edge]---> Node

Node ---[Edge]---> Node ---[Edge]---> Node

Node ---[Edge]---> Node ---[Edge]---> Node
                        \
                         \
                          -[Edge]---> Node

Perhaps it's worth thinking about the possibility of creating a RenderGraph or RenderDag type that reflects this way of thinking? I imagine the API would be similar to something like daggy where you can do things like dag.add_node(img), dag.add_edge(geom, img, graphics_pipeline), etc.
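As a rough sketch of what such an API might feel like, here is a minimal graph type with add_node and add_edge. Every name in it (RenderGraph, Node, Edge, NodeIndex, msaa_with_capture) is made up for illustration and is not an existing nannou, vulkano or daggy type; cycle detection is deliberately left out.

```rust
// Hypothetical sketch: none of these types exist in nannou, vulkano or daggy.

/// Handle to a node stored in the graph.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NodeIndex(usize);

/// The data that flows through the graph.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum Node {
    Geometry,
    Image { name: &'static str, samples: u32 },
}

/// A process that writes or transforms data.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Edge {
    GraphicsPipeline,
    Resolve,
}

#[derive(Debug, Default)]
pub struct RenderGraph {
    nodes: Vec<Node>,
    edges: Vec<(NodeIndex, NodeIndex, Edge)>,
}

impl RenderGraph {
    pub fn add_node(&mut self, node: Node) -> NodeIndex {
        self.nodes.push(node);
        NodeIndex(self.nodes.len() - 1)
    }

    /// Connects `src` to `dst`. Cycle detection is left out of this sketch.
    pub fn add_edge(&mut self, src: NodeIndex, dst: NodeIndex, edge: Edge) {
        assert!(src.0 < self.nodes.len() && dst.0 < self.nodes.len());
        self.edges.push((src, dst, edge));
    }

    /// The edges leaving `src`, paired with the nodes they write to.
    pub fn outputs(&self, src: NodeIndex) -> Vec<(Edge, &Node)> {
        self.edges
            .iter()
            .filter(|(s, _, _)| *s == src)
            .map(|(_, d, e)| (*e, &self.nodes[d.0]))
            .collect()
    }
}

/// Builds the "MSAA and saving to CPU accessible image" graph drawn above.
/// Returns the graph together with the index of the multisampled image.
pub fn msaa_with_capture() -> (RenderGraph, NodeIndex) {
    let mut graph = RenderGraph::default();
    let geom = graph.add_node(Node::Geometry);
    let msaa = graph.add_node(Node::Image { name: "multisampled", samples: 4 });
    let swapchain = graph.add_node(Node::Image { name: "swapchain", samples: 1 });
    let storage = graph.add_node(Node::Image { name: "storage", samples: 1 });
    graph.add_edge(geom, msaa, Edge::GraphicsPipeline);
    graph.add_edge(msaa, swapchain, Edge::Resolve);
    graph.add_edge(msaa, storage, Edge::Resolve);
    (graph, msaa)
}
```

The multisampled image ends up with two outgoing Resolve edges, which is the branch in the third diagram.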

Render Pass - Describing the Graph

In the Vulkan API, the Render Pass is used to describe this directed acyclic graph to the GPU driver. The driver implementation is then free to process branches in parallel/concurrently as long as the dependency order is maintained.

Vulkano's render pass API is extremely flexible in the sense that, as long as you implement the RenderPassDesc trait for a type and check for all the required invariants, that type may be used with a framebuffer as the render pass description. However, vulkano itself doesn't really provide any useful implementations of this trait out of the box.

Instead, it provides macros (single_pass_renderpass!, ordered_passes_renderpass!) that allow the user to generate types that implement RenderPassDesc at compile time. The benefit of this approach is that the generated types can be checked for correctness (to some extent) at compile time. The downside is that it's very difficult to set up more dynamic render passes that change at runtime, for example switching from Clearing the background to Loading the original contents of an image, or switching between multisampling and no multisampling. Using the vulkano macros, the only option for achieving this right now is to have multiple copies of your renderpass code with the subtle variations that you need and to switch between them as necessary at runtime. This quickly becomes impractical, as each additional independent on/off option doubles the number of copies required.

Maybe the best place to start with all this is to create a dynamic render pass description type that achieves the same role as the vulkano macro generated types but allows for this runtime flexibility, moving the correctness checks from compile time to runtime. I have a local branch that begins this process - I might open a PR soon to continue discussing it there.
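To make the runtime flexibility concrete, here is a minimal sketch of a description built from plain values rather than generated by a macro. The types (LoadOp, Attachment, Subpass, RuntimeDescription, Options) are simplified stand-ins invented for this example and are not the types on the local branch mentioned above; vulkano's real descriptions carry many more fields.

```rust
// Hypothetical, simplified types: vulkano's real attachment and pass
// descriptions carry more fields (format, store op, layouts, ...).

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum LoadOp {
    Clear,
    Load,
    DontCare,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Attachment {
    pub samples: u32,
    pub load: LoadOp,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Subpass {
    /// Index of the color attachment.
    pub color: usize,
    /// Index of the attachment to resolve into, if multisampling.
    pub resolve: Option<usize>,
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct RuntimeDescription {
    pub attachments: Vec<Attachment>,
    pub subpasses: Vec<Subpass>,
}

/// The options we would like to change at runtime.
#[derive(Clone, Copy, Debug)]
pub struct Options {
    pub load: LoadOp,
    pub msaa_samples: u32,
}

/// Builds a single-subpass description from runtime options.
pub fn describe(opts: Options) -> RuntimeDescription {
    if opts.msaa_samples > 1 {
        RuntimeDescription {
            attachments: vec![
                // Multisampled color attachment that the pipeline renders to.
                Attachment { samples: opts.msaa_samples, load: opts.load },
                // Resolve target; its old contents are overwritten.
                Attachment { samples: 1, load: LoadOp::DontCare },
            ],
            subpasses: vec![Subpass { color: 0, resolve: Some(1) }],
        }
    } else {
        RuntimeDescription {
            attachments: vec![Attachment { samples: 1, load: opts.load }],
            subpasses: vec![Subpass { color: 0, resolve: None }],
        }
    }
}
```

One function covers every combination of load op and sample count, so nothing needs to be duplicated per variation; the cost is that mistakes only surface when the description is validated at runtime.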

mitchmindtree commented 5 years ago

I've been doing a little more reading on the restrictions that render passes impose and I think it's likely we would end up with two kinds of graphs.

Graph 1. Render Pass Description

The render pass description allows for describing a series of subpasses that might form a DAG; however, these are limited in a few ways.

Limitations

The dimensions of each image attachment within a single render pass all must match.
An image attachment that is written to in one subpass may not be accessed "globally" by following subpasses - only the pixel at the same location may be accessed. This means we can't do things like blurs, glows or ambient occlusion within a single render pass. We can however do things like deferred lighting, MSAA, hue-shifting, etc. as these only need to read the pixel at the current location.

These limitations are what allow many renderers to operate far more efficiently: bounds checks may be omitted, tiled renderers can calculate optimal tiling sizes for all attachments beforehand, etc. It's therefore useful to do whatever processing you can in subpasses, rather than using a new render pass for every step.

Graph 2. "General" Render Graph

The idea of a more general render graph would be to allow for composing together multiple render passes in a more flexible fashion. This would allow us to more easily do things like blurs, ping-pong aka double-buffering, use the results of previous rendering stages as texture inputs for shaders in following stages, etc.

In other words, this "general" render graph (graph 2) would be a more "zoomed out" generalised graph whose nodes are themselves render pass descriptions (graph 1).

I might try and sketch out some ideas for the different kinds of Node and Edge variants that each of these two different graph types would contain to get a slightly clearer idea of how this would all map together in code.
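One possible shape for this nesting, purely as an assumption to make the idea concrete: graph 2 stores whole render passes as nodes, and each of those nodes carries its own small subpass graph (graph 1). All names here (RenderPassNode, PassEdge, RenderGraph, two_pass_blur) are hypothetical.

```rust
// Hypothetical sketch; every name here is invented for illustration.

pub type PassId = usize;
pub type ImageId = usize;

/// Graph 1: one render pass. Subpasses are the nodes and `dependencies`
/// holds (source subpass, destination subpass) edges.
#[derive(Debug)]
pub struct RenderPassNode {
    pub name: &'static str,
    pub subpass_count: usize,
    pub dependencies: Vec<(usize, usize)>,
}

/// An edge of graph 2: an image written by one pass is read by another.
#[derive(Debug, PartialEq, Eq)]
pub enum PassEdge {
    /// Sampled as a texture, so any pixel of the image may be read.
    SampledTexture(ImageId),
}

/// Graph 2: render passes as nodes, image hand-offs as edges.
#[derive(Debug, Default)]
pub struct RenderGraph {
    pub passes: Vec<RenderPassNode>,
    pub edges: Vec<(PassId, PassId, PassEdge)>,
}

impl RenderGraph {
    pub fn add_pass(&mut self, pass: RenderPassNode) -> PassId {
        self.passes.push(pass);
        self.passes.len() - 1
    }

    /// The passes whose output is read by `pass`.
    pub fn inputs_of(&self, pass: PassId) -> Vec<PassId> {
        self.edges
            .iter()
            .filter(|(_, dst, _)| *dst == pass)
            .map(|(src, _, _)| *src)
            .collect()
    }
}

/// A scene pass followed by a horizontal and a vertical blur pass.
pub fn two_pass_blur() -> RenderGraph {
    let mut graph = RenderGraph::default();
    // Geometry and lighting only touch the current pixel, so they can be
    // two subpasses of a single render pass.
    let scene = graph.add_pass(RenderPassNode {
        name: "scene",
        subpass_count: 2,
        dependencies: vec![(0, 1)],
    });
    // Blurs read neighbouring pixels, so each needs its own render pass.
    let blur_h = graph.add_pass(RenderPassNode {
        name: "blur_h",
        subpass_count: 1,
        dependencies: vec![],
    });
    let blur_v = graph.add_pass(RenderPassNode {
        name: "blur_v",
        subpass_count: 1,
        dependencies: vec![],
    });
    graph.edges.push((scene, blur_h, PassEdge::SampledTexture(0)));
    graph.edges.push((blur_h, blur_v, PassEdge::SampledTexture(1)));
    graph
}
```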

freesig commented 5 years ago

Just reading up on Vulkan subpasses and I think a DAG is a good way to represent this. Could it be as simple as each node being a PassDescription?

freesig commented 5 years ago

Edges might be the changes to the attachment, i.e. a move from preserve to color, as well as the PassDependencyDescription.

freesig commented 5 years ago

How would we share attachments between renderpasses? It looks like you have a framebuffer per renderpass. So would you share the attachments between framebuffers?

freesig commented 5 years ago

Another thought about this: if possible, we probably don't want to move all the checks to runtime if they don't have to be. Or at least encourage the renderpasses to be built in the model function so they fail at initialization of the program. Otherwise we might end up with nannou applications that panic at runtime when it might have been possible to catch the errors at compile time.

mitchmindtree commented 5 years ago

Yeah exactly, I'm imagining the PassDescriptions as nodes and PassDependencyDescriptions as edges.

So far I've got a gpu::render_pass::Description type that looks like this:

/// A dynamic representation of a render pass description.
///
/// `vulkano` itself provides a `RenderPassDesc` trait that allows for implementing custom render
/// pass description types. While `vulkano` provides the `single_pass_renderpass!` and
/// `ordered_passes_renderpass!` macros, these generate fixed types and do not allow for changing
/// individual values at runtime.
#[derive(Debug)]
pub struct Description {
    attachment_descriptions: Vec<AttachmentDescription>,
    subpass_descriptions: Vec<PassDescription>,
    dependency_descriptions: Vec<PassDependencyDescription>,
}

And a massive function that validates the set of descriptions like this:

/// Checks the validity of each of the given description lists.
pub fn validate_descriptions(
    attachment_descriptions: &[AttachmentDescription],
    subpass_descriptions: &[PassDescription],
    dependency_descriptions: &[PassDependencyDescription],
) -> Result<(), InvalidDescriptionError> {
    // ...
}

Where InvalidDescriptionError has a large list of variants of errors associated with specific invariants that should be met. This is what it looks like so far:

/// The error returned by `validate_descriptions`.
#[derive(Debug)]
pub enum InvalidDescriptionError {
    /// Two color/depth/stencil attachments within a single subpass had differing samples.
    InvalidSamples {
        subpass_idx: usize,
        attachment_a_idx: usize,
        attachment_a_samples: u32,
        attachment_b_idx: usize,
        attachment_b_samples: u32,
    },
    /// The subpass contained an invalid index to an attachment.
    SubpassInvalidAttachmentIndex {
        subpass_idx: usize,
        invalid_attachment_idx: usize,
    },
    /// Although the same attachment was referenced in both the `color_attachments`/`depth_stencil`
    /// field and `input_attachment` fields, their `ImageLayout`s differed.
    SubpassInvalidImageLayout {
        subpass_idx: usize,
        attachment_idx: usize,
        layout: ImageLayout,
        input_layout: ImageLayout,
    },
    /// A preserve attachment was found that was contained in one of the other members.
    SubpassInvalidPreserveAttachment {
        subpass_idx: usize,
        attachment_idx: usize,
    },
    /// A resolve attachment had a number of samples specified that was greater than one.
    SubpassInvalidResolveAttachmentSamples {
        subpass_idx: usize,
        attachment_idx: usize,
        attachment_samples: u32,
    },
    /// A color attachment had a `samples` value of 1 or 0 even though a resolve attachment was
    /// included.
    SubpassInvalidColorAttachmentSamples {
        subpass_idx: usize,
        attachment_idx: usize,
        attachment_samples: u32,
    },
    /// The subpass contained one or more resolve attachments and there was a mismatch between one
    /// of the resolve and color attachment formats.
    SubpassInvalidAttachmentFormat {
        subpass_idx: usize,
        attachment_a_idx: usize,
        attachment_a_format: Format,
        attachment_b_idx: usize,
        attachment_b_format: Format,
    },
}
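To give a feel for what a single check inside such a validation function could look like, here is a sketch of the sample-count invariant only. It does not reproduce the elided body of validate_descriptions: the inputs are reduced to plain slices, and the InvalidSamples struct and check_samples function are stand-ins that merely mirror the fields of the InvalidSamples variant above.

```rust
/// Stand-in mirroring the fields of the `InvalidSamples` variant.
#[derive(Debug, PartialEq, Eq)]
pub struct InvalidSamples {
    pub subpass_idx: usize,
    pub attachment_a_idx: usize,
    pub attachment_a_samples: u32,
    pub attachment_b_idx: usize,
    pub attachment_b_samples: u32,
}

/// Checks that all color/depth/stencil attachments used within each
/// subpass share one sample count.
///
/// Simplifying assumptions: `attachment_samples[i]` is the sample count of
/// attachment `i`, and `subpasses[s]` lists the attachment indices used as
/// color/depth/stencil in subpass `s`. Indices are assumed to be in bounds;
/// a fuller version would report an invalid index as its own error.
pub fn check_samples(
    attachment_samples: &[u32],
    subpasses: &[Vec<usize>],
) -> Result<(), InvalidSamples> {
    for (subpass_idx, used) in subpasses.iter().enumerate() {
        // The first attachment seen in this subpass sets the expectation.
        let mut first: Option<(usize, u32)> = None;
        for &idx in used {
            let samples = attachment_samples[idx];
            if let Some((a_idx, a_samples)) = first {
                if a_samples != samples {
                    return Err(InvalidSamples {
                        subpass_idx,
                        attachment_a_idx: a_idx,
                        attachment_a_samples: a_samples,
                        attachment_b_idx: idx,
                        attachment_b_samples: samples,
                    });
                }
            } else {
                first = Some((idx, samples));
            }
        }
    }
    Ok(())
}
```

Carrying the offending indices and values in the error keeps a runtime failure about as informative as a compile-time one would have been.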

As the PassDependencyDescription already describes the edges with source_subpass and destination_subpass indices, this Description type basically creates a graph as is. That said, one of the things we'll need to validate is that there isn't a cycle in the dependencies. This can be checked by attempting a toposort of the graph, and petgraph has a nice function which already does this. I think either we should see if we can implement the necessary graph traits for Description so that we can pass it to the is_cyclic function directly, or we just create a petgraph::Graph from the PassDependencyDescriptions and then pass that to the is_cyclic function (probably the easier approach).
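For reference, the toposort-based check is small enough to sketch with the standard library alone. This is not petgraph's implementation, just Kahn's algorithm over (source, destination) index pairs; the has_cycle name and the tuple representation are assumptions for this example.

```rust
/// Returns `true` if the dependencies between `subpass_count` subpasses
/// contain a cycle, using Kahn's topological sort.
///
/// Each dependency is a (source subpass, destination subpass) pair. Indices
/// are assumed to be in bounds (validated earlier); out-of-range indices
/// would panic here.
pub fn has_cycle(subpass_count: usize, deps: &[(usize, usize)]) -> bool {
    let mut in_degree = vec![0usize; subpass_count];
    let mut successors: Vec<Vec<usize>> = vec![Vec::new(); subpass_count];
    for &(src, dst) in deps {
        // Vulkan permits a subpass to depend on itself, so self-edges are
        // not treated as cycles in this sketch.
        if src == dst {
            continue;
        }
        successors[src].push(dst);
        in_degree[dst] += 1;
    }

    // Start from every subpass that has nothing left to wait for.
    let mut ready: Vec<usize> = (0..subpass_count)
        .filter(|&i| in_degree[i] == 0)
        .collect();
    let mut sorted = 0;
    while let Some(node) = ready.pop() {
        sorted += 1;
        for &next in &successors[node] {
            in_degree[next] -= 1;
            if in_degree[next] == 0 {
                ready.push(next);
            }
        }
    }

    // Any subpass that never became ready sits on a cycle.
    sorted != subpass_count
}
```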

> How would we share attachments between renderpasses? It looks like you have a framebuffer per renderpass. So would you share the attachments between framebuffers?

The render passes themselves just require a description of the attachment and don't yet work with specific attachments. Not until we build the framebuffer do we need the exact attachments that will be used. This is why when the swapchain and images need to be recreated due to a resize, we only need to recreate the framebuffer but not the renderpass. When the renderpass refers to attachments this always refers to an "image" of some kind (e.g. maybe a transient image attachment for multisampled colour then a swapchain image to resolve to).

Vulkano always returns created images behind an Arc (e.g. see AttachmentImage::new) so as a result it's easy enough to use the same image within multiple render passes by cloning the Arc. However, while the Arc makes it easy to share around a single image, it does make it difficult to replace or recreate an image when it is shared in multiple places, as we would have to keep track of everywhere that it's used and then replace each of those Arcs with our newly created one. When we create the more general render graph, it might be more useful to have a single collection containing each of the unique images and refer to them by an index or key of some sort - that way when we need to recreate the images (due to resize or something) we can just recreate the one instance and not have to replace the image throughout the graphs.
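A minimal sketch of the "single collection plus key" idea follows. Image here is a stand-in struct rather than a vulkano image type, and Images, ImageId and recreate_all are names invented for the example.

```rust
use std::sync::Arc;

/// Stand-in for a vulkano image type; only the dimensions matter here.
#[derive(Debug)]
pub struct Image {
    pub dimensions: [u32; 2],
}

/// Key handed out to graph nodes in place of an `Arc<Image>`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ImageId(usize);

/// The single owner of every unique image used by the graphs.
#[derive(Debug, Default)]
pub struct Images {
    images: Vec<Arc<Image>>,
}

impl Images {
    pub fn insert(&mut self, image: Arc<Image>) -> ImageId {
        self.images.push(image);
        ImageId(self.images.len() - 1)
    }

    pub fn get(&self, id: ImageId) -> Option<&Arc<Image>> {
        self.images.get(id.0)
    }

    /// Recreates every image at the new size, e.g. after a window resize.
    /// Keys held by graph nodes stay valid because only the slots change.
    pub fn recreate_all(&mut self, dimensions: [u32; 2]) {
        for slot in &mut self.images {
            *slot = Arc::new(Image { dimensions });
        }
    }
}
```

Nodes that stored an ImageId before the resize look up the new image through the same key afterwards, so nothing in the graph has to be patched.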

freesig commented 5 years ago

Yep looks good

> so that we can pass it to the is_cyclic function

You could also just assert that source_subpass_idx < dest_subpass_idx that way you couldn't create a cycle.

> When we create the more general render graph, it might be more useful to have a single collection containing each of the unique images and refer to them by an index or key of some sort.

This could work or I guess you could use an Arc<Mutex<AttachmentImage>> and then just update the AttachmentImage that they are all pointing at. I imagine if you are handing out an index to an array and swapping out the image under the index that you would still need a mutex anyway.

mitchmindtree commented 5 years ago

> You could also just assert that source_subpass_idx < dest_subpass_idx that way you couldn't create a cycle.

Yeah good point! I guess this would just mean that for render passes that aren't just linear and have a bunch of branches, users would always have to order their subpass descriptions in topological order which you kind of do intuitively anyways.

> I guess you could use an Arc<Mutex<AttachmentImage>>

The issue is that vulkano itself returns Arc<ImageType> from all of the different image constructors, and we can't really force a mutex in there. Using Arc<Mutex<Arc<AttachmentImage>>> is definitely an option though.

> I imagine if you are handing out an index to an array and swapping out the image under the index that you would still need a mutex anyway

By using indices, there would only be one actual collection (e.g. a Vec<Arc<Image>>) of all of the images and in turn only one owner, meaning that you only need &mut access to that Vec in order to swap the Arc<Image> out for a new one - I don't see why we'd need a Mutex in this case? On the other hand, if you're sharing the image via an Arc between all the nodes that refer to it, then that's when you would need a Mutex<Arc<Image>> to swap out the image, as Arc only provides immutable access to its inner type.
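For contrast, this is roughly what the shared-handle alternative looks like, where every node holds a clone of the same slot and replacing the image goes through a lock. Image is again a stand-in for a vulkano image type, and SharedImage, replace and current are names assumed for the example.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for a vulkano image type.
#[derive(Debug)]
pub struct Image {
    pub dimensions: [u32; 2],
}

/// A slot shared between nodes. The outer `Arc` shares the slot itself,
/// the `Mutex` allows swapping through a shared handle, and the inner
/// `Arc<Image>` is what an image constructor would hand back.
pub type SharedImage = Arc<Mutex<Arc<Image>>>;

/// Replaces the image seen by every holder of the slot.
pub fn replace(slot: &SharedImage, new_image: Arc<Image>) {
    *slot.lock().unwrap() = new_image;
}

/// Clones out the image currently in the slot.
pub fn current(slot: &SharedImage) -> Arc<Image> {
    slot.lock().unwrap().clone()
}
```

The lock is needed only because the slot has many holders; with a single owning collection the same swap is a plain assignment behind &mut.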

freesig commented 5 years ago

Hey I just saw this in the spec:

> srcSubpass must be less than or equal to dstSubpass, unless one of them is VK_SUBPASS_EXTERNAL, to avoid cyclic dependencies and ensure a valid execution order

So it might be best to assert srcSubpass <= dstSubpass anyway.
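A sketch of that check, returning an error rather than asserting so it could sit alongside the other validation errors. SubpassRef stands in for a subpass index or VK_SUBPASS_EXTERNAL; it and the OutOfOrderDependency type are hypothetical names, not part of vulkano or the Description type above.

```rust
/// Either a subpass index or a stand-in for `VK_SUBPASS_EXTERNAL`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SubpassRef {
    External,
    Index(usize),
}

/// Reported when a dependency points from a later subpass to an earlier one.
#[derive(Debug, PartialEq, Eq)]
pub struct OutOfOrderDependency {
    pub dependency_idx: usize,
    pub src: usize,
    pub dst: usize,
}

/// Enforces `srcSubpass <= dstSubpass` unless either side is external.
/// With this ordering in place, no cycle between distinct subpasses can
/// be expressed.
pub fn check_dependency_order(
    deps: &[(SubpassRef, SubpassRef)],
) -> Result<(), OutOfOrderDependency> {
    for (dependency_idx, &(src, dst)) in deps.iter().enumerate() {
        // Dependencies involving the external "subpass" are exempt.
        if let (SubpassRef::Index(src), SubpassRef::Index(dst)) = (src, dst) {
            if src > dst {
                return Err(OutOfOrderDependency { dependency_idx, src, dst });
            }
        }
    }
    Ok(())
}
```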

mitchmindtree commented 5 years ago

Ahh wicked, good find!

mitchmindtree commented 5 years ago

Just a note to check out rendy's render graph implementation before diving into this - the description is very close to what has been discussed in this issue already.

nannou-org / nannou