troughton / Substrate

A cross-platform render-graph based rendering system written in Swift
MIT License

Questions and sample demo #2

Open OskarGroth opened 1 year ago

OskarGroth commented 1 year ago

Hi! First off, huge props for publishing this awesome library. It is one of the most complete (and up-to-date) real-world examples of a Metal rendering architecture that I've seen 👏

It is a shame that there is no simple example demo (the ImGui demo is out of date, and I'd like to see some different examples of passes). I wanted to try and see if I could implement the Xcode Metal Game starter project (rotating cube) using Substrate. This is what I got:

    func render(in drawable: CAMetalDrawable) {
        let pass = TestDrawPass(uniforms: uniforms, outputTexture: Texture(metalTexture: drawable.texture))
        renderGraph.addPass(pass)

        Task {
            await renderGraph.execute()
            drawable.present()
            inFlightSemaphore.signal()
            isRendering = false
        }
    }

    func draw(in view: MTKView) {
        guard !isRendering else { return }
        isRendering = true

//        autoreleasepool {
            guard let drawable = view.currentDrawable else { return }
            self.render(in: drawable)
//        }
    }

TestDrawPass is not implemented yet (its execute method is empty), but I have already run into an issue. CPU usage keeps climbing indefinitely until the app stalls:

[Screenshot 2022-09-23 at 20 04 10]

I tried to profile it, but the problem does not appear while profiling. This is on Xcode 14 Beta 5 on macOS Ventura.

Is there an issue with my setup so far? It is a bit unclear how to set up the texture for the CAMetalDrawable manually, without going through the AppFramework (which imposes too much in terms of view and window management).

Do you think you could share a few more examples of Render Passes for drawing a simple instanced mesh? A sample project with a demo renderer would be so helpful.

I'd also like to know if there is any way to use Substrate in a static mode, where the graph is rebuilt manually rather than on every frame. This would make it more viable for use cases where the resources and passes do not change very often.

troughton commented 1 year ago

Hi Oskar,

Thanks for the interest, and apologies for the lack of documentation. I'll try to go one by one on your questions.

I've just fixed up the ImGuiDemo, so that should now be working on the latest main.

In terms of your setup: just from looking at that code, I think that should work, and I'm not sure what's causing the CPU usage spike – the fact that it disappears with optimisations on is interesting. Regardless, it's not what I'd recommend. There are a couple of main issues;

Examples for drawing a simple instanced mesh: more rendering examples have been on my to-do list for a long time now, but I haven't gotten around to them due to a mixture of time constraints and design changes I want to make; I haven't got anything other than the ImGuiDemo that's really in a working shareable state. Probably the closest to what you're asking for is a debug draw pass; it might require some adjustments, but https://gist.github.com/troughton/7697e03345d06717de022a944ce54a1c is the render pass itself, https://gist.github.com/troughton/911ab0737a1e6707ef7f57d82558db5b is the DebugDraw library, and https://gist.github.com/troughton/7111a56ffe084bb8c1b796180656d3ce is the shader code.

I haven't considered a static mode for the reason that generally, things are changing between frames (otherwise why re-render), and attempting reuse doesn't tend to work out in practice because minor changes can cascade. The good news is that rebuilding the graph can be very inexpensive, so I haven't found rebuilding each frame to be an issue.

There is one caveat with that, and it links into the aforementioned design changes: the current implementation can run into performance issues because the render graph tracks more resources than it needs to each frame. For example, it currently tracks usages for every texture referenced within an argument buffer in a render pass just in case a later pass happens to write to that same texture, and that can become slow with bindless-style setups (even though the resource tracking itself has been heavily optimised). My plans for the next version of Substrate involve having the render graph do less work in that regard; it'll rely more on the user explicitly declaring resource dependencies for each pass, and the render pass execute methods will directly call into the underlying render API rather than being recorded into a command list and executed later. This is actually implemented on the immediate-mode branch, but I haven't yet had the chance to fully shift my code over to that design – it's likely I'll want to make further changes as I test it. I guess the summary of this is a warning: you can use the released version of Substrate and it should work well (it's being used in production on a soon-to-be-released Mac app), but there are likely significant source-incompatible design changes coming in the next major version.
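
To make the idea of explicitly declared per-pass dependencies concrete, here is a minimal toy model in plain Swift. It is not Substrate's API: `ToyPass`, `ToyRenderGraph`, `compile(output:)`, and `buildFrameGraph(debugDraw:)` are hypothetical names invented for this sketch. Each pass lists the resources it reads and writes, the graph is rebuilt from scratch for every frame, and a single backwards walk keeps only the passes that contribute to the requested output.

```swift
// Toy model only: these types are hypothetical and are not Substrate's API.
struct ToyPass {
    let name: String
    let reads: Set<String>   // names of resources this pass reads
    let writes: Set<String>  // names of resources this pass writes
}

struct ToyRenderGraph {
    private(set) var passes: [ToyPass] = []

    mutating func addPass(_ pass: ToyPass) {
        passes.append(pass)
    }

    /// Returns the names of the passes needed to produce `output`, in submission order.
    /// Passes whose writes are never consumed are dropped.
    func compile(output: String) -> [String] {
        var needed: Set<String> = [output]
        var kept: [String] = []
        // Walk backwards: keep a pass if it writes a resource that the output
        // or an already-kept later pass needs, then mark its reads as needed too.
        for pass in passes.reversed() {
            if !pass.writes.isDisjoint(with: needed) {
                needed.formUnion(pass.reads)
                kept.append(pass.name)
            }
        }
        return Array(kept.reversed())
    }
}

/// Rebuilds the whole graph for one frame; the set of passes may differ per frame.
func buildFrameGraph(debugDraw: Bool) -> ToyRenderGraph {
    var graph = ToyRenderGraph()
    graph.addPass(ToyPass(name: "gbuffer", reads: [], writes: ["albedo", "depth"]))
    graph.addPass(ToyPass(name: "unusedBlur", reads: ["albedo"], writes: ["blurred"]))
    graph.addPass(ToyPass(name: "lighting", reads: ["albedo", "depth"], writes: ["backbuffer"]))
    if debugDraw {
        graph.addPass(ToyPass(name: "debugDraw", reads: ["depth"], writes: ["backbuffer"]))
    }
    return graph
}

let frameOrder = buildFrameGraph(debugDraw: true).compile(output: "backbuffer")
print(frameOrder)
```

Because the declarations are just small value types, rebuilding the graph each frame costs one array of structs and one linear pass over it, which is the intuition behind per-frame rebuilds being inexpensive. A real render graph also has to handle resource lifetimes, barriers, and aliasing, none of which is modelled here.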

Hopefully that answers at least some of your questions; feel free to follow up if anything's unclear.

troughton commented 1 year ago

And as I’ve thought about it more: I think you have a timing issue with your drawable.present() call. When renderGraph.execute() is called, the command buffers will have been submitted to the Metal driver but may not have been scheduled and almost certainly won’t have completed. If you call renderGraph.execute().wait() it will wait until the render has completed on the GPU, which should mean it’s safe to call present; the GPU-to-CPU sync here will reduce throughput, though. Alternatively, you can use the method I described above to let the render graph handle calling present (it will call MTLCommandBuffer.present() in its implementation).
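
For reference, the scheduling behaviour described above can be sketched in plain Metal, independent of Substrate. The function name `renderAndPresent(to:using:)` is hypothetical; the Metal calls themselves (`makeCommandBuffer()`, `makeRenderCommandEncoder(descriptor:)`, `present(_:)`, `commit()`) are the standard API. The pass below only clears the drawable, and the key point is that the present is registered on the command buffer before it is committed, so no CPU-side wait is needed.

```swift
import Metal
import QuartzCore

/// Encodes a clear-only render pass into `drawable` and asks Metal to present it.
/// Returns false if the command buffer or encoder could not be created.
@discardableResult
func renderAndPresent(to drawable: CAMetalDrawable, using queue: MTLCommandQueue) -> Bool {
    guard let commandBuffer = queue.makeCommandBuffer() else { return false }

    // Describe a pass that clears the drawable's texture to opaque black.
    let descriptor = MTLRenderPassDescriptor()
    descriptor.colorAttachments[0].texture = drawable.texture
    descriptor.colorAttachments[0].loadAction = .clear
    descriptor.colorAttachments[0].storeAction = .store
    descriptor.colorAttachments[0].clearColor = MTLClearColor(red: 0, green: 0, blue: 0, alpha: 1)

    guard let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: descriptor) else {
        return false
    }
    // Real draw calls would be encoded here.
    encoder.endEncoding()

    // Register the present on the command buffer: Metal presents the drawable
    // once the command buffer has been scheduled, so the CPU does not block.
    commandBuffer.present(drawable)
    commandBuffer.commit()
    return true
}
```

The blocking alternative is to commit, call `commandBuffer.waitUntilCompleted()`, and only then call `drawable.present()`. That is safe as well, but it stalls the CPU on the GPU every frame, which is the throughput cost mentioned above.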