Headless capture under UI control

How do I allow the UI to control capture for headless apps?

I found this documentation from #1074, but there seems to be no way to just indicate frame boundaries (analogous to present) and allow the UI to control capture. Is there a way to do this?

No, there's no way to do this. If you need to capture from a headless application then the in-application API is the way to go.

Many headless applications will not have a regular frame boundary, and/or will have different workloads at different times, so it's more important to capture a specific region of work. The expectation is that in the large majority of cases where the normal capture doesn't work out of the box, manually triggering the capture will be the best or only option.

The default UI-triggered capture mechanism relies on having a window handle to track which device is active, so it would be a lot more work to provide enough information from the app through the API to get it working than to control the capture yourself with a little toggle or your own input. A wrapper function could be as simple as this:

RENDERDOC_API_1_0_0 *rdoc = NULL; // initialised elsewhere

void CustomFrameBoundary(bool capture)
{
  if(!rdoc) return;

  // assuming you only have one graphics device, NULL/NULL will work
  if(rdoc->IsFrameCapturing()) rdoc->EndFrameCapture(NULL, NULL);

  if(capture) rdoc->StartFrameCapture(NULL, NULL);
}

That's (more or less) what a call to Present() will do. The value passed as the bool capture parameter would come from a global that gets set whenever a trigger happens.
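To make the "global that gets set whenever a trigger happens" concrete, here is a minimal sketch of that pattern. The names CaptureApi, g_captureRequested, RequestCapture and FrameBoundary are hypothetical, and CaptureApi is only a stand-in for the three RENDERDOC_API_1_0_0 entry points used in the snippet above, so that the sketch compiles without renderdoc_app.h. In a real application you would pass the struct obtained from RENDERDOC_GetAPI instead.

```cpp
#include <atomic>

// Stand-in for the three RenderDoc API entry points used here (an assumption
// made so the sketch is self-contained; the real struct comes from
// renderdoc_app.h via RENDERDOC_GetAPI).
struct CaptureApi
{
  unsigned int (*IsFrameCapturing)();
  void (*StartFrameCapture)(void *device, void *window);
  unsigned int (*EndFrameCapture)(void *device, void *window);
};

// Hypothetical trigger flag. Atomic so another thread may set it safely.
std::atomic<bool> g_captureRequested{false};

// Call from wherever the application decides a capture is wanted.
void RequestCapture()
{
  g_captureRequested.store(true);
}

// Call once per unit of work, where a windowed app would call Present().
void FrameBoundary(CaptureApi *rdoc)
{
  if(!rdoc)
    return;

  // Close any capture that was started at the previous boundary.
  if(rdoc->IsFrameCapturing())
    rdoc->EndFrameCapture(nullptr, nullptr);

  // Consume the request so exactly one unit of work is captured per trigger.
  if(g_captureRequested.exchange(false))
    rdoc->StartFrameCapture(nullptr, nullptr);
}
```

With this shape, a request made at any point is picked up at the next boundary, and the capture covers the work between that boundary and the following one.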

@baldurk is there a plan to support headless capture automatically for the major VR SDKs? It'd be nice if they just worked out of the box even when running headless (I'm debugging one such OpenXR app which is what prompted this question in the first place). Failing that, it'd be really nice to have a super-simple in-application API which allows the UI to trigger the frame capture, something like:

    rdoc->FrameBoundary();

where the internal implementation is as you've described above, but the "capture" parameter comes from the UI.

I have no plans currently to support any VR SDKs unless capturing is actively broken/impossible using one of them without any explicit support. That's only true for Daydream that I know of - see #1008 - but even so that's very low priority. Oculus has added some support for their mobile platforms, I believe, but that's an outside contribution so I can't speak to what exactly it provides. AFAIK capture works out of the box though.

From my experience generally VR programs can render a regular preview window as part of their frame around the time they submit a frame to the compositor, which allows capturing the whole frame that way. I know of a number of people who capture VR programs fine without any need for changes, either with that or with something else.

I'm against adding a FrameBoundary API like that. The UI-managed capture process is oriented around actually having window handles, such that there is an active window that's presented regularly. As I mentioned above, faking all that information through a call to the RenderDoc API could end up as a leaky abstraction: I can't make the simplifying assumptions about what's happening that an application can, and it forever ties the internal implementation to how it's exposed in the API.

From the snippet above it's not complex for an application to use the RenderDoc API to capture, so I think it's better placed to put this responsibility on the application's side rather than trying to push it into the API. If you're able to use the UI to trigger a capture then you should equally be able to bind a button press or widget in your application to do the same, and as far as I can see that's all it gains you.
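For an application with no window or input loop at all, one possible way to get an external trigger is to poll for a file on disk once per frame. This is only a sketch under assumptions: the function name ConsumeTriggerFile and the idea of a trigger file are illustrative and are not part of RenderDoc.

```cpp
#include <cstdio>

// Returns true if the trigger file existed, deleting it so that each
// external request is consumed exactly once. std::remove returns 0 only
// when the file was actually removed.
bool ConsumeTriggerFile(const char *path)
{
  return std::remove(path) == 0;
}
```

The result can be passed straight to a helper like CustomFrameBoundary above, so that creating the file from a shell or script requests a capture of the next unit of work.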

I disagree about it being a leaky abstraction (FrameBoundary() as a unit of work is very concrete).

I think headless VR apps will become much more common (as will a lot of other application areas like deep learning or simulation), so something like a FrameBoundary() would make adding instrumentation a lot simpler and less intrusive. Also, not being able to use the UI to capture an instrumented app is a major usability pain-point IMO... my main loop might not have any event input at all. If you're really dead-set against it, how about a way to get the capture condition from the UI? I think that'd be a lot messier but at least I could drive experiments (e.g. repeatedly capturing a specific frame#) from a GUI rather than guessing when to hit a hotkey.

The leaky abstraction is because exposing enough of the internals about how captures are triggered from the UI would tie in the implementation to that exposed API. If I change how captures are triggered in future then that API existing adds a restriction onto any alternative implementation which has to continue to behave exactly the same. Exposing the trigger event from the UI is equivalent and doesn't change that problem.

It is not uncommon for headless applications to have no swapchain to count regular frames against, but to my knowledge there hasn't been a problem in using the RenderDoc API to control captures before. Even in automated processes which don't take process input directly, there can still be internal logic to trigger a capture in certain places.
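As an illustration of such internal logic, the sketch below picks a single frame index to capture from an environment variable, which makes it possible to capture the same frame repeatedly across runs. The variable name APP_CAPTURE_FRAME and both function names are hypothetical; nothing here is part of the RenderDoc API.

```cpp
#include <cerrno>
#include <cstdlib>

// Parses a non-negative decimal frame index. Returns false for null, empty,
// signed, partially numeric or out-of-range input, leaving *out untouched.
bool ParseFrameIndex(const char *text, unsigned long *out)
{
  // strtoul would accept leading whitespace and a sign, so require a digit.
  if(!text || *text < '0' || *text > '9')
    return false;

  char *end = nullptr;
  errno = 0;
  unsigned long value = std::strtoul(text, &end, 10);

  // Reject overflow and trailing garbage such as "12x".
  if(errno != 0 || *end != '\0')
    return false;

  *out = value;
  return true;
}

// Reads the target frame from the (hypothetical) APP_CAPTURE_FRAME variable.
bool TargetFrameFromEnv(unsigned long *out)
{
  return ParseFrameIndex(std::getenv("APP_CAPTURE_FRAME"), out);
}
```

In the main loop, the application would keep its own frame counter and start a capture when the counter matches the parsed target.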

I can't speak to every application's potential for inputs or controls, but I still feel like this is the wrong place to solve the problem and I see many more disadvantages than advantages to exposing something like this in the API. Even if you are in an unfortunate situation where your application has significant difficulty taking any kind of input from an external source, the inconvenience in your case doesn't outweigh the problems it would cause in other cases and in the implementation itself.

Frame-based capture inherently needs a frame boundary and exposing an ability to mark that point under direct API control is worth the maintenance IMO, especially when you consider what it buys you in terms of functionality: for a minimally intrusive change in your source code, you can use the really-nice RenderDoc GUI to get work done.

Instead of what could be this simple...

#define USE_RENDERDOC

#if defined(USE_RENDERDOC)
    #include "renderdoc/renderdoc_app.h"
#endif

int main(int argc, char* argv[]) {
        #if defined(USE_RENDERDOC)
            pRENDERDOC_FrameBoundary RENDERDOC_FrameBoundary{};
            #if defined(_WIN32)
                if (HMODULE mod = GetModuleHandleA("renderdoc.dll"))
                {
                    RENDERDOC_FrameBoundary = (pRENDERDOC_FrameBoundary)GetProcAddress(mod, "RENDERDOC_FrameBoundary");
                }
            #else
                if (void *mod = dlopen("librenderdoc.so", RTLD_NOW | RTLD_NOLOAD))
                {
                    RENDERDOC_FrameBoundary = (pRENDERDOC_FrameBoundary)dlsym(mod, "RENDERDOC_FrameBoundary");
                }
            #endif
        #endif

        // Init graphics

        while (true) {
            program->PollEvents(&requestExit);
            if (requestExit) {
                break;
            }

            program->RenderFrame();

            #if defined(USE_RENDERDOC)
                RENDERDOC_FrameBoundary();
            #endif
        }
}

...here's more or less what it takes to bolt in RenderDoc currently:

#define USE_RENDERDOC

#if defined(USE_RENDERDOC)
    #include "renderdoc/renderdoc_app.h"
    static RENDERDOC_API_1_1_2* rdoc_api;
#endif

int main(int argc, char* argv[]) {
        #if defined(USE_RENDERDOC)
            pRENDERDOC_GetAPI RENDERDOC_GetAPI{};
            #if defined(_WIN32)
                if (HMODULE mod = GetModuleHandleA("renderdoc.dll"))
                {
                    RENDERDOC_GetAPI = (pRENDERDOC_GetAPI)GetProcAddress(mod, "RENDERDOC_GetAPI");
                }
            #else
                if (void *mod = dlopen("librenderdoc.so", RTLD_NOW | RTLD_NOLOAD))
                {
                    RENDERDOC_GetAPI = (pRENDERDOC_GetAPI)dlsym(mod, "RENDERDOC_GetAPI");
                }
            #endif
            RENDERDOC_DevicePointer rdoc_device{};
            RENDERDOC_WindowHandle rdoc_window{};
            if (RENDERDOC_GetAPI)
            {
                if (RENDERDOC_GetAPI(eRENDERDOC_API_Version_1_1_2, (void **)&rdoc_api))
                {
                    int major = 0, minor = 0, patch = 0;
                    rdoc_api->GetAPIVersion(&major, &minor, &patch);
                    Log::Write(Log::Level::Info, "Detected RenderDoc API " +
                        std::to_string(major) + "." + std::to_string(minor) + "." + std::to_string(patch));
                }
                else
                {
                    Log::Write(Log::Level::Warning, "Failed to load RenderDoc API");
                }
            }
        #endif

        // Init graphics

        #if defined(USE_RENDERDOC)
            if (rdoc_api)
            {
                const XrBaseInStructure* binding = graphicsPlugin->GetGraphicsBinding();

                switch (binding->type)
                {
                case XR_TYPE_GRAPHICS_BINDING_VULKAN_KHR:
                    // The macro takes the VkInstance handle itself, not its address.
                    rdoc_device = RENDERDOC_DEVICEPOINTER_FROM_VKINSTANCE(((XrGraphicsBindingVulkanKHR*)binding)->instance);
                    break;
                #if defined(_WIN32)
                    case XR_TYPE_GRAPHICS_BINDING_OPENGL_WIN32_KHR:
                        rdoc_device = (RENDERDOC_DevicePointer)(((XrGraphicsBindingOpenGLWin32KHR*)binding)->hGLRC);
                        break;
                    case XR_TYPE_GRAPHICS_BINDING_D3D10_KHR:
                        rdoc_device = (RENDERDOC_DevicePointer)(((XrGraphicsBindingD3D10KHR*)binding)->device);
                        break;
                    case XR_TYPE_GRAPHICS_BINDING_D3D11_KHR:
                        rdoc_device = (RENDERDOC_DevicePointer)(((XrGraphicsBindingD3D11KHR*)binding)->device);
                        break;
                    case XR_TYPE_GRAPHICS_BINDING_D3D12_KHR:
                        rdoc_device = (RENDERDOC_DevicePointer)(((XrGraphicsBindingD3D12KHR*)binding)->device);
                        break;
                #endif // defined(_WIN32)
                default:
                    Log::Write(Log::Level::Info, "Unsupported renderdoc API");
                    break;
                }
                if (!rdoc_device)
                {
                    Log::Write(Log::Level::Error, "Failed to get renderdoc device");
                    rdoc_api->Shutdown();
                    rdoc_api = nullptr;
                }
                else
                {
                    Log::Write(Log::Level::Info, "Setting renderdoc device to " + std::to_string((uint64_t)rdoc_device));
                    rdoc_api->SetActiveWindow(rdoc_device, rdoc_window);
                }
            }
        #endif

        while (true) {
            program->PollEvents(&requestExit);
            if (requestExit) {
                break;
            }

            #if defined(USE_RENDERDOC)
                bool captureThisFrame = false;
                if (capturedFrame < captureFrame) {
                    ++capturedFrame;
                    captureThisFrame = true;
                }
                if (captureThisFrame && rdoc_api) {
                    Log::Write(Log::Level::Info, "Capturing frame " + std::to_string(capturedFrame));
                    rdoc_api->StartFrameCapture(rdoc_device, rdoc_window);
                }
            #endif

            program->RenderFrame();

            #if defined(USE_RENDERDOC)
                if (captureThisFrame && rdoc_api) {
                    rdoc_api->EndFrameCapture(rdoc_device, rdoc_window);
                    Log::Write(Log::Level::Info, "Captured frame " + std::to_string(capturedFrame));
                }
            #endif
        }

        #if defined(USE_RENDERDOC)
            if (rdoc_api)
                rdoc_api->Shutdown();
        #endif
}

That's not an apples-to-apples comparison. In the first example you've assumed a function directly exported and not going through the RenderDoc API at all, and in the second example a lot of the extra code is calling SetActiveWindow which is redundant and invalid because the window is NULL.

Please bear in mind that your use case is not the only one that I have to account for and I have to look at the bigger picture. What is simplest and best for you doesn't necessarily account for all factors.

The RenderDoc API is designed to be small and functional, exposing only functionality that cannot be implemented by the application. In this case a helper function of a few lines can implement the functionality you want in a way that keeps the application complexity on the application side of the API.

baldurk / renderdoc

Headless capture under UI control #1155
