NVIDIAGameWorks / NRDSample


Implementing NRI/NRD command context #22

Open StudenteChamp2 opened 2 months ago

StudenteChamp2 commented 2 months ago

I am adding real-time denoising support to my CPU (yes, CPU) path tracer. I already have OptiX denoising running, and it works like a charm. Now I need a solution for AMD GPUs, so I chose NRD. I have a budget of 0.5 s per frame for denoising.

How do I properly execute an NRI/NRD command context? Basically, I have a native DX12 command context:

    struct CommandContext
    {
        IDXGIFactory4* factory;
        ID3D12Device* device;
        ID3D12CommandAllocator* commandAllocator;
        ID3D12GraphicsCommandList* commandList;
        ID3D12CommandQueue* commandQueue;
        ID3D12Fence* fence;
        HANDLE fenceEvent;
        UINT64 fenceValue;
    } m_cmdContext;

I use that native context to create an NRI/NRD command context:

    void NvidiaNRDDenoiser::createNRDContext(nbUint32 imageWidth, nbUint32 imageHeight)
    {
        m_NRD = new NrdIntegration(1, false, "NvidiaNRDDenoiser NrdIntegration");

        //=======================================================================================================
        // INITIALIZATION - WRAP NATIVE DEVICE
        //=======================================================================================================

        // Wrap the device
        nri::DeviceCreationD3D12Desc deviceDesc = {};
        deviceDesc.d3d12Device = m_cmdContext.device;
        deviceDesc.d3d12GraphicsQueue = m_cmdContext.commandQueue;

        #if defined (_DEBUG)
        deviceDesc.enableNRIValidation = true;
        #else
        deviceDesc.enableNRIValidation = false;
        #endif
        nri::Result nriResult = nri::nriCreateDeviceFromD3D12Device(deviceDesc, m_nriDevice);

        // Get core functionality
        nriResult = nri::nriGetInterface(*m_nriDevice,
            NRI_INTERFACE(nri::CoreInterface), (nri::CoreInterface*)&m_NRI);

        nriResult = nri::nriGetInterface(*m_nriDevice,
            NRI_INTERFACE(nri::HelperInterface), (nri::HelperInterface*)&m_NRI);

        // Get appropriate "wrapper" extension (XXX - can be D3D11, D3D12 or VULKAN)
        nriResult = nri::nriGetInterface(*m_nriDevice,
            NRI_INTERFACE(nri::WrapperD3D12Interface), (nri::WrapperD3D12Interface*)&m_NRI);

        //=======================================================================================================
        // INITIALIZATION - INITIALIZE NRD
        //=======================================================================================================

        const nrd::DenoiserDesc denoiserDescs[] =
        {
            { NRD_ID(REBLUR_DIFFUSE_SPECULAR), nrd::Denoiser::REBLUR_DIFFUSE_SPECULAR },
        };

        nrd::InstanceCreationDesc instanceCreationDesc = {};
        instanceCreationDesc.denoisers = denoiserDescs;
        instanceCreationDesc.denoisersNum = _countof(denoiserDescs);

        // NRD itself is flexible and supports any kind of dynamic resolution scaling, but NRD INTEGRATION pre-
        // allocates resources with statically defined dimensions. DRS is only supported by adjusting the viewport
        // via "CommonSettings::rectSize"
        bool result = m_NRD->Initialize((uint16_t)imageWidth, (uint16_t)imageHeight, instanceCreationDesc, *m_nriDevice, m_NRI, m_NRI);
        ASSERT(result);

        //=======================================================================================================
        // INITIALIZATION or RENDER - WRAP NATIVE POINTERS
        //=======================================================================================================

        // Wrap the command buffer
        nri::CommandBufferD3D12Desc commandBufferDesc = {};
        commandBufferDesc.d3d12CommandList = m_cmdContext.commandList;

        // Not needed for NRD integration layer, but needed for NRI validation layer
        commandBufferDesc.d3d12CommandAllocator = m_cmdContext.commandAllocator;

        m_NRI.CreateCommandBufferD3D12(*m_nriDevice, commandBufferDesc, m_nriCommandBuffer);
    }

Now I need to implement denoising. Execution should look like this:

    startCommandRecording()
    PerformDenoising()
    endCommandRecording()
    WaitForGPU()

So I need to implement startCommandRecording, endCommandRecording, and WaitForGPU for NRI.

With my native DX12 context it would look like this:

    void NvidiaNRDDenoiser::startCommandRecording()
    {
        HRESULT hr = m_cmdContext.commandAllocator->Reset();
        ASSERT(SUCCEEDED(hr));

        hr = m_cmdContext.commandList->Reset(m_cmdContext.commandAllocator, nullptr);
        ASSERT(SUCCEEDED(hr));
    }

    void NvidiaNRDDenoiser::endCommandRecording()
    {
        HRESULT hr = m_cmdContext.commandList->Close();
        ASSERT(SUCCEEDED(hr));

        // Execute the command lists
        ID3D12CommandList* ppCommandLists[] = { m_cmdContext.commandList };
        m_cmdContext.commandQueue->ExecuteCommandLists(_countof(ppCommandLists), ppCommandLists);
    }

    void NvidiaNRDDenoiser::WaitForGPU()
    {
        const UINT64 fence = m_cmdContext.fenceValue;
        HRESULT hr = m_cmdContext.commandQueue->Signal(m_cmdContext.fence, fence);
        ASSERT(SUCCEEDED(hr));

        // Wait until the GPU has finished with the command queue.
        if (m_cmdContext.fence->GetCompletedValue() < fence)
        {
            hr = m_cmdContext.fence->SetEventOnCompletion(fence, m_cmdContext.fenceEvent);
            ASSERT(SUCCEEDED(hr));

            WaitForSingleObject(m_cmdContext.fenceEvent, INFINITE);
        }

        ++m_cmdContext.fenceValue;
    }

How do I implement these three for NRI?

Thanks for helping!

dzhdanNV commented 2 months ago

If you wrap d3d12CommandList, you get access to BeginCommandBuffer/EndCommandBuffer from the NRI Core interface. You can additionally wrap the D3D12 command queue and get access to QueueSubmit from the NRI Core interface and WaitForIdle from the NRI Helper interface. But I don't think that's needed: just use your native code. NRI is needed only to wrap d3d12CommandList and invoke Denoise.
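To make that ordering concrete, here is a minimal sketch of one denoising frame. It does not call D3D12, NRI, or NRD at all: `StubBackend` and `denoiseFrame` are hypothetical stand-ins that only record the order of the steps, so the snippet compiles anywhere. In the real code, each stub method would be replaced by the native functions from the question, with the NRD integration's Denoise call recorded into the wrapped command list in between.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the native DX12 context plus the NRD integration.
// It performs no GPU work; it only logs which step ran, and in what order.
struct StubBackend {
    std::vector<std::string> calls;
    std::uint64_t nextFenceValue = 1;       // next value to signal on the queue
    std::uint64_t completedFenceValue = 0;  // last value the "GPU" has reached

    // Native: commandAllocator->Reset() + commandList->Reset(...)
    void startCommandRecording() { calls.push_back("reset allocator + list"); }

    // NRD integration: Denoise recorded into the wrapped command list
    void denoise() { calls.push_back("NRD Denoise"); }

    // Native: commandList->Close() + commandQueue->ExecuteCommandLists(...)
    void endCommandRecording() { calls.push_back("close + execute"); }

    // Native: commandQueue->Signal(...) + wait on the fence event
    void waitForGpu() {
        const std::uint64_t target = nextFenceValue++;
        calls.push_back("signal fence");
        completedFenceValue = target;  // the stub "GPU" finishes instantly
        calls.push_back("wait fence");
    }
};

// One frame: native begin, NRD dispatches, native end, native wait.
template <typename Backend>
void denoiseFrame(Backend& backend) {
    backend.startCommandRecording();
    backend.denoise();
    backend.endCommandRecording();
    backend.waitForGpu();
}
```

Calling `denoiseFrame` on a `StubBackend` once per frame logs the steps in the order above. The point of the design is that NRI appears only inside `denoise()`; command recording, submission, and synchronization stay in the native code.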

StudenteChamp2 commented 2 months ago

Will give it a try ASAP, thank you :)