GPUOpen-LibrariesAndSDKs / RadeonImageFilter

https://gpuopen.com/radeon-prorender-suite/

Quality degradation of AI denoising when albedo/normals/depth are backed by D3D12 resources #15

Open BartSiwek opened 2 years ago

BartSiwek commented 2 years ago

While trying to integrate the RIF AI denoiser into an existing D3D12 pipeline I stumbled upon an issue: if the albedo/normals/depth RIF images are backed by a D3D12 resource (created using rifContextCreateImageFromDirectX12Memory), the resulting image degrades in quality. At the same time, if the RIF images are not resource backed (created using rifContextCreateImage) and the same content is copied into them via rifImageMap/rifImageUnmap, the quality is good.
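For reference, a simplified sketch of the two creation paths being compared (placeholder names such as `context`, `desc`, `d3d12Resource` and `hostData`, no error handling, and the RIF_IMAGE_MAP_WRITE flag and exact signatures should be checked against the RIF headers):

    // Path A: wrap the existing D3D12 texture resource directly
    // (this is the path that shows the acne-like artifacts for albedo/normals/depth).
    rif_image dxBackedImage = nullptr;
    rifContextCreateImageFromDirectX12Memory( context, &desc, d3d12Resource, &dxBackedImage );

    // Path B: plain RIF image, filled from a host-side copy of the same data
    // (this path produces the good-quality result).
    rif_image plainImage = nullptr;
    rifContextCreateImage( context, &desc, nullptr, &plainImage );

    void* mapped = nullptr;
    rifImageMap( plainImage, RIF_IMAGE_MAP_WRITE, &mapped );
    // Assuming tightly packed RGBA float32 rows.
    memcpy( mapped, hostData, size_t( desc.image_width ) * desc.image_height * desc.num_components * sizeof( float ) );
    rifImageUnmap( plainImage, mapped );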

Example of good quality output (attached image: "good quality").

Example of bad quality output (attached image: "bad quality"; note the acne-like artifacts on the bottom part of the ball and close to the plane edges).

Notes:

- This issue does not apply to the "colorImg" filter parameter - as far as I noticed there is no quality degradation if this RIF image is D3D12 resource backed.
- This issue does not apply to the filter output - as far as I noticed there is no quality degradation if this RIF image is D3D12 resource backed.

givinar commented 2 years ago

Hi @BartSiwek, thanks. We'll solve this problem.

givinar commented 2 years ago

Hi @BartSiwek, I'm unable to reproduce the problem at the moment. Could you clarify a few points? First of all, how do you get the resource memory pointer for albedo/normal/depth? I mean, are you using the DX12 capability (ID3D12Resource::Map/Unmap) or the RIF methods (rifImageMap/rifImageUnmap)? Also, I am afraid that I don't understand some of the points regarding "colorImg" and the filter output.

Does this mean that to reproduce the problem, "colorImg" and filter output images must be created using the RIF API method (rifContextCreateImage)?

BartSiwek commented 2 years ago

Hi @givinar

In an ideal setup I never need to get the resource memory pointer for any of the inputs. I have D3D12 texture resources for all of them and I create the corresponding rifImage using rifContextCreateImageFromDirectX12Memory. Doing so, I noticed the acne-like artifacts.

In the process of discovering what is causing the artifacts, I switched to reading the D3D12 resources back from the device to host memory using ID3D12Resource::Map/Unmap, creating the rifImages using rifContextCreateImage, and providing them with data via rifImageMap/rifImageUnmap. In this scenario the acne does not appear.

My remark "This issue does not apply to the "colorImg" filter parameter - as far as I noticed there in no quality degradation if this RIF image is D3D12 resource backed" - means that the rigImage I provide as the colorImg parameter can be either created via the rifContextCreateImageFromDirectX12Memory or via the rifContextCreateImageFilter followed by the rifImageMap/rifImageUnmap and that does not affect the output. This is not the case for albedo/depth/normals.

My remark "This issue does not apply to the filter output - as far as I noticed there in no quality degradation if this RIF image is D3D12 resource backed" - means that the rifImage used for the output can be either created with rifContextCreateImageFromDirectX12Memory or rifContextCreateImage and that does not affect the output.

Hope this helps.

givinar commented 2 years ago

Hi @BartSiwek, may I ask you to send me some traces? To do this, you need to set the environment variables RIF_TRACING_ENABLE to 1 and RIF_TRACING_PATH to ${SAVE_PATH} (see the attached screenshot).

After that, just run the filter twice: first the bad-quality case (rifContextCreateImageFromDirectX12Memory), then the good-quality case (rifContextCreateImage). You should get two folders in ${SAVE_PATH}. Could you send them to me? Thanks!
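If changing the system environment settings is inconvenient, the variables can also be set programmatically for the process; a minimal sketch assuming Windows and the MSVC CRT (the path is just an example stand-in for ${SAVE_PATH}):

    #include <cstdlib>

    // Set the tracing variables for this process only, before any RIF calls are made.
    _putenv_s( "RIF_TRACING_ENABLE", "1" );
    _putenv_s( "RIF_TRACING_PATH", "C:\\rif-traces" );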

BartSiwek commented 2 years ago

Hi,

Sorry for the wait - it took me some time to dust off the code. I have the zip files with the traces ready, but each one is about 70 MB. Where do you want me to send them?

Additionally - I noticed a change in behavior depending on whether RIF_TRACING_ENABLE is set to 1, in the scenario where the RIF images are backed by D3D12. Without RIF_TRACING_ENABLE set at all, or with it set to 0, my code runs fine. When I set RIF_TRACING_ENABLE=1 I suddenly got an error from the D3D12 debug layer stating that a transition from the UAV state was invalid, since that is not the resource's current state. This was easy to fix on my side and it did not change the resulting image, but it seems you are doing different things under the hood depending on whether that flag is set.
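For completeness, the kind of explicit state transition involved looks roughly like this (a simplified sketch with placeholder names such as `currentState` and `commandList`; not necessarily the exact change I made):

    // Put the texture into the UAV state before handing it to RIF, so a later
    // transition out of D3D12_RESOURCE_STATE_UNORDERED_ACCESS matches reality.
    D3D12_RESOURCE_BARRIER barrier = {};
    barrier.Type                   = D3D12_RESOURCE_BARRIER_TYPE_TRANSITION;
    barrier.Transition.pResource   = d3d12res;
    barrier.Transition.Subresource = D3D12_RESOURCE_BARRIER_ALL_SUBRESOURCES;
    barrier.Transition.StateBefore = currentState;   // whatever state the resource was last left in
    barrier.Transition.StateAfter  = D3D12_RESOURCE_STATE_UNORDERED_ACCESS;
    commandList->ResourceBarrier( 1, &barrier );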

givinar commented 2 years ago

Hi, we can change the backend depending on the API. This can cause different behaviors and different results; that's why I asked you to send the traces. Could you upload them to Google Drive? https://drive.google.com/drive/folders/115GuyUCUiTiJPzlrl-luOHpx5Lc1vtji?usp=sharing Thanks!

BartSiwek commented 2 years ago

Hi,

I uploaded the traces.

So what you are saying is that you switch back-ends based on the value of RIF_TRACING_ENABLE? In my case I did not change the code: the D3D12 debug layer gave me an error when RIF_TRACING_ENABLE=1 and did not when RIF_TRACING_ENABLE=0.

givinar commented 2 years ago

Hi, thanks for the traces. I meant that the backend might change depending on the API used: rifContextCreateImage or rifContextCreateImageFromDirectX12Memory. Enabling tracing does not affect this.

givinar commented 2 years ago

Hi @BartSiwek, I found some issues in the traces.

1) When you create an image description (rif_image_desc), you are not initializing it properly. Row pitch and slice pitch are random values. You can try something like this:

    rif_image_desc inputDesc = { 0 };
    inputDesc.image_width = 1962;
    inputDesc.image_height = 1270;
    inputDesc.num_components = 4;
    inputDesc.type = RIF_COMPONENT_TYPE_FLOAT32;

2) The rifSyncronizeQueue call is redundant.

3) Make sure normals and depth are normalized. For this we use additional filters (a sketch of how these remap filters could be created is at the end of this comment):

    ASSERT_EQ(rifImageFilterSetParameter1f(remapNormalsFilter, "dstLo", 0.0f), RIF_SUCCESS);
    ASSERT_EQ(rifImageFilterSetParameter1f(remapNormalsFilter, "dstHi", 1.0f), RIF_SUCCESS);
    ASSERT_EQ(rifCommandQueueAttachImageFilter(queue_, remapNormalsFilter, normalsImg, normalsImg), RIF_SUCCESS);

    ASSERT_EQ(rifImageFilterSetParameter1f(remapDepthFilter, "dstLo", 0.0f), RIF_SUCCESS);
    ASSERT_EQ(rifImageFilterSetParameter1f(remapDepthFilter, "dstHi", 1.0f), RIF_SUCCESS);
    ASSERT_EQ(rifCommandQueueAttachImageFilter(queue_, remapDepthFilter, depthImg, depthImg), RIF_SUCCESS);

4) And lastly, compare the input image from the "rifContextCreateImage" trace (attached: image_0) with the input image from the "rifContextCreateImageFromDirectX12Memory" trace (attached: image_0). This also applies to normal, albedo and depth.
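A possible way to create the remap filters referenced in point 3, assuming the remap-range filter type from the RIF headers (placeholder `context` variable; verify the enum name against your RIF version):

    // Create the two remap filters used above; dstLo/dstHi are set right after creation.
    rif_image_filter remapNormalsFilter = nullptr;
    rif_image_filter remapDepthFilter = nullptr;
    ASSERT_EQ(rifContextCreateImageFilter(context, RIF_IMAGE_FILTER_REMAP_RANGE, &remapNormalsFilter), RIF_SUCCESS);
    ASSERT_EQ(rifContextCreateImageFilter(context, RIF_IMAGE_FILTER_REMAP_RANGE, &remapDepthFilter), RIF_SUCCESS);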

BartSiwek commented 2 years ago

Hi,

I will try 1) and 2) and get back to you.

Regarding the others:

3) The depth and normals are normalized, since this is how we accumulate them. Depth is naturally in the [0,1] range, and we remap normals from [-1,1] to [0,1] during accumulation (the exact remap is sketched below).

4) Yes - I noticed that too, but the image corruption seems to be coming from RIF, since in PIX captures these DirectX resources look fine.
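For reference, per component that remap is just the usual affine rescale:

    // Remap one normal component from [-1, 1] to [0, 1] while accumulating.
    float remapped = normalComponent * 0.5f + 0.5f;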

BartSiwek commented 2 years ago

Hi,

I just double checked some things:

Regarding 2) - removing rifSyncronizeQueue did not change anything.

Regarding 1) - depth, row pitch and slice pitch should not be random, as I fill them in (comments added for clarity):

    const auto& textureDesc = texture.getDesc();

    // For RGBA Float32 getPixelFormatBitSize returns 32
    const auto rowSize = ( getPixelFormatBitSize( textureDesc.format ) * textureDesc.width ) / 8;

    rif_image_desc rifImageDesc = {};
    rifImageDesc.image_width = textureDesc.width;
    rifImageDesc.image_height = textureDesc.height;
    rifImageDesc.image_depth = textureDesc.depth;
    rifImageDesc.image_row_pitch = rowSize;
    rifImageDesc.image_slice_pitch = rowSize * textureDesc.height;

    // For RGBA Float32 returns 4
    rifImageDesc.num_components = getComponentsForFormat( textureDesc.format );

    // For RGBA Float32 returns RIF_COMPONENT_TYPE_FLOAT32
    rifImageDesc.type = PixelFormatToRifComponentType( textureDesc.format );

    ID3D12Resource* d3d12res = reinterpret_cast<ID3D12Resource*>( texture.getHandle() );

    rif_image rifImage = nullptr;
    CHECK_AND_RETURN( rifContextCreateImageFromDirectX12Memory( context, &rifImageDesc, d3d12res, &rifImage ), "Error creating image", nullptr );
    return rifImage;

I also checked the values in the debugger and they are correct.

givinar commented 2 years ago

Hi @BartSiwek, as I can see in the trace log (DX12Backend), the row and slice pitch values seem to be random numbers. API call:

    RIF_CALL(rifContextCreateImageFromDirectX12Memory((rif_context)(context_0), AsPtr((uint32_t)(1962), (uint32_t)(1270), (uint32_t)(1), (uint32_t)(31392), (uint32_t)(39867840), (uint32_t)(4), RIF_COMPONENT_TYPE_FLOAT32).get(), (void*)(0x1b3a5a49640), &image_3));

The row value is 31392 and the slice value is 39867840. I'm not sure these are the correct numbers. Can you set these parameters explicitly (for example to 0) and rerun the DX12 backend with tracing again?

BartSiwek commented 2 years ago

The values certainly make sense, since width * component_count * size_of_component is 1962 * 4 * 4 == 31392, and rowSize * height is 31392 * 1270 == 39867840.

What I think might be the issue is that this does not account for the padding introduced by DirectX. I tried switching the code to the following:

    const auto& textureDesc = texture.getDesc();

    ID3D12Resource* d3d12res = reinterpret_cast<ID3D12Resource*>( texture.getHandle() );

    D3D12_RESOURCE_DESC d = d3d12res->GetDesc();

    D3D12_PLACED_SUBRESOURCE_FOOTPRINT footprint = {};
    UINT numRows = 0;
    UINT64 rowSize = 0;
    UINT64 totalBytes = 0;
    d3d12_device->GetCopyableFootprints( &d, 0, 1, 0, &footprint, &numRows, &rowSize, &totalBytes );

    rif_image_desc rifImageDesc = { 0 };
    rifImageDesc.image_width = textureDesc.width;
    rifImageDesc.image_height = textureDesc.height;
    rifImageDesc.image_depth = textureDesc.depth;
    rifImageDesc.image_row_pitch = footprint.Footprint.RowPitch;
    rifImageDesc.image_slice_pitch = textureDesc.height * footprint.Footprint.RowPitch;

    // For RGBA Float32 returns 4
    rifImageDesc.num_components = getComponentsForFormat( textureDesc.format );

    // For RGBA Float32 returns RIF_COMPONENT_TYPE_FLOAT32
    rifImageDesc.type = PixelFormatToRifComponentType( textureDesc.format );

    rif_image rifImage = nullptr;
    CHECK_AND_RETURN( rifContextCreateImageFromDirectX12Memory( context, &rifImageDesc, d3d12res, &rifImage ), "Error creating image", nullptr );
    return rifImage;

but that leads to the following error from RIF:

[DEBUG@RIF]: rifContextCreateImageFromDirectX12Memory called (with args: context=0x16fb405df00 image_desc={width=982 height=634 depth=1 row_pitch=15872 slice_pitch=10062848 num_components=4 type=RIF_COMPONENT_TYPE_FLOAT32} mem=0x16f0090c270)
[ERROR@RIF]: GPU memory object is too small (required 10062848 bytes, allocated 9961408 bytes) at file C:\JN\WS\RadeonProImageProcessor_Build\src\Image.cpp line 314

Which also makes sense. In this case (note the different resolution from the previous examples) the row pitch that does not account for padding is 982 * 4 * 4 == 15712, while with padding it is 15872. This leads to an expected size of 9961408 bytes without padding and 10062848 bytes with padding.
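For completeness, when I go through the readback + rifImageMap path instead, the padded D3D12 rows have to be repacked into the tight layout described by the rif_image_desc. A rough sketch of how that repacking could look (simplified, with placeholder names: `mappedReadback` is the pointer from the readback buffer, `tightPitch` and `paddedPitch` are the two pitches discussed above, and RIF_IMAGE_MAP_WRITE is assumed from the RIF headers):

    // Copy row by row, dropping the per-row padding reported by GetCopyableFootprints.
    void* dst = nullptr;
    CHECK_AND_RETURN( rifImageMap( rifImage, RIF_IMAGE_MAP_WRITE, &dst ), "Error mapping image", nullptr );

    auto* dstBytes = static_cast<uint8_t*>( dst );
    const auto* srcBytes = static_cast<const uint8_t*>( mappedReadback );
    for ( uint32_t row = 0; row < textureDesc.height; ++row )
    {
        memcpy( dstBytes + row * tightPitch, srcBytes + row * paddedPitch, tightPitch );
    }

    CHECK_AND_RETURN( rifImageUnmap( rifImage, dst ), "Error unmapping image", nullptr );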

givinar commented 2 years ago

Hi @BartSiwek, sorry for the wait.

[DEBUG@RIF]: rifContextCreateImageFromDirectX12Memory called (with args: context=0x16fb405df00 image_desc={width=982 height=634 depth=1 row_pitch=15872 slice_pitch=10062848 num_components=4 type=RIF_COMPONENT_TYPE_FLOAT32} mem=0x16f0090c270)
[ERROR@RIF]: GPU memory object is too small (required 10062848 bytes, allocated 9961408 bytes) at file C:\JN\WS\RadeonProImageProcessor_Build\src\Image.cpp line 314

This is the correct behaviour: the ID3D12Resource was allocated with a smaller size than the rif_image_desc describes.

I still haven't been able to reproduce the error, so I prepared a simple example. To build it, use CMake with the options -DRIF_INCLUDE_DIR=PATH_TO_RIF_HEADERS and -DRIF_LIBRARY_DIR=PATH_TO_RIF_LIBRARIES: https://drive.google.com/file/d/1mxZlXs9i18pi8ObfVmTkuMb_ZJDyqDo-/view?usp=sharing If you can help me modify it to reproduce the error, that would be great.

BartSiwek commented 2 years ago

Hi,

I ran your sample and hit an issue before it did anything that could be related to what is in this bug report.

When executing the call to rifContextExecuteCommandQueue on line 268 of the sample, it crashes.

The D3D Debug Output states:

D3D12 ERROR: ID3D12Device::CreateComputeShader: Compute Shader is unsigned. [ STATE_CREATION ERROR #322: CREATECOMPUTESHADER_INVALIDSHADERBYTECODE]

In the meantime, in the console window the app opens, I can see this:

[MSG  @RIF]: Loaded RIF API version: 1.7.1.0xfe53c43a
[MSG  @RIF]: Loading model: D:/Work/marmosetco.toolbag4/RifDenoiserFinal/data/render/rif-models/denoise_c9_ldr.pb
INFO: Memory: required: 1525088256, single node max: 1275854848, requested sum: 4034199552
Fatal error: cannot create pipeline state

givinar commented 2 years ago

Hi @BartSiwek, I've added a debug print based on the HRESULT from CreateComputePipelineState: https://drive.google.com/file/d/1TxaYmcNVOBdpjMarbG8HAsBb5wxVLK_T/view?usp=sharing Perhaps we can get more information. This error is not related to RIF itself; let's try to understand what's going on. I'm interested in ID3D12Device::CreateComputeShader; I didn't find any mention of this call for DX12, only for DX11.

givinar commented 2 years ago

Hi @BartSiwek, regarding this error:

D3D12 ERROR: ID3D12Device::CreateComputeShader: Compute Shader is unsigned. [ STATE_CREATION ERROR #322: CREATECOMPUTESHADER_INVALIDSHADERBYTECODE]

It can also be caused by the dxcompiler library being absent or being the wrong version. Here is the correct version: https://drive.google.com/file/d/1H1oCECq33b_tCYjGv-fp31MnEBalfcdK/view?usp=sharing

BartSiwek commented 2 years ago

Hi,

I finally got some time and tried your example. I was able to run it this time (even without replacing the d3d compiler DLL).

This standalone app does not seem to produce the acne I showed in the example. I'll try to compare the device, queue and resource setup to figure out what could be different.

In the meantime, however, I do see a bit of a halo in the results (attached image: "output").