baldurk / renderdoc

RenderDoc is a stand-alone graphics debugging tool.
https://renderdoc.org
MIT License
9.05k stars 1.35k forks source link

Wrong data in buffer when using NVidia HSLS extensions #1100

Closed giordi91 closed 6 years ago

giordi91 commented 6 years ago

Description

To optimize a compute Shader I have been trying to use the HSLS extensions from nvidia from the NVAPI library, the problem is that render doc shows wrong data in the buffer for the part of the memory that is affected by the extension, rest of the memory seems to be fine. The way it works, is that nvidia spits out a blob of instructions in the shader, after the IR compilation, the driver will recognize the blob of instruction and replace it with a single instruction like ballot, shuffle etc. I am not sure if that trips render doc maybe in the replay of the shader?

Repro steps

I have attached a minimal program to reproduce the problem, please ignore the ugly boiler plate and have a look at the graphicclass.cpp file. In there a compute shader is compiled in a way nvidia requires and executed. Run the build executable (make sure the nvidia headers and shader is in the same folder as executable). In the CS buffer view, for the first 64 element you will see the values from 0 to 31 repeating, instead of the correct ballot value.

bufferissue

Instead if I download the buffer on cpu and print it I see the correct values: bufferissue2

Environment

Here the archive with the solution.

https://drive.google.com/file/d/1w1cfSqku_22iyQRu1B52c4qlk-gMlzY7/view?usp=sharing

Article from nvidia in how the extensions works in D3D: https://developer.nvidia.com/unlocking-gpu-intrinsics-hlsl

baldurk commented 6 years ago

RenderDoc does not support nvapi or vendor extensions on D3D in general. From the article you linked, using these requires calling NvAPI_D3D11_SetNvShaderExtnSlot to configure the extensions before actually running those shaders. That function won't be called on replay so the shader presumably won't work properly.

I do have nvapi interception set up, so I should be able to locate that function and return an error code to indicate that shader extensions are not supported while debugging.

giordi91 commented 6 years ago

Thank you for getting back to me, as far as I know, the slot needs to only be set when you compile the shader. Does RenderDoc recompile the shaders under the hood?

With the error code you want to trigger, will prevent the app to run under RenderDoc? If so I would prefer a warning at least the rest of the program can still be inspected.

baldurk commented 6 years ago

RenderDoc doesn't recompile any shaders - it can't because most of the time no source is provided. But I don't think setting the slot does anything to affect D3D compilation. AFAIK nvapi only hooks the D3D interfaces not the D3DCompiler interface. Even if it did hook D3DCompiler, that wouldn't work for anyone who compiled their shaders offline with fxc and loaded up the bytecode - since that's by far the most common workflow.

If you look at how the extension headers are defined, they basically generate sequences of instructions that contain an identifying marker (incrementing the counter on a given UAV which is an easily identifiable instruction) and then some uint writes that basically encode the actual instruction. Setting the slot tells the nvidia UMD which UAV to look out for, and also tells it to match these patterns and transform them from sort of nonsense DXBC, into the nv extension instructions.

The error code would be identical to the one you'd get if you ran your program on an AMD card for example and the feature wasn't supported at all. It wouldn't cause a crash unless your program had no fallback path, and even then it would just mean the intrinsics would not function due to no error checking.

baldurk commented 6 years ago

This should now reject nvapi calls with that commit. Note that the program you shared in the original comment of the issue will crash because it doesn't have a fallback path as I mentioned above - it bails out if nvapi functions fail, and doesn't create the necessary buffers needed for rendering later.

giordi91 commented 6 years ago

Apologies for the delay in the reply. That s what I meant earlier about a "warning" rather than an error, or even better make it a configuration, where the user can choose the behavior. Might still be valuable to be able to run the code, knowing that the part affected by NVAPI won't be accurate in the memory values, rather than having to implement a fallback layer. We currently run on fixed hardware where we know that NVAPI is available and working. Renderdoc can still be very useful even in NVAPI settings, to know if everything was bound properly etc.

baldurk commented 6 years ago

I don't want to add an option that will deliberately break capture/replay in the hopes that sometimes it won't break it too badly. This could easily lead to crashes depending on the particular use.

If you want that behaviour yourself then you can ignore the return values of nvapi entirely in your code. While running with renderdoc they won't do anything, and if you can get away with no fallback then you can still capture and replay everything else.

giordi91 commented 6 years ago

Makes total sense! Thank you for your time and explanation!

M.