RenderDoc does not support nvapi or vendor extensions on D3D in general. From the article you linked, using these requires calling NvAPI_D3D11_SetNvShaderExtnSlot
to configure the extensions before actually running those shaders. That function won't be called on replay, so the shader presumably won't work properly.
I do have nvapi interception set up, so I should be able to locate that function and return an error code to indicate that shader extensions are not supported while debugging.
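For reference, the setup the app has to do looks roughly like this (a sketch, not code from your repro - the helper name and slot number are illustrative, but the two nvapi entry points are the real ones):

```cpp
// Rough sketch of the setup nvapi requires before dispatching a shader that
// uses the HLSL extensions. Both calls go through nvapi, which RenderDoc
// intercepts; on replay the second call is never made.
#include <d3d11.h>
#include "nvapi.h"

bool EnableNvExtensions(ID3D11Device *device, NvU32 uavSlot)
{
  if(NvAPI_Initialize() != NVAPI_OK)
    return false; // nvapi unavailable (non-NVIDIA GPU, or rejected by a capture layer)

  // Tells the NVIDIA driver which UAV slot the HLSL extension headers use as
  // their marker. Shaders using the intrinsics only work once this succeeds.
  return NvAPI_D3D11_SetNvShaderExtnSlot(device, uavSlot) == NVAPI_OK;
}
```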
Thank you for getting back to me. As far as I know, the slot only needs to be set when you compile the shader. Does RenderDoc recompile the shaders under the hood?
Will the error code you want to trigger prevent the app from running under RenderDoc? If so, I would prefer a warning, so that at least the rest of the program can still be inspected.
RenderDoc doesn't recompile any shaders - it can't, because most of the time no source is provided. But I don't think setting the slot does anything to affect D3D compilation. AFAIK nvapi only hooks the D3D interfaces, not the D3DCompiler interface. Even if it did hook D3DCompiler, that wouldn't work for anyone who compiled their shaders offline with fxc and loaded up the bytecode - and that's by far the most common workflow.
If you look at how the extension headers are defined, they generate sequences of instructions that contain an identifying marker (incrementing the counter on a given UAV, which is an easily identifiable instruction) followed by some uint writes that encode the actual instruction. Setting the slot tells the nvidia UMD which UAV to look out for, and tells it to match these patterns and transform them from what is otherwise nonsense DXBC into the nv extension instructions.
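To make that concrete, this is roughly the pattern one of those header intrinsics expands to. I'm paraphrasing the public nvHLSLExtns.h headers from memory here, so treat the exact names and struct fields as approximate:

```cpp
// Approximate shape of an nvapi HLSL intrinsic, shown as HLSL source in a C++
// string. The real definitions live in nvHLSLExtns.h / nvHLSLExtnsInternal.h;
// names and fields here are from memory and only meant to show the pattern.
const char *ballotExpansionSketch = R"(
  // The 'fake' UAV, bound at whatever slot was passed to SetNvShaderExtnSlot
  RWStructuredBuffer<NvShaderExtnStruct> g_NvidiaExt : register(u7);

  uint NvBallot(uint predicate)
  {
    uint index = g_NvidiaExt.IncrementCounter(); // the identifying marker
    g_NvidiaExt[index].src0u.x = predicate;      // uint writes encoding operands...
    g_NvidiaExt[index].opcode = OPCODE_BALLOT;   // ...and which instruction this is
    return g_NvidiaExt[index].dst0u.x;           // the UMD rewrites this whole sequence
  }
)";
```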
The error code would be identical to the one you'd get if you ran your program on an AMD card, for example, where the feature isn't supported at all. It wouldn't cause a crash unless your program had no fallback path, and even then it would just mean the intrinsics silently don't function because the error is never checked.
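A fallback path would look something like this - a sketch using the EnableNvExtensions helper from the earlier snippet, with hypothetical bytecode arrays standing in for two offline-compiled variants of the shader:

```cpp
// Sketch of a fallback path. The DXBC arrays are hypothetical placeholders for
// two fxc-compiled variants of the same compute shader.
#include <d3d11.h>
#include "nvapi.h"

bool EnableNvExtensions(ID3D11Device *device, NvU32 uavSlot); // from the earlier sketch

extern const BYTE g_csIntrinsicsDXBC[]; // compiled against the nvapi headers
extern const size_t g_csIntrinsicsSize;
extern const BYTE g_csFallbackDXBC[];   // plain HLSL, no intrinsics
extern const size_t g_csFallbackSize;

ID3D11ComputeShader *CreateShaderWithFallback(ID3D11Device *device)
{
  const BYTE *bytecode = g_csFallbackDXBC;
  size_t size = g_csFallbackSize;

  // If the extension slot can't be configured (AMD GPU, RenderDoc, old driver),
  // use the intrinsic-free shader instead of bailing out.
  if(EnableNvExtensions(device, 7))
  {
    bytecode = g_csIntrinsicsDXBC;
    size = g_csIntrinsicsSize;
  }

  ID3D11ComputeShader *shader = NULL;
  if(FAILED(device->CreateComputeShader(bytecode, size, NULL, &shader)))
    return NULL;
  return shader;
}
```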
Nvapi calls should now be rejected with that commit. Note that the program you shared in the original comment of the issue will crash, because it doesn't have a fallback path as I mentioned above - it bails out if nvapi functions fail, and doesn't create the buffers needed for rendering later.
Apologies for the delay in replying. That's what I meant earlier about a "warning" rather than an error, or, even better, making it a configuration option where the user can choose the behaviour. It might still be valuable to run the code knowing that the memory values affected by NVAPI won't be accurate, rather than having to implement a fallback layer. We currently run on fixed hardware where we know NVAPI is available and working, and RenderDoc can still be very useful even with NVAPI in use, for example to check that everything was bound properly.
I don't want to add an option that will deliberately break capture/replay in the hopes that sometimes it won't break it too badly. This could easily lead to crashes depending on the particular use.
If you want that behaviour yourself then you can ignore the return values of nvapi entirely in your code. While running under RenderDoc the calls won't do anything, and if you can get away with no fallback then you can still capture and replay everything else.
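i.e. something like this, reusing the earlier sketch but deliberately discarding the statuses:

```cpp
// Fire-and-forget variant: the statuses are deliberately ignored. On fixed
// NVIDIA hardware this behaves as before; under RenderDoc both calls fail
// silently, the intrinsics return wrong values, and everything else in the
// frame still captures and replays.
#include <d3d11.h>
#include "nvapi.h"

void EnableNvExtensionsUnchecked(ID3D11Device *device, NvU32 uavSlot)
{
  NvAPI_Initialize();                               // status intentionally ignored
  NvAPI_D3D11_SetNvShaderExtnSlot(device, uavSlot); // ditto; no fallback shader needed
}
```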
Makes total sense! Thank you for your time and explanation!
M.
Description
To optimize a compute shader I have been trying to use NVIDIA's HLSL extensions from the NVAPI library. The problem is that RenderDoc shows wrong data in the buffer for the part of memory that is affected by the extension; the rest of the memory seems to be fine. The way it works is that the NVIDIA header emits a blob of instructions in the shader, and after the IR compilation the driver recognizes that blob and replaces it with a single instruction like ballot, shuffle, etc. I am not sure if that trips up RenderDoc, maybe in the replay of the shader?
Repro steps
I have attached a minimal program to reproduce the problem; please ignore the ugly boilerplate and have a look at the graphicclass.cpp file. In there, a compute shader is compiled the way NVIDIA requires and then executed. Run the built executable (make sure the NVIDIA headers and the shader are in the same folder as the executable). In the CS buffer view, for the first 64 elements you will see the values from 0 to 31 repeating, instead of the correct ballot value.
If instead I download the buffer to the CPU and print it, I see the correct values.
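For reference, the readback is the usual staging-buffer copy, roughly like this (a simplified sketch rather than the exact code from the attached repro):

```cpp
// Simplified sketch of the CPU readback used to verify the values: copy the
// UAV buffer into a staging buffer, map it, and print the uints.
#include <cstdio>
#include <d3d11.h>

void PrintBuffer(ID3D11Device *dev, ID3D11DeviceContext *ctx,
                 ID3D11Buffer *gpuBuffer, UINT count)
{
  D3D11_BUFFER_DESC desc = {};
  gpuBuffer->GetDesc(&desc);
  desc.Usage = D3D11_USAGE_STAGING; // CPU-readable copy of the GPU buffer
  desc.BindFlags = 0;
  desc.CPUAccessFlags = D3D11_CPU_ACCESS_READ;
  desc.MiscFlags = 0;

  ID3D11Buffer *staging = NULL;
  if(FAILED(dev->CreateBuffer(&desc, NULL, &staging)))
    return;

  ctx->CopyResource(staging, gpuBuffer);

  D3D11_MAPPED_SUBRESOURCE mapped = {};
  if(SUCCEEDED(ctx->Map(staging, 0, D3D11_MAP_READ, 0, &mapped)))
  {
    const UINT *data = (const UINT *)mapped.pData;
    for(UINT i = 0; i < count; i++)
      printf("[%u] = %u\n", i, data[i]);
    ctx->Unmap(staging, 0);
  }
  staging->Release();
}
```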
Environment
Here is the archive with the solution:
https://drive.google.com/file/d/1w1cfSqku_22iyQRu1B52c4qlk-gMlzY7/view?usp=sharing
Article from NVIDIA on how the extensions work in D3D: https://developer.nvidia.com/unlocking-gpu-intrinsics-hlsl