Geometry Shader Debugger

cugone commented 3 years ago

Description

I wrote my own game engine. It uses DXGI 1.6 (this is important later). I am writing a Rogue-like with said engine that uses the Geometry Shader to dynamically generate the tile vertices and UVs into a single tile mesh on the GPU. This allows me to generate 15x more tiles on-screen than with a CPU-generated version without a problem. Unfortunately, something is wrong and I get either the wrong UVs or a big black nothing. The Visual Studio Graphics Debugger crashes with DXGI 1.6, so I cannot use it to debug my shaders. I looked up how to debug shaders with RenderDoc and found it is as simple as right-clicking in the mesh viewer. I ran RenderDoc to try to debug the Geometry Shader, only to find out that the feature is not yet implemented. Re-reading the help page confirmed this: "Geometry and Tesselation shaders are not yet debuggable."

Implementing a Geometry Shader Debugger would make addressing GS-related problems much easier.

Environment

RenderDoc version: 1.11 (built from c5480aa6) (Qt version 5.9.4)
OS: Windows 10 Home version 20H2
Graphics API: DirectX 11

Implement a geometry shader debugger similar to the Vertex and Pixel shader debuggers.

baldurk commented 3 years ago

Closing as duplicate of #120.

It's worth noting that I believe most recommendations are against the use of geometry shaders, in favour of compute.

cugone commented 3 years ago

Thanks for the link, I've upvoted it. :)

Correct me if I'm wrong, but the geometry shader is designed to add/remove vertices in the pipeline. That's exactly what I'm doing; that's what 2D billboarded particle emitters do; etc.

The compute shader is good for highly complex/expensive operations like per-pixel Mandelbrot calculations and would be wasted doing essentially four additions per vertex.

baldurk commented 3 years ago

Geometry shaders are inefficient especially when expansion is used like that and that's the case that GPU vendors explicitly recommend against. Billboarded particle emitters render quads of four (or N) vertices and offset them in the vertex shader, they don't use a geometry shader at all.

Compute shaders can be used for effectively anything that doesn't need to use the rasterizer for acceleration, it doesn't have to be a large amount of work.
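
For readers who have not seen the vertex-shader approach to quads described above, here is a rough CPU-side sketch in Python. It is only an illustration, not code from anyone in this thread: the function name, the strip-order corner numbering, and the `centers` list are all assumptions. In a real shader the index would come from something like `SV_VertexID` and the centers from a buffer.

```python
# CPU emulation of a vertex shader that derives a quad corner from the
# vertex index alone. All names here are hypothetical.

def quad_vertex(vertex_id, centers, half_size=0.5):
    """Return (x, y, u, v) for one corner of the quad this vertex belongs to."""
    quad = vertex_id // 4    # which quad (four vertices per quad)
    corner = vertex_id % 4   # 0..3, in triangle-strip order
    u = corner & 1           # 0, 1, 0, 1
    v = corner >> 1          # 0, 0, 1, 1
    cx, cy = centers[quad]
    # Offset the shared center toward the selected corner.
    x = cx + (2 * u - 1) * half_size
    y = cy + (2 * v - 1) * half_size
    return (x, y, float(u), float(v))

centers = [(0.0, 0.0), (3.0, 1.0)]
vertices = [quad_vertex(i, centers) for i in range(4 * len(centers))]
```

Each vertex is computed independently from its index, so no primitive-level stage is needed to produce the four corners.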

cugone commented 3 years ago

@baldurk

> Geometry shaders are inefficient especially when expansion is used like that and that's the case that GPU vendors explicitly recommend against. Billboarded particle emitters render quads of four (or N) vertices and offset them in the vertex shader, they don't use a geometry shader at all.

Playing devil's advocate, do you have a recent citation proving this? (Within the last 6 months) Modern implementations of Geometry Shaders are quite fast. I don't need to deal with RW resources, unbinding/rebinding input/output textures, nor UAV buffers with the GS.

> Compute shaders can be used for effectively anything that doesn't need to use the rasterizer for acceleration, it doesn't have to be a large amount of work.

There are trade-offs for each implementation. Additionally, there are limits to compute shaders that don't exist for geometry shaders. On geometry shaders, Frank Luna's Introduction to 3D Game Programming with DirectX 11 (2012), citing the GPU manufacturer's documentation directly, explicitly states:

> For performance purposes, maxvertexcount should be as small as possible; [NVIDIA08] states that peak performance of the GS is achieved when the GS outputs between 1-20 scalars, and performance drops to 50% if the GS outputs between 27-40 scalars....the recommendations of [NVIDIA08] are from 2008 (first generation of geometry shaders), so things should have improved.

[NVIDIA08] GPU Programming Guide GeForce 8 and 9, NVIDIA Corporation, 2008 http://developer.download.nvidia.com/GPU_Programming_Guide/GPU_Programming_Guide_G80.pdf

(The chapter on Geometry Shaders in his Introduction to 3D Game Programming with DirectX 12 (2016) is unchanged.)
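
To make the scalar figures concrete, here is a small worked calculation in Python. It assumes that "scalars" means the declared output size of the GS, i.e. scalars per output vertex multiplied by maxvertexcount, and it uses a hypothetical tile vertex made of a float4 position and a float2 UV. Neither the layout nor the helper name comes from the book or the guide.

```python
# Scalars per HLSL type, for a hypothetical output vertex layout.
SCALARS = {"float": 1, "float2": 2, "float3": 3, "float4": 4}

def gs_output_scalars(vertex_layout, max_vertex_count):
    """Declared GS output size: scalars per vertex times maxvertexcount."""
    per_vertex = sum(SCALARS[t] for t in vertex_layout.values())
    return per_vertex * max_vertex_count

# Assumed layout: float4 position + float2 uv = 6 scalars per vertex.
tile_vertex = {"position": "float4", "uv": "float2"}
total = gs_output_scalars(tile_vertex, max_vertex_count=4)
print(total)  # 24
```

Under those assumptions a four-vertex quad comes to 24 scalars, which sits just above the quoted 1-20 range and below the 27-40 range, so trimming the per-vertex data matters more than it might first appear.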

In fact, the NVIDIA GPU Programming guide explicitly states what I am doing is a good use of the GS:

(Keep in mind that this was written for GeForce 8 and 9 GPUs and DirectX 10 in 2008. Hardware has without a doubt improved considerably since then.)

> 3.4.11. Geometry Shaders?
> Remember that geometry shaders work on primitives. This means that if you are transforming the 3 vertices for a triangle in the geometry shader then you will likely being performing a redundant transformation on the same vertex for every primitive that shares it. Only use the GS on when you really need it. (See section 3.4.14) for more information about GS performance.

> 3.4.14. Too many generated primitives in Geometry Shader
> Geometry Shaders have the ability to output new primitives generated procedurally in the shader. Be careful to use this feature judiciously as on all current generation hardware the performance of the geometry shader is directly proportional to the number of output attributes. In general outputting the same number as input or a few more is acceptable, but 10x the number of input primitives will start to slow the shader down to the point where it will become the bottleneck. See section 4.6 for more information.

> 4.6. Geometry Shader
> ... Thus, it is important to understand that the main use of the geometry shader is NOT for doing heavy output algorithms such as tessellation. In addition, because a GS runs on primitives, per-vertex operations will be duplicated for all primitives that share a vertex. This is potentially a waste of processing power. A geometry shader is most useful when doing operations on small vertices or primitive data that requires outputting only small amounts of new data. But in general, the potential for wasted work and performance penalties for using a GS makes it an often unused feature of Shader model 4.

Regardless of the above quotes, the next section states explicitly that point sprites (i.e. point-to-quad expansion) are a perfectly acceptable use of the geometry shader:

> 4.6.2 A decent use of Geometry Shaders: Point Sprites
> In contrast to all the performance hazards of using the GS, one case generally will run very well, and be simple to implement. This case is point sprites. Given that the point sprite fixed function capability has been removed in DirectX 10, you can now simply generate a primitive from a single input vertex. This has the benefit of generally reducing vertex setup attribute boundedness for that batch, as well the generated vertices will generally be small in size and thus the performance of the GS will stay fast. But be mindful of the number of output scalar attributes. You can still easily hit the performance cliff if you make your vertices large. See the previous example.
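
As an illustration of the point-to-quad expansion discussed here, this is a minimal CPU emulation in Python of what a geometry shader might emit for one input point. It is a sketch under assumptions, not the shader from this issue: the function name, the regular-grid atlas, and the choice that V increases in the same direction as Y are all hypothetical.

```python
# CPU emulation of a GS-style expansion: one input point becomes a
# four-vertex triangle strip with atlas UVs. All names are hypothetical.

def expand_point_to_quad(center, size, atlas_cols, atlas_rows, tile_index):
    """Emit four (x, y, u, v) vertices in triangle-strip order."""
    cx, cy = center
    half = size / 2.0
    # Locate the tile's UV rectangle inside a regular grid atlas.
    col = tile_index % atlas_cols
    row = tile_index // atlas_cols
    du, dv = 1.0 / atlas_cols, 1.0 / atlas_rows
    u0, v0 = col * du, row * dv
    strip = []
    for corner in range(4):
        sx, sy = corner & 1, corner >> 1  # (0,0), (1,0), (0,1), (1,1)
        strip.append((cx + (2 * sx - 1) * half,
                      cy + (2 * sy - 1) * half,
                      u0 + sx * du,
                      v0 + sy * dv))
    return strip

quad = expand_point_to_quad((2.0, 3.0), 1.0, atlas_cols=4, atlas_rows=4, tile_index=5)
```

Each emitted vertex carries only four scalars in this sketch, which is the kind of small output the quoted section has in mind.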

baldurk commented 3 years ago

I'm not here to convince you of anything, you can do what you like. I just wanted to let you know that geometry shaders have been virtually abandoned, since with them still being in the API not everyone is aware of that.

cugone commented 3 years ago

@baldurk Fair enough. :)

baldurk / renderdoc

Geometry Shader Debugger #2110
