sholloway / agents-playground

MIT License

GPU Debugging Pipeline #110

Open sholloway opened 1 year ago

sholloway commented 1 year ago

The Challenge

Establish a methodology for debugging GPU calls.

Summary

On macOS, WGSL shaders are transpiled to Metal. I need a way to see the resulting GPU calls and inspect what the shaders are doing.

sholloway commented 1 year ago

Investigation

This video gives an overview of how GPU debugging can be done with Xcode.

There are two tools at play.

You can use Xcode to do a GPU frame capture. You then break down the scene being rendered into a frame graph. The frame graph is a breakdown of the order and purpose of each rendering pass. An example of a frame graph is:

  1. G-Buffer
  2. SSAO
  3. CSM
  4. Lighting
  5. Post Processing
  6. Composition

A GPU Trace enables seeing all the render passes.

The Instruments tool has profiling templates. Use the Game Performance or Metal System Trace template to trace the GPU. The Metal System Trace template provides a frame-level inspection tool.

Resources

sholloway commented 1 year ago

Instrument a Running Python App


Here are the steps to run an Instrument Trace on a Python WebGPU app.

Steps

  1. Launch the Instruments app. It is bundled with Xcode.
  2. Select Metal System Trace as the template.
  3. Launch the Python Script.
  4. In the Instruments window, choose the running Python process as the target to attach to.
  5. Click the record button.
  6. In the app, do whatever you're trying to trace.
  7. Click the record button to stop recording.

You should now have a trace to inspect.
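
The attach step above could also be scripted. The following is a minimal sketch, assuming the `xctrace` command line tool that ships with recent Xcode releases and its `record` subcommand; the flag names should be checked against `xcrun xctrace help record` before relying on them. The helper name `build_xctrace_command` and the default output path are made up for illustration.

```python
import os
import shlex


def build_xctrace_command(pid: int, output_path: str = "gpu_debug.trace") -> list[str]:
    """Build an xctrace invocation that attaches to an already running process.

    Assumption: xctrace accepts --template, --attach and --output as used here.
    """
    return [
        "xcrun", "xctrace", "record",
        "--template", "Metal System Trace",
        "--attach", str(pid),
        "--output", output_path,
    ]


if __name__ == "__main__":
    # Print the command for this process so it can be pasted into another terminal.
    print(shlex.join(build_xctrace_command(os.getpid())))
```

Calling the helper near the top of the Python app prints a command that targets that exact process, which avoids picking the wrong `python` entry from the process list.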

sholloway commented 1 year ago

Set Up Xcode to Debug a Python Project


Here are the steps to configure Xcode to debug a Python project.

A few things to note:

Steps

  1. Install Xcode.
  2. Launch Xcode and create a new project. Select External Build System as the project type.
  3. Set the build tool location to be the Python interpreter. (Need to see how this plays with a venv.)
  4. Edit the Run scheme to use Python.
    • Open the scheme editor by selecting your project in the Project Navigator.
    • In the top level menu bar, go to Product -> Scheme -> Edit Scheme.
  5. In the scheme editor dialog:
    • Select the Run scheme from the left-hand pane.
    • Choose the Info tab in the right-hand pane.
    • In the Executable dropdown menu, select Other.
    • We need to set this to be the Python used by the venv. However, the venv lives in a hidden directory. To see it in the Xcode file dialog, navigate up one directory and press Command + Shift + . to toggle hidden files.
    • Select the poetry executable since we use that to actually launch things.
  6. Still in the Scheme Editor Dialog...
    • Switch to the Arguments tab in the scheme editor.
    • In the Arguments Passed On Launch section, click the + button to add a new argument.
    • Enter the relative path (from the root of the Xcode project) to the Python file that you will be running, including the .py extension.
  7. Switch to the Options tab in the Scheme Editor Dialog...
    • Enable the check box Use Custom Working Directory.
    • Set the working directory to the root of the code base (i.e. the agents-playground directory).

sholloway commented 1 year ago

Blocker


I can launch and start a debugging session of the POC/Obj Loader; however, Xcode will not trace the Metal commands.

Error Message

The below message is displayed in the terminal output during the Launch target.

[Metal Diagnostics Warning] Application Deployment Target Version (11.0) does not match OS Version (13.5.2) - diagnostics may be missing debug

I've gone round and round trying to set the deployment target settings. I think this is actually baked into the python executable. The below shell snippet outputs that Python was built targeting v11 of macOS and was linked against the v11 SDKs.

otool -l /nix/store/dbmhpvp80aqxlasa8d6a7b5id1ijsz6g-python3-3.11.2-env/bin/python

Load command 0 cmd LC_SEGMENT_64 cmdsize 72 segname __PAGEZERO vmaddr 0x0000000000000000 vmsize 0x0000000100000000 fileoff 0 filesize 0 maxprot 0x00000000 initprot 0x00000000 nsects 0 flags 0x0
Load command 1 cmd LC_SEGMENT_64 cmdsize 472 segname __TEXT vmaddr 0x0000000100000000 vmsize 0x0000000000004000 fileoff 0 filesize 16384 maxprot 0x00000005 initprot 0x00000005 nsects 5 flags 0x0
  Section sectname __text segname __TEXT addr 0x0000000100003a6c size 0x000000000000005c offset 14956 align 2^2 (4) reloff 0 nreloc 0 flags 0x80000400 reserved1 0 reserved2 0
  Section sectname __stubs segname __TEXT addr 0x0000000100003ac8 size 0x0000000000000018 offset 15048 align 2^2 (4) reloff 0 nreloc 0 flags 0x80000408 reserved1 0 (index into indirect symbol table) reserved2 12 (size of stubs)
  Section sectname __stub_helper segname __TEXT addr 0x0000000100003ae0 size 0x0000000000000030 offset 15072 align 2^2 (4) reloff 0 nreloc 0 flags 0x80000400 reserved1 0 reserved2 0
  Section sectname __cstring segname __TEXT addr 0x0000000100003b10 size 0x00000000000004a5 offset 15120 align 2^0 (1) reloff 0 nreloc 0 flags 0x00000002 reserved1 0 reserved2 0
  Section sectname __unwind_info segname __TEXT addr 0x0000000100003fb8 size 0x0000000000000048 offset 16312 align 2^2 (4) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
Load command 2 cmd LC_SEGMENT_64 cmdsize 152 segname __DATA_CONST vmaddr 0x0000000100004000 vmsize 0x0000000000004000 fileoff 16384 filesize 16384 maxprot 0x00000003 initprot 0x00000003 nsects 1 flags 0x10
  Section sectname __got segname __DATA_CONST addr 0x0000000100004000 size 0x0000000000000008 offset 16384 align 2^3 (8) reloff 0 nreloc 0 flags 0x00000006 reserved1 2 (index into indirect symbol table) reserved2 0
Load command 3 cmd LC_SEGMENT_64 cmdsize 232 segname __DATA vmaddr 0x0000000100008000 vmsize 0x0000000000004000 fileoff 32768 filesize 16384 maxprot 0x00000003 initprot 0x00000003 nsects 2 flags 0x0
  Section sectname __la_symbol_ptr segname __DATA addr 0x0000000100008000 size 0x0000000000000010 offset 32768 align 2^3 (8) reloff 0 nreloc 0 flags 0x00000007 reserved1 3 (index into indirect symbol table) reserved2 0
  Section sectname __data segname __DATA addr 0x0000000100008010 size 0x0000000000000010 offset 32784 align 2^3 (8) reloff 0 nreloc 0 flags 0x00000000 reserved1 0 reserved2 0
Load command 4 cmd LC_SEGMENT_64 cmdsize 72 segname __LINKEDIT vmaddr 0x000000010000c000 vmsize 0x0000000000004000 fileoff 49152 filesize 2032 maxprot 0x00000001 initprot 0x00000001 nsects 0 flags 0x0
Load command 5 cmd LC_DYLD_INFO_ONLY cmdsize 48 rebase_off 49152 rebase_size 8 bind_off 49160 bind_size 24 weak_bind_off 0 weak_bind_size 0 lazy_bind_off 49184 lazy_bind_size 32 export_off 49216 export_size 64
Load command 6 cmd LC_SYMTAB cmdsize 24 symoff 49288 nsyms 7 stroff 49424 strsize 88
Load command 7 cmd LC_DYSYMTAB cmdsize 80 ilocalsym 0 nlocalsym 1 iextdefsym 1 nextdefsym 3 iundefsym 4 nundefsym 3 tocoff 0 ntoc 0 modtaboff 0 nmodtab 0 extrefsymoff 0 nextrefsyms 0 indirectsymoff 49400 nindirectsyms 5 extreloff 0 nextrel 0 locreloff 0 nlocrel 0
Load command 8 cmd LC_LOAD_DYLINKER cmdsize 32 name /usr/lib/dyld (offset 12)
Load command 9 cmd LC_BUILD_VERSION cmdsize 32 platform 1 minos 11.0 sdk 11.0 ntools 1 tool 3 version 609.0
Load command 10 cmd LC_SOURCE_VERSION cmdsize 16 version 0.0
Load command 11 cmd LC_MAIN cmdsize 24 entryoff 14956 stacksize 0
Load command 12 cmd LC_LOAD_DYLIB cmdsize 56 name /usr/lib/libSystem.B.dylib (offset 24) time stamp 2 Wed Dec 31 18:00:02 1969 current version 1292.60.1 compatibility version 1.0.0
Load command 13 cmd LC_RPATH cmdsize 120 path /nix/store/ylxc5aq56jqd19vmbqgpgbyjnjmw9qyd-apple-framework-CoreFoundation-11.0.0/Library/Frameworks (offset 12)
Load command 14 cmd LC_FUNCTION_STARTS cmdsize 16 dataoff 49280 datasize 8
Load command 15 cmd LC_DATA_IN_CODE cmdsize 16 dataoff 49288 datasize 0
Load command 16 cmd LC_CODE_SIGNATURE cmdsize 16 dataoff 49520 datasize 1664
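
The only part of that output that matters here is the LC_BUILD_VERSION load command (minos 11.0, sdk 11.0). A minimal sketch for pulling just those two fields out of `otool -l` output is below. The function names are hypothetical, and it only looks for LC_BUILD_VERSION; binaries that use the older LC_VERSION_MIN_MACOSX command are not handled.

```python
import re
import subprocess
from typing import Optional

# Matches the minos and sdk fields that follow an LC_BUILD_VERSION command.
# DOTALL lets the same pattern work on multi-line or flattened otool output.
_BUILD_VERSION = re.compile(
    r"cmd\s+LC_BUILD_VERSION\b.*?minos\s+(\S+)\s+sdk\s+(\S+)", re.DOTALL
)


def parse_build_version(otool_output: str) -> Optional[dict]:
    """Return {'minos': ..., 'sdk': ...} or None if no LC_BUILD_VERSION is found."""
    match = _BUILD_VERSION.search(otool_output)
    if match is None:
        return None
    return {"minos": match.group(1), "sdk": match.group(2)}


def read_build_version(binary_path: str) -> Optional[dict]:
    """Run `otool -l` on a binary (macOS only) and parse its build version."""
    result = subprocess.run(
        ["otool", "-l", binary_path], capture_output=True, text=True, check=True
    )
    return parse_build_version(result.stdout)
```

Running `read_build_version` against the interpreter used by the venv gives a quick check of which deployment target a given Python build carries before trying it in Xcode.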

sholloway commented 1 year ago

Xcode's Metal Debugger is not going to work for my needs. The Python executable targets macOS 11.0, while the Metal diagnostics expect the deployment target to match the running OS version (13.5.2). I cannot find a way around this without compiling Python myself, and I really don't want to fool with that.

RenderDoc has a branch that is working towards macOS support. It may be worth compiling that.

sholloway commented 1 year ago

Manual Shader Debugging Techniques

Without proper tool support, I'm resorting to building a shader pipeline for rapid debugging.

Challenge

Define a ShaderDebugger class that provides an API for easily building a render pipeline that allows the option of rendering vertices, edges, or faces.

It would be ideal to have a single pipeline that can handle the three use cases. So if the debugger wants to, it could render vertices and edges and faces all in a single frame.
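
One way the API could look is sketched below, assuming the three use cases are rendering vertices, edges, and faces. Everything here is hypothetical: the `DebugLayer` flag, the `draw_plan` method, and the layer-to-topology mapping are illustration only, and no actual GPU pipeline is created.

```python
from enum import Flag, auto


class DebugLayer(Flag):
    """What the debugger should draw; values can be combined with |."""
    VERTICES = auto()
    EDGES = auto()
    FACES = auto()


# Hypothetical mapping from layer to the primitive topology its pipeline would use.
TOPOLOGY_BY_LAYER = {
    DebugLayer.FACES: "triangle-list",
    DebugLayer.EDGES: "line-list",
    DebugLayer.VERTICES: "triangle-list",  # assumes vertices are drawn as small quads
}


class ShaderDebugger:
    def __init__(self, layers: DebugLayer) -> None:
        self._layers = layers

    def draw_plan(self) -> list[tuple[str, str]]:
        """Return (layer name, topology) pairs in draw order: faces, edges, vertices."""
        order = (DebugLayer.FACES, DebugLayer.EDGES, DebugLayer.VERTICES)
        return [
            (layer.name, TOPOLOGY_BY_LAYER[layer])
            for layer in order
            if layer in self._layers
        ]
```

For example, `ShaderDebugger(DebugLayer.EDGES | DebugLayer.VERTICES).draw_plan()` yields the edge layer followed by the vertex layer. Drawing faces first and vertices last keeps the smaller primitives visible on top.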

Vert Primitive Notes

The point-list primitive topology doesn't allow setting the size of the points. They're only one pixel wide. That said, perhaps I could dynamically create a quad at the vertex location. Geometry shaders are not possible in a WebGPU pipeline. (FYI, they're not supported by Metal and have developed a reputation for being a performance bottleneck.)

The classic workaround is instancing, with the per-instance data prepared on the CPU or in a compute shader. So creating a mesh that represents a vertex and then creating an instance of it at every vertex position is probably the correct approach. See gpuweb issues 1239 and 332 for discussion of this.
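
A minimal CPU-side sketch of the instancing approach follows. It only packs the bytes for the three buffers (one shared quad, one index list, one position per instance); creating the GPU buffers and issuing the indexed, instanced draw call is left out. The names `PointInstances` and `build_point_instances` are hypothetical, and the vertex shader is assumed to offset each quad corner from the instance position.

```python
import struct
from dataclasses import dataclass

# A unit quad centered on the origin, shared by every instance.
QUAD_CORNERS = [(-0.5, -0.5), (0.5, -0.5), (0.5, 0.5), (-0.5, 0.5)]
# Two triangles covering the quad.
QUAD_INDICES = [0, 1, 2, 0, 2, 3]


@dataclass
class PointInstances:
    quad_vertex_bytes: bytes   # 4 corners * 2 float32
    quad_index_bytes: bytes    # 6 uint32 indices
    instance_bytes: bytes      # 3 float32 per mesh vertex
    index_count: int
    instance_count: int


def build_point_instances(positions) -> PointInstances:
    """Pack a quad mesh plus one instance per (x, y, z) position."""
    flat_corners = [value for corner in QUAD_CORNERS for value in corner]
    return PointInstances(
        quad_vertex_bytes=struct.pack(f"<{len(flat_corners)}f", *flat_corners),
        quad_index_bytes=struct.pack(f"<{len(QUAD_INDICES)}I", *QUAD_INDICES),
        instance_bytes=b"".join(struct.pack("<3f", *p) for p in positions),
        index_count=len(QUAD_INDICES),
        instance_count=len(positions),
    )
```

The draw would then use `index_count` indices and `instance_count` instances, so the quad geometry is uploaded once no matter how many vertices the mesh has.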

Shader Notes

How does one enable using multiple shaders in a single render frame?

Related Resources

sholloway commented 11 months ago

Barycentric Coordinates


I've got drawing edges working with line lists. The next step is to incorporate barycentric coordinates to control rendering at the pixel level.

The basic concept is to expand the vertex buffer to include barycentric coordinates for each vertex. The vertex shader passes them to the fragment shader using the @interpolate attribute. The fragment shader then decides whether to draw a line fragment or a face fragment based on its barycentric location.
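
The vertex buffer expansion can be sketched on the CPU as below. This is an illustration under assumptions, not the project's code: the function names and the `edge_width` threshold are made up, and `is_edge_fragment` only mirrors in Python the kind of test the fragment shader would perform. Note that the mesh has to be un-indexed, because a vertex shared by several triangles may need a different barycentric value in each of them.

```python
# Each corner of a triangle gets one of these, so the interpolated value
# inside the triangle is the fragment's barycentric coordinate.
BARYCENTRIC_CORNERS = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))


def add_barycentric_coordinates(positions, triangle_indices):
    """Expand an indexed triangle list into (x, y, z, b0, b1, b2) vertices."""
    if len(triangle_indices) % 3 != 0:
        raise ValueError("triangle_indices must contain three indices per triangle")
    vertices = []
    for start in range(0, len(triangle_indices), 3):
        triangle = triangle_indices[start:start + 3]
        for corner, index in zip(BARYCENTRIC_CORNERS, triangle):
            vertices.append((*positions[index], *corner))
    return vertices


def is_edge_fragment(barycentric, edge_width=0.02):
    """A fragment lies near an edge when any barycentric component is close to zero."""
    return min(barycentric) < edge_width
```

A fragment at the center of a triangle has coordinates near (1/3, 1/3, 1/3) and is treated as a face fragment, while one with a component below `edge_width` is treated as a line fragment.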

Tasks

Related Resources