GPUOpen-Tools / GPU-Reshape

GPU Reshape (GRS) is an API & vendor agnostic instrumentation framework, with instruction level validation.

No rendering at all when the app is launched with Reshape #49

Closed Cyphall closed 8 months ago

Cyphall commented 8 months ago

Hi,

When I run my engine with Reshape, there is no rendering at all inside the window. It is completely black. Also, my app freezes when trying to close the window.

Everything works fine if it is run directly without Reshape.

I tried forcing a standard 8-bit swapchain format instead of the 10-bit one but that did not change anything.

You can find the engine at https://github.com/Cyphall/Cyph3D. Note that the Debug configuration requires Validation Layers to be available. Since Reshape does not expose them, only the Release configuration is usable with it.

miguel-petersen commented 8 months ago

Hi Cyphall,

Thanks for the repro case! During development we've fixed numerous issues like that, so I'm sure we can find a fix for you. 🙂

It should be possible to enable the Vulkan validation layers, how are you typically enabling them?

Before I get a chance to head into the sources, is there anything particular about your rendering setup? Such as presenting from async compute, bindless vertex fetching?

Cyphall commented 8 months ago

> It should be possible to enable the Vulkan validation layers, how are you typically enabling them?

In the Debug configuration, I enable them through code by passing "VK_LAYER_KHRONOS_validation" to VkInstanceCreateInfo::ppEnabledLayerNames, but I first check if they are available by checking if one of the layers returned by vkEnumerateInstanceLayerProperties corresponds to this identifier. If not, I exit with an error message (this is what I got with Reshape).
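
For readers following along, the availability check described above can be sketched roughly as follows. This is a minimal illustration, not the engine's actual code: `isLayerAvailable` and `kValidationLayer` are hypothetical names, and the list of layer names is passed in as plain strings so the sketch compiles without the Vulkan headers. In a real application the names would come from the `layerName` fields returned by vkEnumerateInstanceLayerProperties.

```cpp
#include <algorithm>
#include <string>
#include <string_view>
#include <vector>

// Identifier the Debug configuration asks for (taken from the comment above).
constexpr std::string_view kValidationLayer = "VK_LAYER_KHRONOS_validation";

// Hypothetical helper: returns true if `wanted` is among the layer names the
// loader reported. `available` stands in for the names obtained from
// vkEnumerateInstanceLayerProperties.
bool isLayerAvailable(const std::vector<std::string>& available,
                      std::string_view wanted) {
    return std::any_of(available.begin(), available.end(),
                       [wanted](const std::string& name) {
                           return std::string_view(name) == wanted;
                       });
}
```

If the helper returns false, the application can report the missing layer and exit, which matches the behaviour described in this comment.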

> Before I get a chance to head into the sources, is there anything particular about your rendering setup? Such as presenting from async compute, bindless vertex fetching?

This is mostly a standard setup, with some exceptions:

miguel-petersen commented 8 months ago

Deleted the message, it was incorrect. I am taking a look at your issue.

Cyphall commented 8 months ago

Now that you mentioned it, I forgot to add that I'm using Dynamic Rendering for all my render passes (with MSAA x4 for the viewport window).

miguel-petersen commented 8 months ago

A hook for vkQueueSubmit2 was missing, currently dealing with a crash in the descriptor management.

miguel-petersen commented 8 months ago

Oh, forgot to tag the CL. I was indexing variable length descriptors wrong. https://github.com/GPUOpen-Tools/GPU-Reshape/commit/9beb7f3f8876c663edf0da684c8db6980e7d202e

Seems something with push descriptors isn't working, dealing with that now.

miguel-petersen commented 8 months ago

So, it seems the push descriptor issue only happens with the validation layers enabled. If I disable them, it seems to run and instrument just fine. To test it, I purposefully read outside the bounds of a texture (see the underlined ivec2(24, 24) below).

Cool engine btw!

image

(It's not reading back the resource information correctly though, will need to investigate)

Cyphall commented 8 months ago

> Cool engine btw!

Thank you! I'm happy to help with a new test case for your tool 😄

miguel-petersen commented 8 months ago

Thanks!

Would it be possible to build the branch issue/49-vkQueueSubmit2 from source? Very curious if it fixes your crashes.

Cyphall commented 8 months ago

I'm having issues building the branch:

[...]
C:\Projects\CPP\GPU-Reshape\bin\ThinX86>setlocal

C:\Projects\CPP\GPU-Reshape\bin\ThinX86>call "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvars32.bat"
**********************************************************************
** Visual Studio 2022 Developer Command Prompt v17.8.5
** Copyright (c) 2022 Microsoft Corporation
**********************************************************************
[vcvarsall.bat] Environment initialized for: 'x86'
CMake Warning (dev) at CMakeLists.txt:27 (project):
  cmake_minimum_required() should be called prior to this top-level project()
  call.  Please see the cmake-commands(7) manual for usage documentation of
  both commands.
This warning is for project developers.  Use -Wno-dev to suppress it.

-- The C compiler identification is MSVC 19.38.33134.0
-- The CXX compiler identification is MSVC 19.38.33134.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - failed
-- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/Hostx64/x64/cl.exe
-- Check for working C compiler: C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/Hostx64/x64/cl.exe - broken
CMake Error at C:/Program Files/Microsoft Visual Studio/2022/Community/Common7/IDE/CommonExtensions/Microsoft/CMake/CMake/share/cmake-3.27/Modules/CMakeTestCCompiler.cmake:67 (message):
  The C compiler

    "C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.38.33130/bin/Hostx64/x64/cl.exe"

  is not able to compile a simple test program.

  It fails with the following output:

    Change Dir: 'C:/Projects/CPP/GPU-Reshape/bin/ThinX86/CMakeFiles/CMakeScratch/TryCompile-3tyajp'

    Run Build Command(s): C:/PROGRA~1/MICROS~2/2022/COMMUN~1/Common7/IDE/COMMON~1/MICROS~1/CMake/Ninja/ninja.exe -v cmTC_1804f
    [1/2] C:\PROGRA~1\MICROS~2\2022\COMMUN~1\VC\Tools\MSVC\1438~1.331\bin\Hostx64\x64\cl.exe  /nologo   -m32  /DWIN32 /D_WINDOWS /W3  /MDd /Zi /Ob0 /Od /RTC1 /showIncludes /FoCMakeFiles\cmTC_1804f.dir\testCCompiler.c.obj /FdCMakeFiles\cmTC_1804f.dir\ /FS -c C:\Projects\CPP\GPU-Reshape\bin\ThinX86\CMakeFiles\CMakeScratch\TryCompile-3tyajp\testCCompiler.c
    cl : Command line warning D9002 : ignoring unknown option '-m32'
    [2/2] cmd.exe /C "cd . && "C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe" -E vs_link_exe --intdir=CMakeFiles\cmTC_1804f.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\x86\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\x86\mt.exe --manifests  -- C:\PROGRA~1\MICROS~2\2022\COMMUN~1\VC\Tools\MSVC\1438~1.331\bin\Hostx64\x64\link.exe /nologo CMakeFiles\cmTC_1804f.dir\testCCompiler.c.obj  /out:cmTC_1804f.exe /implib:cmTC_1804f.lib /pdb:cmTC_1804f.pdb /version:0.0 /machine:x64  /debug /INCREMENTAL /subsystem:console  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && cd ."
    FAILED: cmTC_1804f.exe
    cmd.exe /C "cd . && "C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe" -E vs_link_exe --intdir=CMakeFiles\cmTC_1804f.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\x86\rc.exe --mt=C:\PROGRA~2\WI3CF2~1\10\bin\100226~1.0\x86\mt.exe --manifests  -- C:\PROGRA~1\MICROS~2\2022\COMMUN~1\VC\Tools\MSVC\1438~1.331\bin\Hostx64\x64\link.exe /nologo CMakeFiles\cmTC_1804f.dir\testCCompiler.c.obj  /out:cmTC_1804f.exe /implib:cmTC_1804f.lib /pdb:cmTC_1804f.pdb /version:0.0 /machine:x64  /debug /INCREMENTAL /subsystem:console  kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib && cd ."
    LINK Pass 1: command "C:\PROGRA~1\MICROS~2\2022\COMMUN~1\VC\Tools\MSVC\1438~1.331\bin\Hostx64\x64\link.exe /nologo CMakeFiles\cmTC_1804f.dir\testCCompiler.c.obj /out:cmTC_1804f.exe /implib:cmTC_1804f.lib /pdb:cmTC_1804f.pdb /version:0.0 /machine:x64 /debug /INCREMENTAL /subsystem:console kernel32.lib user32.lib gdi32.lib winspool.lib shell32.lib ole32.lib oleaut32.lib uuid.lib comdlg32.lib advapi32.lib /MANIFEST /MANIFESTFILE:CMakeFiles\cmTC_1804f.dir/intermediate.manifest CMakeFiles\cmTC_1804f.dir/manifest.res" failed (exit code 1120) with the following output:
    testCCompiler.c.obj : error LNK2001: unresolved external symbol _RTC_InitBase
    testCCompiler.c.obj : error LNK2001: unresolved external symbol _RTC_Shutdown
    LINK : error LNK2001: unresolved external symbol mainCRTStartup
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\kernel32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\user32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\gdi32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\winspool.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\shell32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\ole32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\oleaut32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\uuid.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\comdlg32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\advapi32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\lib\x86\MSVCRTD.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    cmTC_1804f.exe : fatal error LNK1120: 3 unresolved externals
    ninja: build stopped: subcommand failed.
[...]

From a quick look, it seems that at some point you are resetting the command prompt with the 32-bit version of the Visual Studio bat file:

C:\Projects\CPP\GPU-Reshape\bin\ThinX86>call "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Auxiliary\Build\vcvars32.bat"
**********************************************************************
** Visual Studio 2022 Developer Command Prompt v17.8.5
** Copyright (c) 2022 Microsoft Corporation
**********************************************************************
[vcvarsall.bat] Environment initialized for: 'x86'

which seems to break the compilation process, as everything until this point was set up and built for 64-bit:

    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\kernel32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\user32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\gdi32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\winspool.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\shell32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\ole32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\oleaut32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\uuid.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\comdlg32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files (x86)\Windows Kits\10\\lib\10.0.22621.0\\um\x86\advapi32.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
    C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.38.33130\lib\x86\MSVCRTD.lib : warning LNK4272: library machine type 'x86' conflicts with target machine type 'x64'
miguel-petersen commented 8 months ago

Strange. For the time being, could you configure cmake with "-DENABLE_X86_BOOTSTRAPPER=OFF", which disables the 32 bit bootstrapper for DX12.

Cyphall commented 8 months ago

I was initially trying to build with CMake CLI only, without using the VisualStudio2022.bat file. Now that I use it and build with Visual Studio, it is building correctly.

My engine seems to be working now, for the most part:

miguel-petersen commented 8 months ago

I see, I'll see if I can figure out what's happening with building from cmake.

On the initialization errors, some false positives are known with that feature, though it passes for most of our samples. Could you describe how you initialize, for example, your material texture resources?

On the latter, I'll investigate.

Cyphall commented 8 months ago

The texture assets are loaded in this function.

It is a pretty standard texture loading code:

  1. Create destination image
  2. Create device-local & host-visible staging buffer
  3. Copy texture data to the staging buffer
  4. Copy the staging buffer to the texture using the transfer queue
  5. Transfer ownership of the image from the transfer queue to the graphics queue

After that, the image descriptor is copied to the BindlessTextureManager. This class manages a "rotating" descriptor set (basically an array of N identical descriptor sets, with N = the number of frames in flight). Each change (addition or deletion of textures) is instantly applied to the current descriptor set and backported to the other ones when their corresponding frame starts, to make sure we don't modify a descriptor set that may currently be in use by the GPU.
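
As a rough illustration of the "rotating" scheme described above, here is a small CPU-side sketch. It is not the engine's BindlessTextureManager: `RotatingTextureTable`, `Change`, and the use of plain integers instead of real descriptors are all assumptions made for the example, and it assumes the caller only calls `beginNextFrame` once the GPU has finished with the set being rotated in (for example after waiting on that frame's fence).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for a descriptor write: slot -> texture id (0 = empty).
struct Change {
    std::uint32_t slot;
    std::uint32_t textureId;
};

class RotatingTextureTable {
public:
    explicit RotatingTextureTable(std::size_t framesInFlight)
        : sets_(framesInFlight), pending_(framesInFlight) {
        assert(framesInFlight > 0);
    }

    // Apply the change to the current set right away and queue it for the
    // other sets, which may still be in use by the GPU.
    void setTexture(std::uint32_t slot, std::uint32_t textureId) {
        sets_[current_][slot] = textureId;
        for (std::size_t i = 0; i < sets_.size(); ++i) {
            if (i != current_) {
                pending_[i].push_back({slot, textureId});
            }
        }
    }

    // Rotate to the next set and replay, in order, the changes it missed.
    void beginNextFrame() {
        current_ = (current_ + 1) % sets_.size();
        for (const Change& c : pending_[current_]) {
            sets_[current_][c.slot] = c.textureId;
        }
        pending_[current_].clear();
    }

    // Returns the texture id stored in `slot` of set `setIndex`, or 0 if empty.
    std::uint32_t lookup(std::size_t setIndex, std::uint32_t slot) const {
        auto it = sets_[setIndex].find(slot);
        return it == sets_[setIndex].end() ? 0u : it->second;
    }

    std::size_t currentSet() const { return current_; }

private:
    std::vector<std::map<std::uint32_t, std::uint32_t>> sets_;
    std::vector<std::vector<Change>> pending_;
    std::size_t current_ = 0;
};
```

The point of the per-set pending queue is that a set is only written while it is the current one, so a set that a previous frame may still be reading is never modified.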

miguel-petersen commented 8 months ago

Hmm, everything in that function should be tracked. I haven't tried reproducing just yet, but will either today or tomorrow.

Just to confirm, are you enabling Initialization checking from launch? Or after attaching? The latter, which should produce a warning dialog, may produce false positives.

Cyphall commented 8 months ago

Yes I did test with every check enabled

miguel-petersen commented 8 months ago

Oh sorry, I meant that there are two ways to instrument from Reshape. Either the application is launched from the toolkit, or you attach after launching your application (with discovery enabled) and then instrument.

Which of the two are you using?

Cyphall commented 8 months ago

Ah sorry, I misread your question... 😅

I'm launching the app directly through Reshape.

miguel-petersen commented 8 months ago

Gotcha, I'll reproduce on my end and see what's happening 🙂

miguel-petersen commented 8 months ago

Fixed some issues with resolve targets in: https://github.com/GPUOpen-Tools/GPU-Reshape/commit/ca18dd39eb87d3682f0f01e02c20e9a46a5cda2e

Currently tracking down some issues with descriptor management, I think it's on my side.

Cyphall commented 8 months ago

I still get an "Uninitialized resource read" in exposure.frag.

I tried to see what could cause this issue, and I think it may be caused by the fact that the dynamic render pass in my skybox pass, which is responsible for the color resolve, has no draw call when no skybox is loaded. In this case, I have an "empty" render pass that still performs the resolve operation.

Also, while investigating this, I found another graphical error. The HDR skyboxes processed by the EquirectangularSkyboxProcessor only have one face; all the others are black. The issue is at processing time and not at upload time, since launching my app directly but with the same asset cache generated with Reshape still shows only one face. The issue probably happens somewhere in EquirectangularSkyboxProcessor::generateCubemap, which converts the equirectangular image into a cubemap.

miguel-petersen commented 8 months ago

The uninitialized error is somehow related to push descriptor usage, trying to track down what's going wrong. The resolve operation should mark the resource as initialized, but there's some internal state management going wrong when reading, causing it to read the wrong metadata.

I'll take a look at the cubemap issue!

miguel-petersen commented 8 months ago

Fixed the push descriptor issue, swapped dest / source operands on a memcpy. Working on an imgui related initialization error.
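
For context on this class of bug: `std::memcpy` takes the destination first and the source second, so swapping the two silently copies in the wrong direction. The snippet below is only a generic sketch of a guard against that and is not Reshape's code; `copyDescriptorData` and the 4-word payload are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical fixed-size descriptor payload used only for this example.
using DescriptorWords = std::array<std::uint32_t, 4>;

// Copies `src` into `dst`. Taking the source by const reference means that
// accidentally passing it as the memcpy destination would not compile.
void copyDescriptorData(DescriptorWords& dst, const DescriptorWords& src) {
    // Signature reminder: std::memcpy(destination, source, byteCount).
    std::memcpy(dst.data(), src.data(), sizeof(std::uint32_t) * src.size());
}
```

Wrapping raw copies in a small typed helper like this is one way to let the compiler catch swapped operands.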

miguel-petersen commented 8 months ago

Was missing a couple of hooks. I no longer get initialization errors on my end.

Next up is the cubemap issue, then the path tracing issue.

miguel-petersen commented 8 months ago

On the cubemap issue, could you describe how you confirm the issue is present?

With Reshape attached, and a cleared /cache directory, I don't see issues in the background cubemap.

Cyphall commented 8 months ago

When you start the engine, you get an empty scene. From there, you need to load an equirectangular HDR skybox (you can use "desert.c3dskybox" located in "/skyboxes/desert/"). The two other default skyboxes ("space" and "space2") won't show the issue, as they are classic 6-sided SDR skyboxes. Simply drag & drop this file into the "Skybox" field of the top-right window.

miguel-petersen commented 8 months ago

On my side I believe the cubemap is generated correctly, strange. What card and driver are you on?

image

Cyphall commented 8 months ago

I'm on a Nvidia GeForce RTX 3070 with driver 546.65

This is how it appears on my end:

image

Only the face on the "right" (X+) is generated. (I slightly rotated the camera to the right, otherwise the face is not visible.)

Also, here are the settings I use:

image

miguel-petersen commented 8 months ago

I tried reproducing with the All workspace and everything enabled, the faces still look fine.

Right now I'm on an AMD card, but I'll switch and see if it's something specific to NV cards.

Cyphall commented 8 months ago

I pulled your two commits from yesterday and no longer see the issue, so it must have been solved in one of them!

miguel-petersen commented 8 months ago

Oh! I'm not really sure how, but happy that it did!

miguel-petersen commented 8 months ago

Couple of fixes, path tracing works on my end now.

image

Cyphall commented 8 months ago

It works on my end too, I think that was the last critical issue!

The only issues left are a few false positives:

miguel-petersen commented 8 months ago

The first issue is "expected", as all raytracing is currently pass-through. If any resource is initialized in a raytracing shader, Reshape won't have visibility on it just yet. Raytracing support (with mesh shaders) is coming of course!

On the latter, yes, currently concurrency doesn't track at a per-subresource level. So it incorrectly marks it as a race condition. Will use this issue to track it.
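
To make the false positive concrete, here is a toy comparison of tracking writers per resource versus per subresource. It only illustrates the general idea and says nothing about how Reshape implements its concurrency feature; `Subresource`, `PerResourceTracker`, and `PerSubresourceTracker` are hypothetical names.

```cpp
#include <cstdint>
#include <set>
#include <tuple>

// Hypothetical identifier for one mip level / array layer of a resource.
struct Subresource {
    std::uint32_t resource;
    std::uint32_t mip;
    std::uint32_t layer;
};

// Coarse tracking: any second writer to the same resource counts as a conflict.
class PerResourceTracker {
public:
    // Returns true if a conflicting writer is already registered.
    bool beginWrite(const Subresource& s) {
        return !writers_.insert(s.resource).second;
    }

private:
    std::set<std::uint32_t> writers_;
};

// Fine tracking: only writers to the same mip and layer conflict.
class PerSubresourceTracker {
public:
    // Returns true if a conflicting writer is already registered.
    bool beginWrite(const Subresource& s) {
        return !writers_.insert(std::make_tuple(s.resource, s.mip, s.layer)).second;
    }

private:
    std::set<std::tuple<std::uint32_t, std::uint32_t, std::uint32_t>> writers_;
};
```

With two writers targeting different mips of the same image, the coarse tracker reports a conflict while the fine one does not, which is the kind of false positive discussed here.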

Cyphall commented 8 months ago

Thank you for all the work done fixing these issues 😃

miguel-petersen commented 8 months ago

Ignore those commits, dealing with some rebasing shenanigans.

miguel-petersen commented 8 months ago

@Cyphall On second thought, would you mind opening a separate issue for the concurrency false positive?

I'd like to close this one as it's become quite big. If it works for you.

Cyphall commented 8 months ago

Done, you can close this issue when the commits are merged.

miguel-petersen commented 8 months ago

Thanks!

All changes have been merged to the development branch. It'll be merged to mainline on the next release. https://github.com/GPUOpen-Tools/GPU-Reshape/tree/development