LunarG / gfxreconstruct

Graphics API Capture and Replay Tools for Reconstructing Graphics Application Behavior
https://vulkan.lunarg.com/doc/sdk/latest/linux/capture_tools.html
MIT License

feature request: a dedicated frame trimming tool #1109

Open zmike opened 1 year ago

zmike commented 1 year ago

gfxreconstruct can already limit a capture to a selection of frames at capture time. What it really needs is the ability to trim existing captures, so that, e.g., user-submitted traces that come with bug reports can be pruned to a frame that actually displays the issue. apitrace has this in the form of gltrim.

panos-lunarg commented 1 year ago

You can recapture while replaying a trace, this time specifying the frame range you want to focus on. Do you think that would achieve what you describe?

zmike commented 1 year ago

I actually tried this, but adding gfxr into the loader during replay ended up breaking the replay and it wouldn't start up. I can get more info in a bit.

It's still a bit clunky compared to having a utility specifically for this purpose, especially considering most people are used to the facilities provided by apitrace in this area.

panos-lunarg commented 1 year ago

> I actually tried this, but adding gfxr into the loader during replay ended up breaking the replay and it wouldn't start up. I can get more info in a bit.

IIRC we exercise this in our CI, so it should work. If it doesn't, then maybe we have a bug somewhere.

bradgrantham-lunarg commented 1 year ago

@zmike, thanks for the feature request!

We frequently use "recapture" (replay with trimmed capture enabled) to trim to a frame or a frame range. We also have test cases for it in our internal CI. If it didn't work for you, there may be additional options required, or it may just be a bug. Please let us know the details.

It's true that requiring a live Vulkan device is potentially more awkward than using an offline trimming tool, but an offline trimming tool is not high on our priority list.

On desktop this should be relatively simple. To trim to frame M, it should look something like:

gfxrecon-capture-vulkan.py --capture-file trimmed-capture.gfxr --capture-frames M gfxrecon-replay your-original-capture.gfxr

(Of course replace those .gfxr files with names of your choosing.) You may also find that running gfxrecon.py optimize trimmed-capture.gfxr trimmed-optimized.gfxr reduces file size and replay time for the resulting trimmed capture file.
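As a rough illustration of the recapture workflow described above, the following Python sketch only assembles the two command lines; it does not run them. The function names are hypothetical helpers, not part of gfxreconstruct, and the only flags used are the ones quoted in this comment. Logs elsewhere in this thread suggest the capture layer writes to a name with a `_frame_<N>_<timestamp>` suffix rather than the exact `--capture-file` value, so the optimize step here takes the path that was actually produced.

```python
import shlex


def recapture_command(original, trimmed, frames):
    """Build the replay-with-capture command quoted in this thread.

    `frames` is passed through as text, e.g. 501.
    """
    return [
        "gfxrecon-capture-vulkan.py",
        "--capture-file", str(trimmed),
        "--capture-frames", str(frames),
        "gfxrecon-replay", str(original),
    ]


def optimize_command(produced, optimized):
    """Build the optional optimize step for the file the recapture produced."""
    return ["gfxrecon.py", "optimize", str(produced), str(optimized)]


def to_shell(command):
    """Render a command list as a shell line, quoting paths with spaces."""
    return shlex.join(command)
```

Printing `to_shell(recapture_command(...))` gives a line that can be pasted into a terminal, with paths such as `dota 2 beta/...` quoted safely.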

Are you capturing on Android? I agree that's more awkward and we could probably improve that use case.

zmike commented 1 year ago

I've just re-tested here (6cc3efc5d7e3ed7104769821689b76bdf613b98a) on my desktop with RADV, and the result is the same: this functionality doesn't appear to work.

I have two cases. In the first, I captured (Dota2) normally. Upon trying to do a recapture trim, I get this output:

$ gfxrecon-capture-vulkan.py --capture-file trimmed-capture.gfxr --capture-frames 6001 gfxrecon-replay ./bin/linuxsteamrt64/gfxrecon_capture_20230501T120723.gfxr
Executing program /usr/local/bin/gfxrecon-replay
Errors:

Output:
 [gfxrecon] WARNING - Replay tool has detected that the capture layer is enabled
[gfxrecon] INFO - Settings Loader: Found option "GFXRECON_CAPTURE_FILE" with value "/home/zmike/.local/share/Steam/steamapps/common/dota 2 beta/game/trimmed-capture.gfxr"
[gfxrecon] INFO - Settings Loader: Found option "GFXRECON_CAPTURE_FRAMES" with value "6001"
[gfxrecon] INFO - Initializing GFXReconstruct capture layer
[gfxrecon] INFO -   GFXReconstruct Version 0.9.20-dev (dev:6cc3efc*)
[gfxrecon] INFO -   Vulkan Header Version 1.3.246
[gfxrecon] WARNING - API call vkCreateGraphicsPipelines returned value VK_SUCCESS that does not match return value from capture file: VK_PIPELINE_COMPILE_REQUIRED.
[gfxrecon] WARNING - API call vkCreateGraphicsPipelines returned value VK_SUCCESS that does not match return value from capture file: VK_PIPELINE_COMPILE_REQUIRED.
[gfxrecon] WARNING - API call vkCreateGraphicsPipelines returned value VK_SUCCESS that does not match return value from capture file: VK_PIPELINE_COMPILE_REQUIRED.
[gfxrecon] WARNING - API call vkCreateGraphicsPipelines returned value VK_SUCCESS that does not match return value from capture file: VK_PIPELINE_COMPILE_REQUIRED.
[gfxrecon] WARNING - API call vkCreateGraphicsPipelines returned value VK_SUCCESS that does not match return value from capture file: VK_PIPELINE_COMPILE_REQUIRED.
[gfxrecon] WARNING - API call vkCreateGraphicsPipelines returned value VK_SUCCESS that does not match return value from capture file: VK_PIPELINE_COMPILE_REQUIRED.
[gfxrecon] WARNING - API call vkCreateGraphicsPipelines returned value VK_SUCCESS that does not match return value from capture file: VK_PIPELINE_COMPILE_REQUIRED.
[gfxrecon] WARNING - API call vkCreateGraphicsPipelines returned value VK_SUCCESS that does not match return value from capture file: VK_PIPELINE_COMPILE_REQUIRED.
[gfxrecon] WARNING - Image bound to device memory at an offset which is not page aligned. Corruption might occur. In that case set Page Guard Align Buffer Sizes env variable to true.
[gfxrecon] WARNING - API call vkCreateGraphicsPipelines returned value VK_SUCCESS that does not match return value from capture file: VK_PIPELINE_COMPILE_REQUIRED.
[gfxrecon] WARNING - API call vkCreateGraphicsPipelines returned value VK_SUCCESS that does not match return value from capture file: VK_PIPELINE_COMPILE_REQUIRED.
[gfxrecon] WARNING - API call vkCreateGraphicsPipelines returned value VK_SUCCESS that does not match return value from capture file: VK_PIPELINE_COMPILE_REQUIRED.

This looks to me like https://github.com/LunarG/gfxreconstruct/issues/1080. Not a problem, but for whatever reason my trace is not trimmed.

Case 2: I disable shader caching in my driver for the initial capture and replay to ensure that there's no mismatch in shader compile vs cache returns.

$ gfxrecon-capture-vulkan.py --capture-file trimmed-capture.gfxr --capture-frames 501 gfxrecon-replay ./bin/linuxsteamrt64/gfxrecon_capture_20230501T120933.gfxr
Executing program /usr/local/bin/gfxrecon-replay
Errors:

Output:
 [gfxrecon] WARNING - Replay tool has detected that the capture layer is enabled
[gfxrecon] INFO - Settings Loader: Found option "GFXRECON_CAPTURE_FILE" with value "/home/zmike/.local/share/Steam/steamapps/common/dota 2 beta/game/trimmed-capture.gfxr"
[gfxrecon] INFO - Settings Loader: Found option "GFXRECON_CAPTURE_FRAMES" with value "501"
[gfxrecon] INFO - Initializing GFXReconstruct capture layer
[gfxrecon] INFO -   GFXReconstruct Version 0.9.20-dev (dev:6cc3efc*)
[gfxrecon] INFO -   Vulkan Header Version 1.3.246
[gfxrecon] WARNING - Image bound to device memory at an offset which is not page aligned. Corruption might occur. In that case set Page Guard Align Buffer Sizes env variable to true.

No errors are printed, and again my trace is not trimmed. Am I missing something?

bradgrantham-lunarg commented 1 year ago

@zmike when you say "my trace is not trimmed", can you give more detail? Is it empty? Does it not replay? What does gfxrecon.py info "/home/zmike/.local/share/Steam/steamapps/common/dota 2 beta/game/trimmed-capture.gfxr" output?

zmike commented 1 year ago

Sorry, I should have been more clear: no output file is created.
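A quick way to check whether a recapture produced anything is to look for files next to the requested name. The sketch below assumes the naming seen in the logs in this thread, where the layer inserts `_frame_<N>_<timestamp>` before the extension; `find_trimmed_outputs` is a hypothetical helper, not a gfxreconstruct tool.

```python
import glob
from pathlib import Path


def find_trimmed_outputs(requested):
    """List files matching <stem>_frame_*<suffix> beside the requested path."""
    requested = Path(requested)
    # Escape the stem so characters like '[' in a file name are taken literally.
    pattern = f"{glob.escape(requested.stem)}_frame_*{requested.suffix}"
    return sorted(requested.parent.glob(pattern))
```

An empty list after a recapture run matches the symptom reported here: the layer initialized but never started writing.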

bradgrantham-lunarg commented 1 year ago

Do you get a seg fault or anything? It looks like capture never starts writing the trimmed capture. Here's the output on my machine for a capture of vkcube. Note that it says "Recording graphics API capture" and then "Finished recording graphics API capture", just as a plain capture of a Vulkan app would.

[gfxrecon] WARNING - Replay tool has detected that the capture layer is enabled
[gfxrecon] WARNING - Skipping unrecognized meta-data block with type 18
[gfxrecon] INFO - Settings Loader: Found option "GFXRECON_CAPTURE_FILE" with value "F:\gfxreconstruct\trimmed-capture.gfxr"
[gfxrecon] INFO - Settings Loader: Found option "GFXRECON_CAPTURE_FRAMES" with value "1000"
[gfxrecon] INFO - Initializing GFXReconstruct capture layer
[gfxrecon] INFO -   GFXReconstruct Version 0.9.20-dev (test-SDK-integration:3a665ea)
[gfxrecon] INFO -   Vulkan Header Version 1.3.246
[gfxrecon] INFO - Recording graphics API capture to F:\gfxreconstruct\trimmed-capture_frame_1000_20230501T092429.gfxr
[gfxrecon] INFO - Finished recording graphics API capture
Total time: 0.805279 seconds
Replay FPS: 3123.141172 fps, 0.805279 seconds, 2515 frames, framerange 1-2515
Executing program C:\VulkanSDK\1.3.236.0\Bin\gfxrecon-replay.EXE
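The two marker lines in the output above can also be checked mechanically. This is a minimal sketch that scans saved replay output for them; it assumes the message wording stays exactly as shown in this thread, and `summarize_capture_log` is a hypothetical name.

```python
import re

# Message texts copied from the log output in this thread.
RECORDING_RE = re.compile(r"Recording graphics API capture to (?P<path>.+)$")
FINISHED_TEXT = "Finished recording graphics API capture"


def summarize_capture_log(log_text):
    """Return (output_path or None, finished flag) for one recapture run."""
    output_path = None
    finished = False
    for line in log_text.splitlines():
        match = RECORDING_RE.search(line.strip())
        if match:
            output_path = match.group("path")
        elif FINISHED_TEXT in line:
            finished = True
    return output_path, finished
```

A result of `(None, False)` means the layer loaded but the trimmed capture never started, which is what the failing runs earlier in the thread look like.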

panos-lunarg commented 1 year ago

> [gfxrecon] WARNING - Image bound to device memory at an offset which is not page aligned. Corruption might occur. In that case set Page Guard Align Buffer Sizes env variable to true.

Can you also set GFXRECON_PAGE_GUARD_ALIGN_BUFFER_SIZES to true? Don't think that's the problem here but just in case.
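For a scripted run, the variable can be added to a copy of the environment before launching the recapture. This is only a sketch: `with_align_buffer_sizes` is a hypothetical helper, and it assumes the capture layer picks the setting up from the environment of the process that launches the replay.

```python
import os


def with_align_buffer_sizes(base_env=None):
    """Return a copy of the environment with the page guard option enabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["GFXRECON_PAGE_GUARD_ALIGN_BUFFER_SIZES"] = "true"
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` when invoking the recapture command; the original environment is left unchanged.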

zmike commented 1 year ago

> Do you get a seg fault or anything?

It doesn't appear to be crashing, no.

> Can you also set GFXRECON_PAGE_GUARD_ALIGN_BUFFER_SIZES to true?

I've tried this as well but it seems to have no effect.

The trace replays fine when I'm not attempting to recapture. Has this ever been tested on Linux?

bradgrantham-lunarg commented 1 year ago

Yes, it is tested under Linux. Could you send the output of gfxrecon.py info of your original capture?

zmike commented 1 year ago

Exe info:
    Application exe name: 
    Application version: 0.0.0.0
    Application Company name: 
    Product name: 
File info:
    Compression format: LZ4
    Total frames: 845

Application info:
    Application name: dota
    Application version: 1
    Engine name: Source2
    Engine version: 1
    Target API version: 4198400 (1.1.0)

Physical device info:
    Device name: AMD Radeon RX 5700 XT (RADV NAVI10)
    Device ID: 0x731f
    Vendor ID: 0x1002
    Driver version: 96473187 (0x5c01063)
    API version: 4206838 (1.3.246)

Device memory allocation info:
    Total allocations: 296
    Min allocation size: 1048576
    Max allocation size: 12582928

Pipeline info:
    Total graphics pipelines: 27243
    Total compute pipelines: 7
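One sanity check this output allows is confirming that the requested trim frame exists in the capture at all. The sketch below reads the `Total frames:` line from saved `gfxrecon.py info` text; the helper names are hypothetical, and frame numbers are assumed to start at 1, as the `framerange 1-N` replay summary in this thread suggests.

```python
import re


def total_frames(info_text):
    """Extract the 'Total frames' count from info output, or None if absent."""
    match = re.search(r"^\s*Total frames:\s*(\d+)\s*$", info_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def frame_in_capture(info_text, frame):
    """True when `frame` lies within 1..Total frames."""
    total = total_frames(info_text)
    return total is not None and 1 <= frame <= total
```

For the 845-frame capture shown here, frame 501 is in range, so an out-of-range frame number would not explain the missing output in that case.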

bradgrantham-lunarg commented 1 year ago

Is that the capture file you tried to recapture with trimming using --capture-frames 501, or the one you used with --capture-frames 6001?

Could you try the recapture with --capture-frames 1 (that should succeed) and send the output of that?

zmike commented 1 year ago

Ah, I could've been more clear: that output was from the trace I used with --capture-frames 501.

--capture-frames 1 does indeed work:

Executing program /usr/local/bin/gfxrecon-replay
Output:
 [gfxrecon] WARNING - Replay tool has detected that the capture layer is enabled
[gfxrecon] INFO - Settings Loader: Found option "GFXRECON_CAPTURE_FILE" with value "/home/zmike/.local/share/Steam/steamapps/common/dota 2 beta/game/trimmed-capture.gfxr"
[gfxrecon] INFO - Settings Loader: Found option "GFXRECON_CAPTURE_FRAMES" with value "1"
[gfxrecon] INFO - Settings Loader: Found option "GFXRECON_PAGE_GUARD_ALIGN_BUFFER_SIZES" with value "true"
[gfxrecon] INFO - Initializing GFXReconstruct capture layer
[gfxrecon] INFO -   GFXReconstruct Version 0.9.20-dev (dev:6cc3efc*)
[gfxrecon] INFO - Recording graphics API capture to /home/zmike/.local/share/Steam/steamapps/common/dota 2 beta/game/trimmed-capture_frame_1_20230501T130056.gfxr
[gfxrecon] INFO -   Vulkan Header Version 1.3.246
[gfxrecon] INFO - Finished recording graphics API capture
Total time: 43.579288 seconds
Replay FPS: 19.389945 fps, 43.579288 seconds, 845 frames, framerange 1-845
[gfxrecon] WARNING - Leaked 61 VkImageView objects allocated from VkDevice ID 5 on exit
[gfxrecon] WARNING - Leaked 31 VkImage objects allocated from VkDevice ID 5 on exit

It seems any value greater than 1 does not work, however.

bradgrantham-lunarg commented 1 year ago

@zmike, based on our private conversation I think this was resolved by #1113. Could you let me know if this issue is resolved for you, please?

zmike commented 1 year ago

Hm, it wasn't really resolved, though. While it's true I can replay a trace through gfxr and capture again, I still think it would be valuable to have a trim utility that doesn't require replay.

I'm not looking for an immediate fix, but this still seems like a useful feature.

bradgrantham-lunarg commented 1 year ago

It is a useful feature, but at the moment we use live GPU playback to do trimming, so offline trimming is going to require some design work. Good point, and I'll leave this issue up marked with "enhancement" as a P1.