RobertBeckebans / RBDOOM-3-BFG

Doom 3 BFG Edition source port with updated DX12 / Vulkan renderer and modern game engine features
https://www.moddb.com/mods/rbdoom-3-bfg
GNU General Public License v3.0

macOS: Show MoltenVK submit, image acquire, and metal encode on Optick trace #871

Closed SRSaunders closed 1 month ago

SRSaunders commented 1 month ago

This PR is a macOS-only enhancement that shows MoltenVK's processing periods on the Optick trace. This information is useful because it exposes the waiting periods for: a) Vulkan command buffer async submit, b) Vulkan/Metal image acquire (related to image present), and c) Vulkan-to-Metal encoding prior to execution on the GPU.

This also adds a new capability for Optick: the ability to add custom tags to Optick custom storage. This enhancement has already been submitted to the Optick project at https://github.com/bombomby/optick/pull/196.

This also fixes a minor HUD bug that caused MoltenVK's encoding time to show 0 when Optick tracing is active.

UPDATE: Also fixes a new issue with CMake >= 3.29, which now supports Apple .xcframework bundles. This breaks find_package() for libMoltenVK.dylib with recent Vulkan SDK versions (1.3.275 and later). The solution is to set(CMAKE_FIND_FRAMEWORK LAST), which prioritizes dylibs over .framework and .xcframework bundles.
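
For illustration, the workaround could look roughly like the fragment below. This is a minimal sketch, not the PR's actual diff: it assumes MoltenVK is located through CMake's stock FindVulkan module with its MoltenVK component (available in CMake 3.24 and later), and the `if(APPLE)` guard and the status message are additions made only for this example.

```cmake
# Sketch only: assumes MoltenVK is found via CMake's FindVulkan module.
if(APPLE)
    # Search plain libraries (e.g. libMoltenVK.dylib) first and fall back
    # to .framework / .xcframework bundles only if no dylib is found.
    set(CMAKE_FIND_FRAMEWORK LAST)
endif()

# The MoltenVK component of FindVulkan is assumed here for illustration.
find_package(Vulkan REQUIRED COMPONENTS MoltenVK)

# Print the resolved path so it is easy to confirm the dylib was picked.
message(STATUS "MoltenVK library: ${Vulkan_MoltenVK_LIBRARY}")
```

CMAKE_FIND_FRAMEWORK must be set before the find_package() call, since it only affects searches that run after it.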

This PR is compatible with current Vulkan SDK and MoltenVK releases, but to see the additional Optick traces you will need MoltenVK 1.2.9 (current dev stream) where additional performance statistics are available to the application (see merged PR https://github.com/KhronosGroup/MoltenVK/pull/2183). This new Optick feature can be tested now using local builds of MoltenVK main, and will be generally available when using the next major release of the Vulkan SDK (i.e. the one after 1.3.280).

Here are some Optick traces that show the enhancement:

Vulkan command buffer submit scheduling wait (vsync off): [Screenshot 2024-03-23 at 11 04 30 AM]

Vulkan/Metal image acquire wait (vsync on), note relationship to Present timing: [Screenshot 2024-03-23 at 11 05 10 AM]

Vulkan-to-Metal encoding time (vsync off), note relationship to GPU execution: [Screenshot 2024-03-23 at 11 06 24 AM]

The above traces were captured with r_mvkSynchronousQueueSubmits set to false (the default) for maximum fps. If you set this cvar to true, latency is reduced at the expense of performance. This is roughly the macOS equivalent of setting the DX12 cvar r_maxFrameLatency = 1 in the proposed PR https://github.com/RobertBeckebans/RBDOOM-3-BFG/pull/784. Here is an example trace with r_mvkSynchronousQueueSubmits = true, where the time from queue submit to present is minimized:

[Screenshot 2024-03-23 at 11 46 46 AM]