Closed SRSaunders closed 6 months ago
I am not sure if you would consider this proposed performance tracker change to be general enough, but given it's simplicity I thought I would ask.
Yup. This looks good, and we can definitely add it.
Since you've already got it working, do you want to submit a PR?
If you do, I suggest moving previous
to between latest
and average
, to visually couple previous
to latest
, as a way to help others understand the structure.
Thanks very much and I'd be happy to submit a PR. Ironically, when I first made the change in my private build, I did place the previous
member between latest
and average
, but changed it later due to this comment:
* MoltenVK performance. You can retrieve a copy of this structure using the vkGetPerformanceStatisticsMVK() function.
*
* This structure may be extended as new features are added to MoltenVK. If you are linking to
* an implementation of MoltenVK that was compiled from a different MVK_PRIVATE_API_VERSION
* than your app was, the size of this structure in your app may be larger or smaller than the
* struct in MoltenVK. See the description of the vkGetPerformanceStatisticsMVK() function for
* information about how to handle this.
*
* TO SUPPORT DYNAMIC LINKING TO THIS STRUCTURE AS DESCRIBED ABOVE, THIS STRUCTURE SHOULD NOT
* BE CHANGED EXCEPT TO ADD ADDITIONAL MEMBERS ON THE END. EXISTING MEMBERS, AND THEIR ORDER,
* SHOULD NOT BE CHANGED.
Is this something you want to respect for the MVKPerformanceTracker
struct, which is then combined into other structs that eventually culminate in MVKPerformanceStatistics
? Or would you prefer the more logical grouped organization you suggest above?
In general, I wonder whether you should define an expected size in mvk_private_api.h for current and future versions of MVKPerformanceStatistics
to enable runtime checking as per above. And for each change in the structs, a new expected revision and size would be added to the header file. This is an approach that is sometimes used for system files that add additional fields over time (e.g. mach/task.h with TASK_VM_INFO_REVX_COUNT
). Or would you prefer to take a different approach, and instead encourage users to use MVK_PRIVATE_API_VERSION
to check for runtime compatibility regarding new MVKPerformanceStatistics
and related struct entries? Can MVK_PRIVATE_API_VERSION
be queried at runtime? If you pick the latter (and if runtime query support is available), then my proposed change should bump the MVK_PRIVATE_API_VERSION
. What approach would you like submitted with my PR?
due to this comment
Ah. Of course.
The uncapitalized section of that comment is valid, to avoid buffer-overflow types of crashes. But the capitalized content ("TO SUPPORT...
") is actually misguided, because these are nested structures, and the act of adding previous
is going to add content all throughout the middle of the overall MVKPerformanceStatistics
structure, regardless of where it appears in MVKPerformanceTracker
. That capitalized requirement has meaning for linear API structures (like MVKPhysicalDeviceMetalFeatures
or MVKConfiguration
).
So, please go ahead with adding previous
between latest
and average
, and also please remove the capitalized ("TO SUPPORT...
") comment for MVKPerformanceStatistics
.
Sorry for the confusion with that comment.
Thanks for accommodating this enhancement and merging my PR into main. I have tested it and all is working well.
I would like to make a small feature request for the MoltenVK performance tracker: Add a
.previous
field to theMVKPerformanceTracker
struct.I have run into a situation where I need both the previous and latest stats when submitting multiple (2 in my case) command buffers per frame: the first for game rendering, and the second for the embedded Optick profiler agent within RBDoom3BFG. When the profiler agent is active, it is the last command buffer processed for the frame and unfortunately this wipes out the more valuable
.latest
information for actual game rendering. I have been using the.latest
field for displaying MoltenVK's encoder time in an on-screen HUD when the profiler is inactive (i.e. only 1 command buffer per frame), but I would also like to show encode time in the profiler trace when the profiler agent is active. Unfortunately I cannot seem to capture the.latest
information for the rendering command buffer alone since both buffers sync on the same end-of-frame event.I have prototyped a simple potential solution as follows:
This gives a very nice result (see trace labeled _MetalEncode) in the Optick Profiler traces below. Note by doing this I have learned quite a bit about timing and duration of the encoding step, and how this influences frame rate. For instance, the two samples below are for an identical game scene, but the second is much more performant (~4x frame rate) due the application of Vulkan push constants vs. using constant buffers for shader parameters. Both encoding time and GPU rendering time are drastically reduced when using push constants vs. constant buffers - and having visibility in the Optick profiler makes that very clear.
I am not sure if you would consider this proposed performance tracker change to be general enough, but given it's simplicity I thought I would ask.
(FYI - the game with profiler agent is running on macOS, but the profiler display client is Windows-only, thus I am using Parallels for that)