Closed SRSaunders closed 3 months ago
Thanks for the hints re `VkPhysicalDeviceProperties::driverVersion` and `VkLayerProperties::implementationVersion` for retrieving the MoltenVK version at runtime.
However, after thinking about this problem a bit more, I will be reverting to the simple approach of guarding the new code at compile time using `MVK_MAKE_VERSION(1, 2, 9)`, and checking for `VK_SUCCESS` returned from `vkGetPerformanceStatisticsMVK()` at runtime. My reasoning is that, since the perf structure is not a linear list, the runtime and compile-time ABI versions (i.e. struct sizes) have to match exactly to get the correct offsets into the returned memory blob. Even if you are dynamically linking to a later version of MoltenVK (i.e. with `previous` defined), the struct sizes must still match exactly for the data to make sense. Rather than carry around multiple versioned copies of the performance struct in my code (each with different offsets), I plan to simply require a match. This is the downside of using a nested struct vs. a linear list, but that's a more general issue than I want to tackle here.
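To illustrate the offset problem described above, here is a minimal sketch using hypothetical stand-in types (`TrackerV1`, `TrackerV2`, `StatsV1`, `StatsV2` are invented for illustration and are not the real MoltenVK definitions; the position of `previous` within the tracker is also an assumption). Adding a member to a nested tracker shifts every tracker that follows it in the outer struct, so a caller compiled against one layout would read the wrong bytes from the other.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins, NOT the real MoltenVK structs.
struct TrackerV1 { uint32_t count; double latest; double average; };
struct TrackerV2 { uint32_t count; double latest; double previous; double average; };

// Outer structs that nest trackers, loosely modelled on a performance-statistics blob.
struct StatsV1 { TrackerV1 submit; TrackerV1 present; };
struct StatsV2 { TrackerV2 submit; TrackerV2 present; };

// Where the second tracker starts in each layout.
constexpr std::size_t kPresentOffsetV1 = offsetof(StatsV1, present);
constexpr std::size_t kPresentOffsetV2 = offsetof(StatsV2, present);

// The new nested member moves everything after the first tracker.
static_assert(kPresentOffsetV1 != kPresentOffsetV2,
              "nested growth shifts later members");

// A caller that requires an exact layout match can only compare sizes.
inline bool layoutMatches(std::size_t runtimeSize) {
    return runtimeSize == sizeof(StatsV2);
}
```

With a linear (append-only) struct, the old members would keep their offsets and only the tail would grow, which is why a size mismatch is tolerable there but not here.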
FYI, I looked at `mvkCopyGrowingStruct()` to confirm how it behaves, and unfortunately I think I have found an error in the code:
The current logic is: `return (*pCopySize == origSize) ? VK_SUCCESS : VK_INCOMPLETE;`
Shouldn't it instead be: `return (sizeof(S) == origSize) ? VK_SUCCESS : VK_INCOMPLETE;`
The original logic might be okay for linearly growing lists, but is not okay for nested structures like performance data. My only concern is that this function is also used for `MVKConfiguration` and `MVKPhysicalDeviceMetalFeatures`, which I think are both linear. While the current behaviour differs from the function's doc statement (_Returns `VK_SUCCESS` if the original value of `*pCopySize` is the same as the actual size of the struct, or `VK_INCOMPLETE` otherwise_), I don't want to break something unintentionally. Please advise.
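The difference between the two return expressions can be seen in a simplified model. This sketch is not the MoltenVK source: `copyGrowingStruct`, `Result`, `Stats`, and the `strict` flag are hypothetical names, and it assumes the function clamps the copy to the smaller of the caller's size and `sizeof(S)` before evaluating the return expression.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for VK_SUCCESS / VK_INCOMPLETE so the sketch builds without vulkan.h.
enum class Result { Success, Incomplete };

// Simplified model (assumption): copy min(caller size, library size) bytes
// and report the number of bytes copied back through pCopySize.
template <typename S>
Result copyGrowingStruct(S* pDst, const S* pSrc, std::size_t* pCopySize, bool strict) {
    const std::size_t origSize = *pCopySize;
    *pCopySize = std::min(origSize, sizeof(S));
    std::memcpy(pDst, pSrc, *pCopySize);
    const bool ok = strict ? (sizeof(S) == origSize)     // proposed check
                           : (*pCopySize == origSize);   // current check
    return ok ? Result::Success : Result::Incomplete;
}

// Hypothetical library-side struct; an "older" caller knows only the first two members.
struct Stats { double a; double b; double c; };
```

Under these assumptions, a caller passing a smaller size than the library's struct gets a success result from the current check (everything it asked for was copied) but an incomplete result from the proposed check (the sizes differ). A caller passing a larger size gets an incomplete result either way.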
If you agree with this change, I will submit commits for this as well as bumping `MVK_PRIVATE_API_VERSION`.
I decided to make the proposed changes above and @billhollings can review and comment as needed.
FYI - I have one more thing I would potentially like to add to this - a new `MVKQueuePerformance` entry that tracks async encoding start delay if the thread is already busy encoding the previous frame. I need to research this a bit more before confirming and will come back with an answer asap. So no rush on the merge.
I have completed my research and have submitted an additional commit that proposes and implements two new performance counters in `MVKQueuePerformance` for measuring asynchronous queue submit wait times: `waitSubmitCommandBuffers` and `waitPresentSwapchains`. These new performance counters are useful when `MVK_CONFIG_SYNCHRONOUS_QUEUE_SUBMITS` is disabled and queue submits are asynchronous, since there are delays between the initial calls to `vkQueueSubmit()`/`vkQueuePresentKHR()` and the actual start of operations.
For instance, `waitSubmitCommandBuffers` allows you to determine the delay time before the start of command buffer encoding if the thread is already busy encoding the previous frame. And `waitPresentSwapchains` allows you to predict the delay between calling `vkQueuePresentKHR()` and the actual swapchain image showing up on the screen - e.g. `waitPresentSwapchains` + `presentSwapchains`.
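As a rough sketch of the sum mentioned above, the following uses hypothetical stand-in types (`Tracker` and `QueuePerf` are invented here and carry only the members needed for the example; the real MoltenVK structs have more). It assumes both counters report their latest value in the same time unit.

```cpp
#include <cassert>

// Hypothetical stand-ins, NOT the real MoltenVK structs.
struct Tracker { double latest = 0.0; };

struct QueuePerf {
    Tracker waitPresentSwapchains;  // delay before the present operation starts
    Tracker presentSwapchains;      // duration of the present operation itself
};

// Estimated delay between the present call and the image reaching the screen,
// in whatever time unit the trackers use.
inline double estimatedPresentLatency(const QueuePerf& q) {
    return q.waitPresentSwapchains.latest + q.presentSwapchains.latest;
}
```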
Here are screen grabs of the embedded Optick profiler running inside RBDoom3BFG with MoltenVK. You can see the Submit_Wait times (corresponds to `waitSubmitCommandBuffers`), Acquire_Wait times (corresponds to `retrieveCAMetalDrawable`), and Present_Wait times (corresponds to `waitPresentSwapchains`) and their relationship to command buffer encoding and execution on the GPU.
I have now completed my work on this PR and it is ready for review.
Addresses #2177.
This small PR adds a `previous` member to the `MVKPerformanceTracker` struct and saves the former `latest` value before updating `latest` within `updateActivityPerformance()`. It also adds `previous` to `MVKLogInfo`.
Note that I did not change the `MVK_PRIVATE_API_VERSION`.
I still have a question about how to determine version information at runtime. Now that `vkGetVersionStringsMVK()` is deprecated, what is the official means of determining MoltenVK's version or the `MVKPerformanceStatistics` version at runtime? This is to handle cases when dynamic linking is used and one needs to determine if the new field is present.
Is the only mechanism to use the `VK_SUCCESS` or `VK_INCOMPLETE` return code from `vkGetPerformanceStatisticsMVK()`? This approach seems limiting since all you get is a success or failure indication and not version information. For instance, if you always want to use `latest`, and optionally use `previous` (if available), how would you go about determining that at runtime?
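For reference, the `previous` bookkeeping described in this PR can be sketched roughly as follows. This is not the actual `updateActivityPerformance()` code: `Tracker` and `updateTracker` are hypothetical names, and every member other than `latest` and `previous`, along with the running-average formula, is an assumption made to keep the example complete.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical tracker; only `latest` and `previous` come from the PR description.
struct Tracker {
    uint32_t count = 0;
    double latest = 0.0;
    double previous = 0.0;
    double average = 0.0;
    double minimum = 0.0;
    double maximum = 0.0;
};

// Record one new duration sample.
inline void updateTracker(Tracker& t, double duration) {
    t.previous = t.latest;  // save the former latest value before overwriting it
    t.latest = duration;
    t.minimum = (t.count == 0) ? duration : std::min(t.minimum, duration);
    t.maximum = std::max(t.maximum, duration);
    t.average = (t.average * t.count + duration) / (t.count + 1);  // running mean
    t.count++;
}
```

The key point is only the ordering: `previous` must be assigned before `latest` is overwritten, otherwise both end up holding the new sample.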