Open RenfengLiu opened 3 years ago
There should be a call to GpaBeginCommandList(vk_primary_cmd_buffer, gpa_primary_cmd_buffer), then the call to vkCmdExecuteCommands, and then GpaCopySecondarySamples(gpa_secondary_cmd_buffer, gpa_primary_cmd_buffer, <num samples>, <new sample IDs>) afterwards. This ensures the sample results are copied to a secondary result buffer so they are not overwritten by subsequent calls to vkCmdExecuteCommands.
When collecting sample results, the original sample IDs collected inside vkCmdExecuteCommands will not have proper results; use the new sample IDs that were passed in to the GpaCopySecondarySamples call instead.
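To make the ID remapping concrete, here is a minimal toy model in C++. It does not call GPUPerfAPI or Vulkan: every Mock* type and function below is a hypothetical stand-in, invented only to illustrate why each execute-then-copy step needs its own set of new sample IDs and why the original IDs end up without results.

```cpp
// Toy model only: the Mock* names are hypothetical stand-ins and are NOT
// part of GPUPerfAPI or Vulkan. They model the sample-ID remapping idea.
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// A secondary command list that remembers which sample IDs were recorded on it.
struct MockCommandList {
    std::vector<uint32_t> sample_ids;
};

// Results that are readable once the session is complete, keyed by sample ID.
struct MockSession {
    std::map<uint32_t, uint64_t> results;
};

// Models recording one sample inside the secondary command list.
void MockRecordSample(MockCommandList& secondary, uint32_t sample_id) {
    secondary.sample_ids.push_back(sample_id);
}

// Models one "execute the secondary, then copy its samples" step.
// Results are stored under the caller-supplied new IDs only, so a later
// execution with different new IDs cannot overwrite them.
void MockExecuteAndCopy(MockSession& session,
                        const MockCommandList& secondary,
                        const std::vector<uint32_t>& new_sample_ids,
                        uint64_t counter_value) {
    // One new ID is required per sample recorded on the secondary list.
    assert(new_sample_ids.size() == secondary.sample_ids.size());
    for (uint32_t new_id : new_sample_ids) {
        session.results[new_id] = counter_value;
    }
}

// Returns true and writes *value if a result exists for the given sample ID.
bool MockGetResult(const MockSession& session, uint32_t id, uint64_t* value) {
    std::map<uint32_t, uint64_t>::const_iterator it = session.results.find(id);
    if (it == session.results.end()) {
        return false;
    }
    *value = it->second;
    return true;
}
```

In this model, executing the same secondary list twice with new IDs {10, 11} and then {20, 21} leaves four independent results, while querying the original IDs 0 and 1 finds nothing. That mirrors the behaviour described above, under the assumption that the real copy step works analogously.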
We do not currently have a sample of this for Vulkan, but the DX12ColorCube app has an example here: https://github.com/GPUOpen-Tools/gpu_performance_api/blob/30cd97819afd6f560a2dcd6847f5dde1fba08854/source/examples/dx12/dx12_color_cube/cube_sample.cc#L1090 Note: Unfortunately profiling of DX12 bundles does not currently work due to a change in the driver, but I believe the equivalent functionality should work in Vulkan. If you have any trouble getting the results, we'll be happy to help.
Thanks for the quick response.
I'm experimenting with the command_buffer_usage sample (with secondary command buffers) from https://github.com/KhronosGroup/Vulkan-Samples with the GPA library and the AMDVLK driver. I built the driver in debug mode.
I was able to get counter values using the GPA library from the primary command buffer, but when trying with a secondary command buffer, using the call sequence you mentioned, I get the following assert:
#1 0x00007fffefdf9c26 in Pal::CmdBuffer::WriteEvent (this=0x555557b1f5d0, gpuEvent=...,
pipePoint=Pal::HwPipeBottom, data=data@entry=3735928559)
at /dir/driver/pal/src/core/cmdBuffer.cpp:853
#2 0x00007fffefccbe54 in Pal::CmdBuffer::CmdSetEvent (this=<optimized out>, gpuEvent=...,
setPoint=<optimized out>) at /dir/driver/pal/src/./core/cmdBuffer.h:494
#3 0x00007fffefdc067e in GpuUtil::GpaSession::CopyResults (this=0x555559b67248, pCmdBuf=0x555557b1f5d0)
at /dir/driver/pal/src/gpuUtil/gpaSession.cpp:2176
#4 0x00007fffefc3fd53 in vk::GpaSession::CmdCopyResults (this=<optimized out>, pCmdBuf=<optimized out>)
at /dir/driver/xgl/icd/api/vk_gpa_session.cpp:314
#5 0x00007fffefc3fe2e in vk::entry::vkCmdCopyGpaSessionResultsAMD (commandBuffer=<optimized out>,
gpaSession=<optimized out>) at /dir/driver/xgl/icd/api/vk_gpa_session.cpp:432
#6 0x00007fffee55297d in VkGpaCommandList::CopySecondarySamples (this=0x5555586fc820,
primary_command_list=0x555557249ec0, num_samples=55, new_sample_ids=0x555558efdbd0,
original_sample_ids=std::vector of length 55, capacity 64 = {...})
at /dir/third_party/gpu_performance_api/source/gpu_perf_api_vk/vk_gpa_command_list.cc:245
#7 0x00007fffee558ed4 in VkGpaPass::CopySecondarySamples (this=0x7fffdc04c5b0,
secondary_vk_gpa_command_list=0x5555586fc820, primary_vk_gpa_command_list=0x555557249ec0, num_samples=55,
new_sample_ids=0x555558efdbd0)
at /dir/third_party/gpu_performance_api/source/gpu_perf_api_vk/vk_gpa_pass.cc:319
#8 0x00007fffee55ac21 in VkGpaSession::CopySecondarySamples (this=0x7fffdc04e950,
secondary_command_list_id=0x5555582e6280, primary_command_list_id=0x5555582a8b40, num_samples=55,
new_sample_ids=0x555558efdbd0)
at /dir/third_party/gpu_performance_api/source/gpu_perf_api_vk/vk_gpa_session.cc:77
#9 0x00007fffee41a151 in GpaCopySecondarySamples (secondary_gpa_command_list_id=0x5555582e6280,
primary_gpa_command_list_id=0x5555582a8b40, number_of_samples=55, new_sample_ids=0x555558efdbd0)
at /dir/third_party/gpu_performance_api/source/gpu_perf_api_common/gpu_perf_api.cc:1394
Specifically:
#1 0x00007fffefdf9c26 in Pal::CmdBuffer::WriteEvent (this=0x555557b1f5d0, gpuEvent=..., pipePoint=Pal::HwPipeBottom, data=data@entry=3735928559) at /dir/driver/pal/src/core/cmdBuffer.cpp:853
853 PAL_ASSERT_ALWAYS();
Do you have any idea on what may be the problem here?
It appears the change that affected DX12 bundles also affected the support in Vulkan, as this is now part of the shared codebase. The secondary command lists are now handled in a different manner within the driver and the change caused our profiling extensions to not work properly. Unfortunately you will not be able to easily get this working. I will raise the priority of this with our driver teams.
As an alternative, GPUPerfAPI is already integrated into RenderDoc. If you use RenderDoc to capture and profile applications with secondary command lists, I believe it will work correctly. Most capture / replay tools will record the calls that are inside the secondary command list, and then substitute them in place of the vkCmdExecuteCommands call. Since the calls are then replayed directly on the primary command list, the profiling is able to work correctly.
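The substitution described above can be pictured with a small toy model in C++. This is only an illustration of the inlining idea, not how RenderDoc or any particular tool is implemented; RecordedCall and FlattenForReplay are hypothetical names, and calls are represented as plain strings rather than real Vulkan commands.

```cpp
// Toy model only: RecordedCall and FlattenForReplay are hypothetical and do
// not correspond to any real capture/replay tool's data structures.
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct RecordedCall {
    std::string name;     // name of the recorded call
    int secondary_index;  // index of the executed secondary list, or -1
};

// Builds the call stream to replay on the primary command list: every call
// that executes a secondary list is replaced by that list's recorded calls.
std::vector<std::string> FlattenForReplay(
    const std::vector<RecordedCall>& primary,
    const std::vector<std::vector<std::string> >& secondaries) {
    std::vector<std::string> flat;
    for (std::size_t i = 0; i < primary.size(); ++i) {
        const RecordedCall& call = primary[i];
        if (call.secondary_index < 0) {
            flat.push_back(call.name);  // ordinary call, replayed as recorded
            continue;
        }
        // Execute call: inline the secondary list's contents in its place.
        const std::size_t index = static_cast<std::size_t>(call.secondary_index);
        assert(index < secondaries.size());
        const std::vector<std::string>& inner = secondaries[index];
        flat.insert(flat.end(), inner.begin(), inner.end());
    }
    return flat;
}
```

After flattening, the replayed stream contains no execute call at all, so a profiler only ever sees work recorded on the primary command list.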
Sorry for the inconvenience, and hopefully you can get what you need via RenderDoc.
The documentation only states that for secondary command buffers we need to call this, but it does not state when the correct time to call it is. Should I call it after vkCmdExecuteCommands on the primary command buffer, after vkEndCommandBuffer, or somewhere else? Is there an example of this?