Hello,
Thanks AMD folks for this repo.
I am facing a vk ErrorDeviceLost problem. The code is borrowed from test_vk/internal_resources, but here I am casting rays in a loop where the number of rays is decreased by 2 at each iteration.
The loop fails at the last iteration, where an assertion aborts the program. This is the log output:
[2021-07-17 00:24:19.903] [RR logger] [info] rrCreateContext(1001)
[2021-07-17 00:24:19.903] [RR logger] [info] Creating Vulkan context
[2021-07-17 00:24:21.709] [RR logger] [info] rrSetLogLevel(1)
... ...
hitcount:390625
[2021-07-17 00:24:35.067] [RR logger] [info] rrUnmapDevicePtr
[2021-07-17 00:24:35.067] [RR logger] [debug] Unmap vulkan buffer
[2021-07-17 00:24:35.067] [RR logger] [info] rrReleaseDevicePtr
[2021-07-17 00:24:35.069] [RR logger] [debug] Device pointer successfully released
[2021-07-17 00:24:35.069] [RR logger] [info] rrReleaseDevicePtr
[2021-07-17 00:24:35.078] [RR logger] [debug] Device pointer successfully released
[2021-07-17 00:24:35.078] [RR logger] [info] rrReleaseDevicePtr
[2021-07-17 00:24:35.079] [RR logger] [debug] Device pointer successfully released
// next iteration
[2021-07-17 00:24:35.079] [RR logger] [info] rrAllocateDeviceBuffer
[2021-07-17 00:24:35.081] [RR logger] [debug] Allocated vulkan buffer with size 3115008
[2021-07-17 00:24:35.081] [RR logger] [info] rrMapDevicePtr
[2021-07-17 00:24:35.081] [RR logger] [debug] Map vulkan buffer
[2021-07-17 00:24:35.082] [RR logger] [info] rrUnmapDevicePtr
[2021-07-17 00:24:35.082] [RR logger] [debug] Unmap vulkan buffer
ray init 0.00319416s for 97344 rays
[2021-07-17 00:24:35.082] [RR logger] [info] rrAllocateDeviceBuffer
[2021-07-17 00:24:35.082] [RR logger] [debug] Allocated vulkan buffer with size 1557504
[2021-07-17 00:24:35.082] [RR logger] [info] rrGetTraceMemoryRequirements
[2021-07-17 00:24:35.082] [RR logger] [debug] Successfully provided trace memory requirements
[2021-07-17 00:24:35.082] [RR logger] [info] rrAllocateDeviceBuffer
[2021-07-17 00:24:35.088] [RR logger] [debug] Allocated vulkan buffer with size 24920064
[2021-07-17 00:24:35.088] [RR logger] [info] rrAllocateCommandStream
[2021-07-17 00:24:35.088] [RR logger] [debug] Command stream successfully allocated
[2021-07-17 00:24:35.088] [RR logger] [info] rrCmdIntersect
[2021-07-17 00:24:35.088] [RR logger] [debug] Intersector::Intersect()
[2021-07-17 00:24:35.088] [RR logger] [debug] Batch intersect command successfully recorded
[2021-07-17 00:24:35.088] [RR logger] [info] rrSumbitCommandStream
[2021-07-17 00:24:35.088] [RR logger] [debug] Device::SubmitCommandStream()
[2021-07-17 00:24:35.088] [RR logger] [debug] Command stream successfully submitted
[2021-07-17 00:24:35.089] [RR logger] [info] rrReleaseEvent
[2021-07-17 00:24:35.089] [RR logger] [debug] Device::WaitEvent()
[2021-07-17 00:24:35.469] [RR logger] [error] vk::Device::waitForFences: ErrorDeviceLost
rfrt: src/main3.cpp:168: void test(): Assertion `(rrWaitEvent(context, wait_event)) == RR_SUCCESS' failed.
Aborted (core dumped)
Also, if I save the hits data at each iteration, the program completes without any problem. It is as if giving the GPU more time between iterations hides the problem in some way.
Ubuntu 20.04
Vulkan version 1.2.182
Nvidia driver 460.73.1 on Quadro M1200
RR 4.1 build options: ENABLE_VULKAN=ON, EMBEDDED_KERNELS=ON, ENABLE_TESTING=ON, CMAKE_POSITION_INDEPENDENT_CODE=ON
Any suggestions on how to debug this? Am I doing this the right way?
Cheers