google / uVkCompute

A micro Vulkan compute pipeline and a collection of benchmarking compute shaders
Apache License 2.0

Benchmark mad crash on Jetson Nano #18

Open tpoisonooo opened 2 years ago

tpoisonooo commented 2 years ago

mad_throughput crashes on Jetson Nano and returns VK_ERROR_DEVICE_LOST.

This is the call chain:

mad_throughput_main.cc:189 --> GetDeviceBufferViaStagingBuffer --> vulkan_buffer_util.cc:67 --> QueueSubmitAndWait --> crash

I found no null pointers or invalid variables along this path.

I tried to debug it with the validation layers, but the Jetson Nano does not provide any (layerCount == 0):

$ vulkaninfo
Instance Extensions:
====================
Instance Extensions count = 16
    VK_KHR_device_group_creation        : extension revision  1
    VK_KHR_display                      : extension revision 23
    VK_KHR_external_fence_capabilities  : extension revision  1
    VK_KHR_external_memory_capabilities : extension revision  1
    VK_KHR_external_semaphore_capabilities: extension revision  1
    VK_KHR_get_display_properties2      : extension revision  1
    VK_KHR_get_physical_device_properties2: extension revision  2
    VK_KHR_get_surface_capabilities2    : extension revision  1
    VK_KHR_surface                      : extension revision 25
    VK_KHR_surface_protected_capabilities: extension revision  1
    VK_KHR_wayland_surface              : extension revision  6
    VK_KHR_xcb_surface                  : extension revision  6
    VK_KHR_xlib_surface                 : extension revision  6
    VK_EXT_debug_report                 : extension revision  9
    VK_EXT_debug_utils                  : extension revision  1
    VK_EXT_display_surface_counter      : extension revision  1
Layers: count = 0

This is my draft PR: https://github.com/google/uVkCompute/pull/17

antiagainst commented 2 years ago

VK_ERROR_DEVICE_LOST here most likely indicates that the workload takes too long to complete on the GPU, which I think is because the Jetson Nano has a weak GPU. You can try reducing the amount of work per submission to see if it helps.
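
One way to reduce the work per submission, besides simply lowering the benchmark's element or iteration count, is to split a large dispatch into several smaller ones and wait for each before submitting the next. The following is a minimal sketch of the chunking arithmetic only. `DispatchChunk` and `SplitDispatch` are hypothetical names and are not part of uVkCompute, and the sketch assumes the total work is expressed as a one-dimensional workgroup count.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper type, not part of uVkCompute.
struct DispatchChunk {
  uint32_t first_group;  // index of the first workgroup covered by this chunk
  uint32_t group_count;  // number of workgroups in this chunk
};

// Splits `total_groups` workgroups into consecutive chunks of at most
// `max_groups_per_submit` workgroups each. Returns an empty list when
// there is no work or when the limit is zero.
std::vector<DispatchChunk> SplitDispatch(uint32_t total_groups,
                                         uint32_t max_groups_per_submit) {
  std::vector<DispatchChunk> chunks;
  if (max_groups_per_submit == 0) return chunks;
  for (uint32_t first = 0; first < total_groups;) {
    // The last chunk may be smaller than the limit.
    uint32_t count = std::min(max_groups_per_submit, total_groups - first);
    chunks.push_back({first, count});
    first += count;
  }
  return chunks;
}
```

Each chunk would then be recorded and submitted on its own, so that no single submission runs long enough to be killed. The shader would also need to know `first_group`, for example through a push constant, which is an assumption about how the benchmark could be adapted rather than something the existing code does.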