nihui / vkpeak

A tool which profiles Vulkan devices to find their peak capacities
MIT License

vkpeak crashes at vkMapMemory() on APUs with large dedicated GPU memory allocations set in BIOS (>512MB) #8

Closed by hjc4869 1 year ago

hjc4869 commented 1 year ago

vkpeak tries to allocate 3 ncnn::VkMat instances here. The allocation size is derived from VulkanDevice::get_heap_budget().

VulkanDevice::get_heap_budget() returns the memory budget of the first heap with the VK_MEMORY_HEAP_DEVICE_LOCAL_BIT flag. On AMD APUs, this is the heap representing the dedicated GPU memory allocated by the BIOS.
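To illustrate the "first device-local heap" rule described above, here is a minimal sketch. It is not ncnn's actual code: `Heap`, `first_device_local_heap` and `example_apu_heaps` are hypothetical names, the constant is a local stand-in for the Vulkan flag, and the heap sizes other than the 256MB heap are assumed for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-in for VK_MEMORY_HEAP_DEVICE_LOCAL_BIT (0x1 in the Vulkan headers).
constexpr std::uint32_t kHeapDeviceLocal = 0x1;

// Hypothetical, simplified view of a Vulkan memory heap.
struct Heap {
    std::uint64_t size_mb;  // heap capacity in MB
    std::uint32_t flags;    // heap flags
};

// Returns the index of the first heap flagged device-local, or -1 if none exists.
int first_device_local_heap(const std::vector<Heap>& heaps) {
    for (std::size_t i = 0; i < heaps.size(); ++i) {
        if (heaps[i].flags & kHeapDeviceLocal) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Assumed APU heap layout; only the 256MB third heap comes from the report.
std::vector<Heap> example_apu_heaps() {
    return {
        {2048, kHeapDeviceLocal},  // heap 0: dedicated memory set in BIOS (size assumed)
        {8192, 0},                 // heap 1: system memory (size assumed)
        {256, kHeapDeviceLocal},   // heap 2: small device-local, host-visible heap
    };
}
```

With this layout, the rule picks heap 0, so a budget-derived allocation size reflects the large dedicated heap rather than the 256MB one.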

However, when ncnn::VkMat performs the actual allocation, it uses VkBlobAllocator::fastMalloc(), which handles integrated GPUs specially: it requires VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT and prefers VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT.

As a result, VkMat allocates memory from the 3rd heap, which has only 256MB of capacity, even though the requested size was derived from the budget of the first heap. The allocation fails in VkAllocator::allocate_memory(), and NULL is then passed into vkMapMemory() in VkBlobAllocator::fastMalloc(), resulting in an access violation (0xc0000005) in amdvlk64.dll.
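The mismatch can be reproduced on paper with a small simulation. This is a sketch under stated assumptions, not ncnn's implementation: `MemoryType`, `pick_memory_type`, `Outcome` and `simulate_apu_allocation` are hypothetical names, the constants are local stand-ins for the Vulkan property flags, and the memory type table and the sizes of the first two heaps are assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-ins for Vulkan memory property flag values.
constexpr std::uint32_t kDeviceLocal = 0x1;  // VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
constexpr std::uint32_t kHostVisible = 0x2;  // VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT

// Hypothetical, simplified view of a Vulkan memory type.
struct MemoryType {
    std::uint32_t property_flags;
    std::uint32_t heap_index;
};

// Picks the first type with all `required` bits and all `preferred` bits.
// Falls back to the first type with only the `required` bits; returns -1 if none.
int pick_memory_type(const std::vector<MemoryType>& types,
                     std::uint32_t required, std::uint32_t preferred) {
    int fallback = -1;
    for (std::size_t i = 0; i < types.size(); ++i) {
        const std::uint32_t flags = types[i].property_flags;
        if ((flags & required) != required) continue;
        if ((flags & preferred) == preferred) return static_cast<int>(i);
        if (fallback < 0) fallback = static_cast<int>(i);
    }
    return fallback;
}

struct Outcome {
    int type_index;            // chosen memory type, -1 if none
    std::uint32_t heap_index;  // heap backing the chosen type
    bool fits;                 // whether the request fits in that heap
};

// Simulates a request on an assumed APU layout (only the 256MB heap is from the report).
Outcome simulate_apu_allocation(std::uint64_t request_mb) {
    const std::vector<std::uint64_t> heap_mb = {2048, 8192, 256};
    const std::vector<MemoryType> types = {
        {kDeviceLocal, 0},                 // dedicated memory, not host-visible
        {kHostVisible, 1},                 // system memory
        {kDeviceLocal | kHostVisible, 2},  // small device-local, host-visible heap
    };
    const int t = pick_memory_type(types, kDeviceLocal, kHostVisible);
    if (t < 0) return {-1, 0, false};
    const std::uint32_t heap = types[static_cast<std::size_t>(t)].heap_index;
    return {t, heap, request_mb <= heap_mb[heap]};
}
```

Calling `simulate_apu_allocation(512)` selects the type backed by the third heap, and the request does not fit, which mirrors the failed allocation described above. A caller that checks the allocation result before mapping would avoid passing NULL to vkMapMemory().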

There's also a potential memory performance issue, as the 256MB (device local | host visible) memory on APUs is not cacheable.

nihui commented 1 year ago

Hi, please test whether this fix works for you: https://github.com/Tencent/ncnn/pull/4936

hjc4869 commented 1 year ago

Tested. It allocated from the expected heap and no longer crashed. Thanks for the fix.

nihui commented 1 year ago

Fixed in release https://github.com/nihui/vkpeak/releases/tag/20230812