GpuZelenograd / memtest_vulkan

Vulkan compute tool for testing video memory stability
https://github.com/GpuZelenograd/memtest_vulkan/blob/main/Readme.md
zlib License

Runtime error: Failed determining memory budget #11

Open pixartist opened 1 year ago

pixartist commented 1 year ago

Running on the latest EndeavourOS, I get:

1: Bus=0x0A:00 DevId=0x2206   10GB NVIDIA GeForce RTX 3080
Runtime error: Failed determining memory budget
Using in-process testing method with small memory limit 0
Using in-process testing method
Runtime error: Failed determining memory budget

memtest_vulkan: INIT OR FIRST testing failed due to runtime error
  press any key to continue...
mpjanz commented 10 months ago

Same for me, also on EndeavourOS, running a Radeon RX 580:

1: Bus=0x25:00 DevId=0x67DF   8GB AMD Radeon RX 580 Series (RADV POLARIS10)
Runtime error: Failed determining memory budget
Using in-process testing method with small memory limit 0
Using in-process testing method
Runtime error: Failed determining memory budget

memtest_vulkan: INIT OR FIRST testing failed due to runtime error
galkinvv commented 10 months ago

In short: please enable verbose mode by renaming the executable to memtest_vulkan_verbose, run it again, and attach its output here.

Thanks for reporting. The issue looks very strange. However, given that there are now 2 reports with different GPUs but the same distro, it may be something distro-specific.

This is strange too, since I often run memtest_vulkan on Arch Linux, which has most packages identical to EndeavourOS. Also, Vulkan is pretty well standardized and, in theory, memtest_vulkan should be distro-agnostic, so I hope to find some clues in the verbose output.
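
For anyone who prefers to keep the original binary untouched, the rename step can be done as a copy instead. This is only a minimal sketch: the helper name make_verbose_copy is made up for illustration, and it relies solely on the behaviour described above and in the verbose log ("'verbose' found in name"), not on any other memtest_vulkan interface.

```python
import shutil
from pathlib import Path


def make_verbose_copy(executable: Path) -> Path:
    """Copy the executable next to itself with '_verbose' appended to its stem.

    Hypothetical helper: it only automates the manual rename suggested in
    this thread and keeps the original file in place.
    """
    # "memtest_vulkan" -> "memtest_vulkan_verbose"
    # "memtest_vulkan.exe" -> "memtest_vulkan_verbose.exe"
    verbose_name = executable.stem + "_verbose" + executable.suffix
    verbose_path = executable.with_name(verbose_name)
    # copy2 copies the file contents and metadata, including permission bits
    shutil.copy2(executable, verbose_path)
    return verbose_path
```

Running the returned path from a terminal should then produce the verbose output requested here.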

bagusnl commented 10 months ago

I have a very similar error message, but on a different system (using an iGPU):

GPU: Vega 3 iGPU (R3 3200U)
UMA: 512M
RAM: 16GB
OS: Windows 11 Latest Insider Beta

 bagusnl_reg   ~      memtest_vulkan_verbose.exe                                            in cmd at 15:16:25
https://github.com/GpuZelenograd/memtest_vulkan v0.5.0 by GpuZelenograd
To finish testing use Ctrl+C
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
Verbose feature enabled (or 'verbose' found in name). Vulkan instance 1.3.261
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING | LAYER:  windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING | LAYER:  windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Available:
VK_LAYER_AMD_switchable_graphics, VK_LAYER_VALVE_steam_overlay, VK_LAYER_VALVE_steam_fossilize, VK_LAYER_RTSS
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
Extensions: VK_KHR_device_group_creation, VK_KHR_external_fence_capabilities, VK_KHR_external_memory_capabilities, VK_KHR_external_semaphore_capabilities, VK_KHR_get_physical_device_properties2, VK_KHR_get_surface_capabilities2, VK_KHR_surface, VK_KHR_win32_surface, VK_EXT_debug_report, VK_EXT_debug_utils, VK_EXT_swapchain_colorspace, VK_KHR_portability_enumeration, VK_LUNARG_direct_driver_loading

WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING | LAYER:  windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
ERROR:            loader_validate_layers: Layer 0 does not exist in the list of available layers
Not using validation layers due to ERROR_LAYER_NOT_PRESENT while getting erupt::generated::InstanceLoader in context instance with validation
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING | LAYER:  windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.

1: Bus=0x03:00 DevId=0x15D8 API 1.3.262  0x80011B  1GB AMD Radeon(TM) Vega 3 Graphics
2: Bus=0x03:00 DevId=0x15D8 API 1.3.262  0x80011B  1GB AMD Radeon(TM) Vega 3 Graphics
                                                   Override index to test:1
Loading memory info for selected device index 0...
heap size  0.2GB budget  0.2GB usage  0.0GB flags=DEVICE_LOCAL | MULTI_INSTANCE | MULTI_INSTANCE_KHR
heap size  7.5GB budget  7.1GB usage  0.0GB flags=(empty)
heap size  0.2GB budget  0.2GB usage  0.0GB flags=DEVICE_LOCAL | MULTI_INSTANCE | MULTI_INSTANCE_KHR
Spawned child Child { stdin: None, stdout: None, stderr: None, .. } with PID 34076
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
Verbose feature enabled (or 'verbose' found in name). Vulkan instance 1.3.261
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING | LAYER:  windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING | LAYER:  windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Available:
VK_LAYER_AMD_switchable_graphics, VK_LAYER_VALVE_steam_overlay, VK_LAYER_VALVE_steam_fossilize, VK_LAYER_RTSS
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
Extensions: VK_KHR_device_group_creation, VK_KHR_external_fence_capabilities, VK_KHR_external_memory_capabilities, VK_KHR_external_semaphore_capabilities, VK_KHR_get_physical_device_properties2, VK_KHR_get_surface_capabilities2, VK_KHR_surface, VK_KHR_win32_surface, VK_EXT_debug_report, VK_EXT_debug_utils, VK_EXT_swapchain_colorspace, VK_KHR_portability_enumeration, VK_LUNARG_direct_driver_loading

WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING | LAYER:  windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
ERROR:            loader_validate_layers: Layer 0 does not exist in the list of available layers
Not using validation layers due to ERROR_LAYER_NOT_PRESENT while getting erupt::generated::InstanceLoader in context instance with validation
WARNING:          Didn't find required layer object disable_environment in manifest JSON file, skipping this layer
WARNING | LAYER:  windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
Loading memory info for selected device index 0...
heap size  0.2GB budget  0.2GB usage  0.0GB flags=DEVICE_LOCAL | MULTI_INSTANCE | MULTI_INSTANCE_KHR
heap size  7.5GB budget  7.1GB usage  0.0GB flags=(empty)
heap size  0.2GB budget  0.2GB usage  0.0GB flags=DEVICE_LOCAL | MULTI_INSTANCE | MULTI_INSTANCE_KHR
Runtime error: Failed determining memory budget
Subprocess status exit code: 68 parent_close_requested false
Using in-process testing method with small memory limit 0
Using in-process testing method
Runtime error: Failed determining memory budget

memtest_vulkan: INIT OR FIRST testing failed due to runtime error
  press any key to continue...

vulkaninfo output: https://gist.github.com/bagusnl/eb2125cf9e7c606b62c7dedd659b2753

galkinvv commented 10 months ago

@bagusnl I collapsed the long log and created a separate issue for your case. It seems to be unrelated: it is caused by the total GPU memory size being printed as "1GB" in the GPU list, while the other reports in this thread show a correct total memory of 8-10GB in the GPU list.

galkinvv commented 5 months ago

The line "heap size 7.8GB budget 0.4GB usage 0.0GB flags=DEVICE_LOCAL" from your log means that the driver reports only 0.4GB out of 7.8GB as the memory budget allowed for the application.

This is strange - I launched an RX 5700 test GPU on Arch updated to the latest userspace and got a 7.7GB budget in the verbose output:

1: Bus=0x03:00 DevId=0x731F API 1.3.274  v24(0x6000005)  8GB AMD Radeon RX 5700 (RADV NAVI10)
Loading memory info for selected device index 0...
heap size  7.8GB budget  7.7GB usage  0.0GB flags=DEVICE_LOCAL
heap size  3.0GB budget  3.0GB usage  0.0GB flags=(empty)
heap size  0.2GB budget  0.2GB usage  0.0GB flags=DEVICE_LOCAL
Spawned child Child { stdin: None, stdout: None, stderr: None, .. } with PID 378
Verbose feature enabled (or 'verbose' found in name). Vulkan instance 1.3.279
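
The comparison between the reported 0.4GB budget and the 7.8GB heap can also be done mechanically on a saved verbose log. The following is a rough sketch, assuming the "heap size ... budget ... usage ... flags=..." line format shown in this thread; the function names and the 50% threshold are arbitrary choices for illustration, not something memtest_vulkan itself uses.

```python
import re

# Matches lines such as:
# heap size  7.8GB budget  7.7GB usage  0.0GB flags=DEVICE_LOCAL
HEAP_LINE = re.compile(
    r"heap size\s+([\d.]+)GB\s+budget\s+([\d.]+)GB\s+usage\s+([\d.]+)GB\s+flags=(.*)"
)


def parse_heaps(log_text):
    """Return (size_gb, budget_gb, usage_gb, flags) for each 'heap size' line."""
    heaps = []
    for line in log_text.splitlines():
        match = HEAP_LINE.search(line)
        if match:
            size, budget, usage, flags = match.groups()
            heaps.append((float(size), float(budget), float(usage), flags.strip()))
    return heaps


def low_budget_heaps(heaps, min_fraction=0.5):
    """Keep DEVICE_LOCAL heaps whose budget is below min_fraction of their size.

    min_fraction is an assumed threshold, chosen only for this example.
    """
    return [h for h in heaps if "DEVICE_LOCAL" in h[3] and h[1] < h[0] * min_fraction]
```

With the RX 5700 output above no heap is flagged, while a heap reported as 7.8GB with a 0.4GB budget would be.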

What is your kernel version? Do you have any other GPU memory-heavy applications running?

You can also specify the memory size to test manually, for example to test 2.5GB: ./memtest_vulkan 1 2500000000. Would it be able to allocate the memory and run the test with a write speed of 250-400GB/sec?
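
For scripting such manual runs, the byte count can be derived from a size in GB. This is a small sketch under two assumptions: that the first argument is the device index shown in the GPU list, and that the second is the size in bytes with 1GB taken as 1,000,000,000 bytes, as in the 2.5GB example above. The helper name memtest_command is hypothetical.

```python
def memtest_command(device_index, size_gb, executable="./memtest_vulkan"):
    """Build the argument list for a manual-size run.

    Assumes the argument order from the example in this thread:
    <device index> <size in bytes>.
    """
    size_bytes = int(round(size_gb * 1_000_000_000))  # decimal GB, as in the example
    return [executable, str(device_index), str(size_bytes)]


# The example from this thread: device 1, 2.5GB
print(" ".join(memtest_command(1, 2.5)))
```

The resulting list can be passed to subprocess.run if the test should be launched from a script.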

Side note: unlike the AMDVLK driver, the RADV Vulkan driver has another limitation: it does not allow allocating a contiguous block of more than 2.5GB. This is a known fact, and memtest_vulkan automatically selects an allocatable size. The output then looks like this:


1: Bus=0x03:00 DevId=0x731F   8GB AMD Radeon RX 5700 (RADV NAVI10)
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu:    size      : 7875674112 bytes
radv/amdgpu:    alignment : 65536 bytes
radv/amdgpu:    domains   : 4
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu:    size      : 7456243712 bytes
radv/amdgpu:    alignment : 65536 bytes
radv/amdgpu:    domains   : 4
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu:    size      : 7036813312 bytes
radv/amdgpu:    alignment : 65536 bytes
radv/amdgpu:    domains   : 4
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu:    size      : 6617382912 bytes
radv/amdgpu:    alignment : 65536 bytes
radv/amdgpu:    domains   : 4
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu:    size      : 6197952512 bytes
radv/amdgpu:    alignment : 65536 bytes
radv/amdgpu:    domains   : 4
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu:    size      : 5778522112 bytes
radv/amdgpu:    alignment : 65536 bytes
radv/amdgpu:    domains   : 4
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu:    size      : 5359091712 bytes
radv/amdgpu:    alignment : 65536 bytes
radv/amdgpu:    domains   : 4
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu:    size      : 4939661312 bytes
radv/amdgpu:    alignment : 65536 bytes
radv/amdgpu:    domains   : 4
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu:    size      : 4520230912 bytes
radv/amdgpu:    alignment : 65536 bytes
radv/amdgpu:    domains   : 4
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu:    size      : 4100800512 bytes
radv/amdgpu:    alignment : 65536 bytes
radv/amdgpu:    domains   : 4
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu:    size      : 3681370112 bytes
radv/amdgpu:    alignment : 65536 bytes
radv/amdgpu:    domains   : 4
radv/amdgpu: Failed to allocate a buffer:
radv/amdgpu:    size      : 3261939712 bytes
radv/amdgpu:    alignment : 65536 bytes
radv/amdgpu:    domains   : 4
Standard 5-minute test of 1: Bus=0x03:00 DevId=0x731F   8GB AMD Radeon RX 5700 (RADV NAVI10)
          NO ERRORS           written:    1.2GB  281.9GB/s      checked:    2.5GB  217.7GB/s     00:00:00.015
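
In the log above, each failed allocation is 419430400 bytes (400 MiB) smaller than the previous one, which suggests a simple step-down retry. The sketch below illustrates that pattern only; it is not memtest_vulkan's actual code, and try_allocate is a hypothetical callback standing in for a real Vulkan allocation attempt.

```python
STEP_BYTES = 400 * 1024 * 1024  # 419430400, the gap between failed sizes in the log


def find_allocatable_size(start_bytes, try_allocate, step_bytes=STEP_BYTES):
    """Return the first size, stepping down from start_bytes, that can be allocated.

    try_allocate(size) must return True on success and False on failure.
    Returns None if no positive size succeeds.
    """
    size = start_bytes
    while size > 0:
        if try_allocate(size):
            return size
        size -= step_bytes  # retry with a smaller block, as the log shows
    return None
```

A caller would pass a function that attempts the real allocation and reports success. The "Failed to allocate a buffer" messages come from the driver on each failed attempt, so they are expected noise rather than test errors.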
galkinvv commented 5 months ago

@hanskalisvaart You deleted your prior comment, does it mean the issue was somehow resolved for your Arch + rx5700?

hanskalisvaart commented 5 months ago

@hanskalisvaart You deleted your prior comment, does it mean the issue was somehow resolved for your Arch + rx5700?

Yes, the error was caused by my own mistake. Apparently you also get this error if you have already opened memtest_vulkan with the mouse. I did not notice that it had started the memtest because of my noise-canceling headphones, and then I started memtest_vulkan in the console as well. When I restarted my PC and started it only in the console, I did not get the error, and it just ran the memtest.

galkinvv commented 5 months ago

Thanks for the reply. The problem with a "non-visible background run" is important feedback - I just realized that such a situation can be quite common. The Linux part of the readme has now been updated. In the future I'll try to detect such cases and avoid performing a non-visible background run when the tool is executed without explicit arguments.