Open jhumlick opened 1 year ago
In short: thanks for reporting. This is known behavior and is mostly harmless. (I've noted the crash on Ctrl+C while piped as a separate issue #17).
Detailed:
The cause of the problem is an incompatibility between the RADV driver and memtest_vulkan. memtest_vulkan
insists on allocating a contiguous buffer (this is a technical design choice), but RADV fails to allocate such a buffer contiguously.
However, the good news: being able to test only 15.0 GB out of 24.0 GB is not a problem for nearly all usages, so in 99% of cases this is just fine. The "Failed to allocate a buffer" messages above are generated by the driver and, for the memtest_vulkan usage scenario, can simply be ignored.
I can't make the driver less verbose, since this is the driver's unconditional stderr output.
Actually, some other drivers sometimes fail allocations of large contiguous buffers too; memtest_vulkan silently auto-selects a slightly smaller size, and that is just fine.
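To illustrate the "auto-select a slightly smaller size" idea, here is a minimal sketch of a step-down loop. It is not memtest_vulkan's actual code: `try_alloc` is a hypothetical stand-in for a real allocation attempt, and `DRIVER_LIMIT_MB` is a made-up limit chosen only for the demonstration.

```shell
# Illustrative only: not memtest_vulkan's actual code.
# try_alloc is a hypothetical stand-in for a real allocation attempt; it
# "succeeds" for any size up to DRIVER_LIMIT_MB (a made-up demo value).
DRIVER_LIMIT_MB=${DRIVER_LIMIT_MB:-15360}
try_alloc() { [ "$1" -le "$DRIVER_LIMIT_MB" ]; }

# Start from the requested size (in MB) and step down until an attempt succeeds.
largest_alloc_mb() {
    size=$1
    step=${2:-256}
    while [ "$size" -gt 0 ]; do
        if try_alloc "$size"; then
            echo "$size"
            return 0
        fi
        size=$((size - step))
    done
    return 1   # nothing could be allocated
}

# Usage: largest_alloc_mb 24576
```

The point of the sketch is only that a failed large request is not fatal: the caller retries with a smaller size and tests whatever it manages to get.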
As a partial fix, I plan to detect the RADV driver and apply some minor tuning.
Also, there is a chance that the 16GB limit is specified in some limits exposed by the driver, but I'm not sure.
Can you please install the package containing the vulkaninfo utility (something like vulkan-tools) and attach the output of VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/radeon_icd.x86_64.json vulkaninfo?
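If it helps, the output can be saved to a file and the allocation-related limits pulled out with grep. This is only a sketch: `memory_limits` is a hypothetical helper, the field names come from the Vulkan limit structures, and I am assuming your vulkaninfo version prints them under those names.

```shell
# Hypothetical helper: show allocation-related limits from saved vulkaninfo output.
# Assumes the text output contains these field names; the exact layout
# differs between vulkaninfo versions.
memory_limits() {
    # $1: path to a file holding vulkaninfo output
    grep -E 'maxMemoryAllocationSize|maxStorageBufferRange' "$1"
}

# Usage (needs vulkaninfo and the RADV ICD manifest installed):
#   VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/radeon_icd.x86_64.json vulkaninfo > vulkaninfo_radv.txt 2>&1
#   memory_limits vulkaninfo_radv.txt
```

Please still attach the full file, since the relevant limit may be exposed under a different name.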
A side note: AMD GPUs on Linux have 2 different open-source Vulkan drivers that can be installed simultaneously without severe conflicts. So you can additionally install the AMDVLK driver and select the driver to use with the VK_ICD_FILENAMES environment variable: amd_icd64 is AMDVLK, and radeon_icd.x86_64 is the RADV you are using now.
(That name is for your Vulkan loader's libvulkan1 version. Newer libvulkan1 loader versions renamed the variable to VK_DRIVER_FILES.)
So, with AMDVLK driver additionally installed you can run
[user@host]$ VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/amd_icd64.json ./memtest_vulkan
[user@host]$ VK_ICD_FILENAMES=/usr/share/vulkan/icd.d/radeon_icd.x86_64.json ./memtest_vulkan
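The two commands above can be wrapped in a small helper that sets both the old and the new variable name, so it does not matter which one the installed loader reads. This is a sketch under assumptions: `icd_manifest` and `with_driver` are hypothetical names, and the manifest paths are the ones quoted in this thread, which may differ on other distributions.

```shell
# Hypothetical helper: map a short driver name to its ICD manifest path.
# Paths are the ones quoted in this thread; other distros may use different ones.
icd_manifest() {
    case "$1" in
        amdvlk) echo "/usr/share/vulkan/icd.d/amd_icd64.json" ;;
        radv)   echo "/usr/share/vulkan/icd.d/radeon_icd.x86_64.json" ;;
        *)      echo "unknown driver: $1" >&2; return 1 ;;
    esac
}

# Run a command with both VK_ICD_FILENAMES (older loaders) and
# VK_DRIVER_FILES (newer loaders) pointing at the chosen manifest.
with_driver() {
    manifest=$(icd_manifest "$1") || return 1
    shift
    VK_ICD_FILENAMES="$manifest" VK_DRIVER_FILES="$manifest" "$@"
}

# Usage:
#   with_driver amdvlk ./memtest_vulkan
#   with_driver radv vulkaninfo
```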
I have no 16+ GB AMD GPU, but from my experience with a 12 GB RX 6700: AMDVLK allows testing 10.5 GB (1.5 GB is auto-skipped by memtest_vulkan to avoid a desktop lockup), while RADV limits it to 4 GB. So I hope that on the RX 7900 AMDVLK will allow going above 16 GB.
The same environment variable applies to all other Vulkan apps, including vulkaninfo.
Thanks for the detailed reply!
I had thought I had both drivers installed but upon further inspection, discovered that I only had the RADV driver installed. I had to do a lot of package jumbling in order to get versions of MESA and a kernel that would support this GPU, so I guess I lost the AMDVLK driver in the process. Once I installed the AMD driver again, memtest_vulkan prompted me to select which driver to use. The issue did not appear with the AMDVLK driver, and I was able to stress test my system for an hour or so without any issues.
Thanks again for your help!
Also, thanks for letting me know about the VK_ICD_FILENAMES and VK_DRIVER_FILES environment variables. I previously only knew that I could select the RADV driver via AMD_VULKAN_ICD=RADV, back when I wasn't using a bleeding-edge MESA and kernel, and had both drivers installed. ;-)
I have a RX 7900 XTX, and it looks like not all of my memory is being tested. When I launch normally, I see:
I have resizable BAR turned on in my bios.
I will attach the output I see running with the file renamed to memtest_vulkan_verbose.
It also appears that the tool crashes if I write to a log file with tee using a pipe (i.e. ./memtest_vulkan_verbose | tee memtest_vulkan_verbose.txt will crash when Ctrl+C is pressed).
memtest_vulkan_verbose.txt