Closed by PMunkes 1 week ago
The Windows Vulkan driver in Radeon Software 22.6.1 provides identical performance to AMDGPU-Pro:
vkpeak.exe 0
device = AMD Radeon RX 6700 XT
fp32-scalar = 13030.38 GFLOPS
fp32-vec4 = 12972.51 GFLOPS
fp16-scalar = 12173.23 GFLOPS
fp16-vec4 = 21743.44 GFLOPS
fp64-scalar = 820.97 GFLOPS
fp64-vec4 = 821.68 GFLOPS
int32-scalar = 2632.45 GIOPS
int32-vec4 = 2627.50 GIOPS
int16-scalar = 12169.22 GIOPS
int16-vec4 = 11854.09 GIOPS
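For reference, one way to see whether packed 16-bit instructions are in play is to divide each vec4 result by its scalar counterpart. The following is a minimal sketch in Python, with the values copied from the vkpeak output above. The dictionary and the helper name `vec4_over_scalar` are illustrative only and are not part of vkpeak.

```python
# Values copied from the vkpeak output above (GFLOPS for fp, GIOPS for int).
results = {
    "fp32-scalar": 13030.38,
    "fp32-vec4": 12972.51,
    "fp16-scalar": 12173.23,
    "fp16-vec4": 21743.44,
    "int32-scalar": 2632.45,
    "int32-vec4": 2627.50,
    "int16-scalar": 12169.22,
    "int16-vec4": 11854.09,
}


def vec4_over_scalar(results, dtype):
    """Hypothetical helper: ratio of vec4 to scalar throughput for one data type."""
    return results[f"{dtype}-vec4"] / results[f"{dtype}-scalar"]


for dtype in ("fp32", "fp16", "int32", "int16"):
    print(f"{dtype}: vec4/scalar = {vec4_over_scalar(results, dtype):.2f}")
```

With these numbers, fp16 comes out at roughly 1.79 and int16 at roughly 0.97. A ratio well above 1 suggests the vec4 path is benefiting from packed instructions, while a ratio near 1 suggests it is not.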
LLPC currently runs the scalarizer pass, which helps a lot in reducing register pressure and increasing occupancy, but has the side effect of preventing packed instructions. See this issue for more details and workarounds: https://github.com/GPUOpen-Drivers/llpc/issues/1369
The int16-vec4 issue is fixed, and the fp16-vec4 issue is not reproducible with the 2024.Q3.2 release.
Using the latest release of AMDVLK, included in the 22.20 driver for Ubuntu 22.04, only gives single-rate performance when testing with vkpeak. AMDGPU-PRO from the same package supports packed FP16, but not int16. RADV recently had support for both packed fp16 and int16 merged (merge request). Are there plans to support double-rate 16-bit instructions in the open-source driver?
AMDVLK:
AMDGPU-Pro:
RADV (git-f533dff 2022-07-01 jammy-oibaf-ppa):