Open oscarbg opened 3 years ago
Leaving results from my Vega on macOS:
device = AMD Radeon RX Vega 64
fp32-scalar = 11544.38 GFLOPS fp32-vec4 = 10986.99 GFLOPS
fp16-scalar = 10465.67 GFLOPS fp16-vec4 = 21179.84 GFLOPS
fp64-scalar = 0.00 GFLOPS fp64-vec4 = 0.00 GFLOPS
int32-scalar = 2207.93 GIOPS int32-vec4 = 2199.83 GIOPS
int16-scalar = 10626.61 GIOPS int16-vec4 = 18983.10 GIOPS
Comments:
In short, on macOS the 16-bit int and float vec4 variants run about 2x faster than scalar (AMD Rapid Packed Math :-)).
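As a quick check of the "2x" remark, the vec4/scalar ratios can be worked out from the macOS figures quoted above. This is only a minimal sketch: the names `results` and `vec4_speedup` are made up for illustration and are not part of the benchmark.

```python
# Throughput figures copied from the macOS RX Vega 64 results above,
# stored as (scalar, vec4) pairs. The dictionary layout is an assumption
# made for this sketch, not something the benchmark emits.
results = {
    "fp32": (11544.38, 10986.99),
    "fp16": (10465.67, 21179.84),
    "int32": (2207.93, 2199.83),
    "int16": (10626.61, 18983.10),
}


def vec4_speedup(scalar, vec4):
    """Return how many times faster the vec4 variant is than scalar."""
    return vec4 / scalar


speedups = {name: vec4_speedup(s, v) for name, (s, v) in results.items()}

for name, ratio in speedups.items():
    print(f"{name}: vec4/scalar = {ratio:.2f}x")
```

With these numbers the ratio comes out at roughly 2.02x for fp16 and 1.79x for int16, while fp32 and int32 stay close to 1x.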
Hi, nice benchmark! Below are my Titan V and RX Vega results on Windows. AFAIK the Vulkan spec also supports int8 (via VK_KHR_shader_float16_int8, shaderInt8) and int64 (shaderInt64). Any plans to support benchmarking int8/int64 throughput? Thanks.
Results:
device = NVIDIA TITAN V
fp32-scalar = 17230.91 GFLOPS fp32-vec4 = 16898.01 GFLOPS
fp16-scalar = 16781.96 GFLOPS fp16-vec4 = 32568.21 GFLOPS
fp64-scalar = 7664.02 GFLOPS fp64-vec4 = 7677.14 GFLOPS
int32-scalar = 14464.71 GIOPS int32-vec4 = 14755.26 GIOPS
int16-scalar = 9727.97 GIOPS int16-vec4 = 11768.93 GIOPS
device = Radeon RX Vega
fp32-scalar = 11453.46 GFLOPS fp32-vec4 = 11010.15 GFLOPS
fp16-scalar = 10388.36 GFLOPS fp16-vec4 = 17744.94 GFLOPS
fp64-scalar = 686.59 GFLOPS fp64-vec4 = 686.31 GFLOPS
int32-scalar = 2188.62 GIOPS int32-vec4 = 2170.05 GIOPS
int16-scalar = 10013.59 GIOPS int16-vec4 = 9885.89 GIOPS
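To compare the 16-bit vec4 gains across the two Windows devices, the pasted lines can be parsed mechanically. The sketch below assumes the `name-variant = value UNIT` layout seen in this thread, with two entries per line; `REPORT`, `ENTRY` and `parse_report` are hypothetical names used only for illustration.

```python
import re

# 16-bit lines copied from the Windows results above.
REPORT = """\
device = NVIDIA TITAN V
fp16-scalar = 16781.96 GFLOPS fp16-vec4 = 32568.21 GFLOPS
int16-scalar = 9727.97 GIOPS int16-vec4 = 11768.93 GIOPS
device = Radeon RX Vega
fp16-scalar = 10388.36 GFLOPS fp16-vec4 = 17744.94 GFLOPS
int16-scalar = 10013.59 GIOPS int16-vec4 = 9885.89 GIOPS
"""

# Matches entries such as "fp16-vec4 = 32568.21 GFLOPS" or
# "int16-scalar = 9727.97 GIOPS"; several may appear on one line.
ENTRY = re.compile(r"(\w+)-(scalar|vec4)\s*=\s*([\d.]+)\s*G(?:FL|I)OPS")


def parse_report(text):
    """Return {device: {dtype: {"scalar": x, "vec4": y}}} from pasted output."""
    devices = {}
    current = None
    for line in text.splitlines():
        if line.startswith("device = "):
            name = line[len("device = "):].strip()
            current = devices.setdefault(name, {})
            continue
        if current is None:
            continue  # ignore anything before the first device line
        for dtype, variant, value in ENTRY.findall(line):
            current.setdefault(dtype, {})[variant] = float(value)
    return devices


parsed = parse_report(REPORT)
for device, dtypes in parsed.items():
    for dtype, t in dtypes.items():
        ratio = t["vec4"] / t["scalar"]
        print(f"{device} {dtype}: vec4/scalar = {ratio:.2f}x")
```

On the figures above, fp16 vec4 comes out at about 1.94x scalar on the Titan V and about 1.71x on the RX Vega, while int16 vec4 is about 1.21x on the Titan V and slightly below 1x on the RX Vega.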