Open edisonchan opened 1 year ago
❯ ./vkpeak 2
device = NVIDIA CMP 40HX
fp32-scalar = 9219.28 GFLOPS
fp32-vec4 = 9168.58 GFLOPS
fp16-scalar = 9041.08 GFLOPS
fp16-vec4 = 17782.56 GFLOPS
fp64-scalar = 281.40 GFLOPS
fp64-vec4 = 281.41 GFLOPS
int32-scalar = 8937.01 GIOPS
int32-vec4 = 8856.68 GIOPS
int16-scalar = 5879.15 GIOPS
int16-vec4 = 7598.19 GIOPS
.\vkpeak.exe 0
device = NVIDIA GeForce RTX 3060
fp32-scalar = 6884.30 GFLOPS
fp32-vec4 = 9102.33 GFLOPS
fp16-scalar = 6834.72 GFLOPS
fp16-vec4 = 13487.34 GFLOPS
fp64-scalar = 214.61 GFLOPS
fp64-vec4 = 215.09 GFLOPS
int32-scalar = 6843.09 GIOPS
int32-vec4 = 6814.33 GIOPS
int16-scalar = 4517.13 GIOPS
int16-vec4 = 6017.45 GIOPS
The RTX 3060 run used driver 527.56 on Windows 11. The card should reach about 13 TFLOPS with fp32 FMA, but vkpeak reports only about 6.9 TFLOPS (scalar) and 9.1 TFLOPS (vec4).
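
For reference, the roughly 13 TFLOPS figure can be reproduced with a quick calculation. This is only a sketch: it assumes NVIDIA's published reference specs for the RTX 3060 (3584 CUDA cores, 1.777 GHz boost clock) and counts one FMA as 2 FLOPs per core per clock. The helper name `peak_fp32_gflops` is made up for illustration, and a real card may boost above or below the reference clock.

```python
def peak_fp32_gflops(cuda_cores: int, clock_ghz: float) -> float:
    """Theoretical fp32 peak, assuming one FMA (2 FLOPs) per core per clock."""
    return 2 * cuda_cores * clock_ghz


# Assumed reference specs for the RTX 3060 (not reported by vkpeak itself).
RTX3060_CUDA_CORES = 3584
RTX3060_BOOST_GHZ = 1.777

peak = peak_fp32_gflops(RTX3060_CUDA_CORES, RTX3060_BOOST_GHZ)

# fp32 results copied from the RTX 3060 run above.
measured = {"fp32-scalar": 6884.30, "fp32-vec4": 9102.33}

print(f"theoretical peak: {peak:.1f} GFLOPS")
for name, gflops in measured.items():
    print(f"{name}: {gflops:.2f} GFLOPS = {gflops / peak:.1%} of peak")
```

Under these assumptions the theoretical peak is about 12.7 TFLOPS, so the vec4 result sits at roughly 71% of peak and the scalar result at roughly 54%.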