google / uVkCompute

A micro Vulkan compute pipeline and a collection of benchmarking compute shaders
Apache License 2.0

[matmul] Add basic i8->i32 matmul tiled for inner product #30

Closed by kuhar 1 year ago

kuhar commented 1 year ago

This is an initial tiled implementation that could use integer dot product instructions (depending on how the driver compiler lowers it).
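For readers unfamiliar with the inner-product formulation, here is a minimal CPU-side sketch of the idea in Python. It is not the shader from this PR: the function names, the tile sizes, the transposed layout of the right-hand operand, and the 4-wide `dot4_i8` helper are all assumptions made for illustration. The helper only stands in for the kind of packed 8-bit dot product instruction a driver compiler might select.

```python
def dot4_i8(a4, b4, acc):
    """Emulate a 4-wide signed 8-bit dot product accumulated into an integer.

    Hypothetical helper: Python ints do not wrap, so 32-bit overflow
    behaviour is not modelled here.
    """
    return acc + sum(x * y for x, y in zip(a4, b4))


def matmul_i8_i32_inner(a, b_t, tile_m=2, tile_n=2):
    """Compute C = A * B where b_t holds B transposed (N rows of length K).

    Storing B transposed is an assumption of this sketch: it makes both
    operands contiguous along K, so each step is a dot product of two
    4-element chunks.
    """
    m, k = len(a), len(a[0])
    n = len(b_t)
    assert k % 4 == 0, "K must be a multiple of the dot-product width"
    assert all(len(row) == k for row in b_t), "b_t rows must have length K"

    c = [[0] * n for _ in range(m)]
    # Walk the output in tile_m x tile_n tiles (one tile per workgroup
    # or invocation in a GPU setting).
    for i0 in range(0, m, tile_m):
        for j0 in range(0, n, tile_n):
            for i in range(i0, min(i0 + tile_m, m)):
                for j in range(j0, min(j0 + tile_n, n)):
                    acc = 0
                    # Reduce along K four elements at a time.
                    for k0 in range(0, k, 4):
                        acc = dot4_i8(a[i][k0:k0 + 4], b_t[j][k0:k0 + 4], acc)
                    c[i][j] = acc
    return c


a = [[1, -2, 3, 4], [5, 6, -7, 8], [0, 1, 2, 3]]
b_t = [[1, 1, 1, 1], [-1, 2, -3, 4]]
c = matmul_i8_i32_inner(a, b_t)
```

The point of the layout is that every accumulation step consumes four 8-bit values from each operand at once, which is the access pattern an integer dot product instruction expects. An outer-product variant would instead broadcast one element of A against a row of B and update a whole tile of accumulators per step.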

It achieves ~190 GFLOps, compared to ~230 with the i8->i32 outer product and ~345 with the i8->f32->i32 outer product.