ml-explore / mlx

MLX: An array framework for Apple silicon
https://ml-explore.github.io/mlx/
MIT License

Fused GEMM #1123

Closed: jagrit06 closed this 2 weeks ago

jagrit06 commented 2 weeks ago

Proposed changes

Fuses gemm, addmm, and block_sparse_mm into a single uber-shader that uses Metal function constants for specialization. No performance regression was observed on an M2 Ultra.
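To illustrate the idea, here is a minimal Python sketch of how one kernel body can serve all three operations when its optional stages are switched by constants fixed at specialization time. It is not the Metal code from this PR: the names `specialize`, `use_out_source`, and `do_gather` are hypothetical stand-ins for function constants, and the closure flags only mimic what the Metal compiler would resolve when the pipeline is built. The sketch also assumes that block_sparse_mm selects matrix pairs through index arrays and that the addmm source matrix is shared across the batch.

```python
def _matmul(a, b):
    """Plain matrix product on lists of lists (stand-in for the tiled GEMM loop)."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def specialize(use_out_source, do_gather):
    """Build one variant of the fused kernel.

    The two flags are hypothetical analogues of function constants: they are
    fixed once here, so each returned kernel only ever takes one path.
    """
    def kernel(a_batch, b_batch, c=None, alpha=1.0, beta=1.0,
               lhs_indices=None, rhs_indices=None):
        if do_gather:
            # block_sparse_mm path: indices choose which matrices to pair up.
            pairs = [(a_batch[i], b_batch[j])
                     for i, j in zip(lhs_indices, rhs_indices)]
        else:
            # gemm / addmm path: pair the batches element by element.
            pairs = list(zip(a_batch, b_batch))

        outs = [_matmul(a, b) for a, b in pairs]

        if use_out_source:
            # addmm epilogue: alpha * (a @ b) + beta * c, with c shared
            # across the batch (an assumption made for brevity).
            outs = [[[alpha * o + beta * s for o, s in zip(o_row, c_row)]
                     for o_row, c_row in zip(out, c)]
                    for out in outs]
        return outs

    return kernel


# Three specializations of the same kernel body.
gemm = specialize(use_out_source=False, do_gather=False)
addmm = specialize(use_out_source=True, do_gather=False)
block_sparse_mm = specialize(use_out_source=False, do_gather=True)
```

All three callables share one body, so a fix or optimization in the main loop applies to every operation. In this emulation the branches are still evaluated at run time; with real function constants the unused branches would be removed when the pipeline is compiled.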

Checklist

Put an x in the boxes that apply.