SJTU-IPADS / PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
MIT License
7.96k stars, 412 forks

macOS/Metal inference support #88

Open. hodlen opened this issue 10 months ago

hodlen commented 10 months ago

To take full advantage of Macs, especially those with M-series chips, PowerInfer needs a Metal backend. The core task ahead is porting our key sparse operators, including mul_mat_sparse and axpy, to Metal, mirroring their current CUDA implementations.
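For anyone new to these operators, here is a rough CPU reference of what a Metal port would need to reproduce. It is only a sketch, not the actual PowerInfer kernels: it assumes that mul_mat_sparse computes dot products only for neurons flagged active and that axpy accumulates scaled weight rows only for those same neurons. The function names `mul_mat_sparse_ref` and `axpy_sparse_ref`, the `active` mask, and the list-of-lists weight layout are all hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def mul_mat_sparse_ref(
    weights: Sequence[Sequence[float]],
    x: Sequence[float],
    active: Sequence[bool],
) -> List[float]:
    """Sparse mat-vec: y[i] = dot(weights[i], x) for active rows, 0 otherwise.

    Assumption: `active[i]` marks neuron i as predicted active; inactive
    rows are skipped entirely, which is where the savings come from.
    """
    y = [0.0] * len(weights)
    for i, row in enumerate(weights):
        if not active[i]:
            continue  # skip the whole row for an inactive neuron
        y[i] = sum(w * v for w, v in zip(row, x))
    return y


def axpy_sparse_ref(
    weights_t: Sequence[Sequence[float]],
    a: Sequence[float],
    active: Sequence[bool],
    out_dim: int,
) -> List[float]:
    """Sparse axpy accumulation: y += a[i] * weights_t[i] for active neurons.

    Assumption: `weights_t[i]` is the weight vector belonging to neuron i,
    so each active neuron contributes one scaled vector to the output.
    """
    y = [0.0] * out_dim
    for i, row in enumerate(weights_t):
        if not active[i] or a[i] == 0.0:
            continue  # inactive or zero activation contributes nothing
        for j in range(out_dim):
            y[j] += a[i] * row[j]
    return y


if __name__ == "__main__":
    active = [True, False, True]
    hidden = mul_mat_sparse_ref([[1, 2], [3, 4], [5, 6]], [1, 1], active)
    out = axpy_sparse_ref([[1, 0], [0, 1], [2, 2]], hidden, active, out_dim=2)
    print(hidden, out)
```

A reference like this could serve as a correctness check when comparing the output of a Metal kernel against the CUDA one on small inputs. On a GPU, the per-row and per-neuron loops would presumably map onto threads or threadgroups, but that mapping is left open here.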

However, Metal's programming model differs from CUDA's, which presents its own set of challenges. We warmly invite the community to join us in this endeavour. Whether you have insights on Metal, experience with CUDA, or suggestions for this migration, your expertise can significantly improve PowerInfer's performance on Macs! 💪