macOS/Metal inference support

To fully harness the power of Macs, especially those with Apple silicon (M-series) chips, integrating a Metal backend is key. The core task ahead is porting our key sparse operators, including mul_mat_sparse and axpy, to Metal, mirroring their current CUDA implementations.
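For anyone picking this up, a plain CPU reference can be handy for checking a ported kernel's output. The sketch below is only an illustration and is not taken from PowerInfer's code: it assumes an axpy-style operator accumulates `act[i] * W[i, :]` into an output vector and that sparsity means skipping rows whose activation is exactly zero. The names `sparse_axpy_ref` and `max_abs_diff` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical CPU reference (not PowerInfer's API).
// Computes y[j] = sum_i act[i] * weights[i * n_cols + j],
// skipping every row i whose activation is exactly zero.
// Assumption: weights is row-major with act.size() rows and n_cols columns.
std::vector<float> sparse_axpy_ref(const std::vector<float>& weights,
                                   const std::vector<float>& act,
                                   std::size_t n_cols) {
    assert(n_cols > 0);
    assert(weights.size() == act.size() * n_cols);

    std::vector<float> y(n_cols, 0.0f);
    for (std::size_t i = 0; i < act.size(); ++i) {
        const float a = act[i];
        if (a == 0.0f) {
            continue;  // inactive row: contributes nothing, so skip the work
        }
        const float* row = &weights[i * n_cols];
        for (std::size_t j = 0; j < n_cols; ++j) {
            y[j] += a * row[j];  // axpy step: y <- a * row + y
        }
    }
    return y;
}

// Largest absolute element-wise difference between two equally sized vectors,
// e.g. a GPU kernel's output and the CPU reference above.
float max_abs_diff(const std::vector<float>& lhs, const std::vector<float>& rhs) {
    assert(lhs.size() == rhs.size());
    float worst = 0.0f;
    for (std::size_t i = 0; i < lhs.size(); ++i) {
        worst = std::max(worst, std::fabs(lhs[i] - rhs[i]));
    }
    return worst;
}
```

A Metal port could then be validated by running the same inputs through both paths and checking that `max_abs_diff` stays within a small tolerance, since GPU accumulation order may differ from the CPU loop.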

However, Metal's programming model differs from CUDA's, which presents its own set of challenges. We warmly invite the community to join us in this endeavour. Whether you have insights on Metal, experience with CUDA, or suggestions for this migration, your expertise can significantly impact PowerInfer's performance on Macs! 💪
