microsoft / T-MAC

Low-bit LLM inference on CPU with lookup table
MIT License
545 stars 39 forks

Is it possible to try this project's optimizations in llama.cpp? #13

Closed CsBoBoNice closed 2 months ago

CsBoBoNice commented 2 months ago

Hello,

I was glad to see the following update in the README:

08/14/2024 🚀: The T-MAC GEMM (N>1) kernels are now integrated into llama.cpp to accelerate prefill.

However, I couldn't find any related release notes or PR in the llama.cpp project (perhaps I missed them).

Is it possible to try this project's optimizations in llama.cpp?

Looking forward to your reply!

kaleid-liner commented 2 months ago

The feature is implemented in this commit. We haven't opened a pull request against upstream llama.cpp yet. Follow README.md to use our fork.