ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Feature Request: Support BitNet.cpp quantization format #10179

Open luionTW opened 2 weeks ago

luionTW commented 2 weeks ago

Could the team add support for BitNet.cpp? It is another pure 1-bit model. https://arxiv.org/pdf/2310.11453

Motivation

The new quantization format could help improve inference for edge computing.

Possible Implementation

No response

eugenehp commented 3 days ago

Here's a fork of llama.cpp used in Microsoft's BitNet repo. It would be great to upstream the changes.