Closed: the-crypt-keeper closed this 4 months ago
https://github.com/GreenBitAI/low_bit_llama
2-bit quants with performance figures that are difficult to believe.
Model: https://huggingface.co/GreenBitAI/LLaMA-2-70B-2bit-groupsize8
The quip-sharp method (#122) is newer and has better 2-bit perplexity. Going to leave this open but focus 2-bit efforts there for now.
Closing out all old 2-bit quants.