SNU-ARC any-precision-llm issues - Githubissues

SNU-ARC / any-precision-llm

[ICML 2024 Oral] Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs

MIT License

83 stars, 3 forks

issues

No real speedup from any-precision-llm kernels
#7 pgimenes opened 2 months ago, 2 comments

Question regarding 2-bit quantization
#6 jusjinuk closed 2 months ago, 3 comments

2-bit dequant kernel
#5 fwtan closed 3 months ago, 2 comments

Why is 2-bit quantization not supported by the kernel?
#4 hmarkc closed 4 months ago, 1 comment

No speedup from quantization
#3 jiwonsong-dev closed 4 months ago, 4 comments

TODO - 2024 March
#2 SyphonArch closed 6 months ago, 0 comments

Gemma-7B Perplexity Issue
#1 SyphonArch closed 9 months ago, 1 comment