mit-han-lab/TinyChatEngine
TinyChatEngine: On-Device LLM Inference Library
https://mit-han-lab.github.io/TinyChatEngine/
MIT License
Support w4a16 LLaMA on CUDA GPUs #28
Closed · RaymondWang0 closed this issue 1 year ago
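For context, "w4a16" denotes 4-bit quantized weights with activations kept in 16-bit floating point: weights are stored as packed int4 values plus per-group scales and zero points, and are dequantized to fp16 before (or while) being multiplied with the fp16 activations. The sketch below is only an illustration of that idea, not TinyChatEngine's actual kernel; the layout (two nibbles per byte, a group size of 128, per-group fp16 scale and uint8 zero point) and all names are assumptions.

```cuda
#include <cstdint>
#include <cuda_fp16.h>

// Assumed quantization group size; not taken from the TinyChatEngine source.
constexpr int Q4_GROUP = 128;

// Dequantize a flattened int4 weight tensor into fp16.
// w_packed : n_elems / 2 bytes, two 4-bit values per byte
// scales   : n_elems / Q4_GROUP fp16 scales, one per group
// zeros    : n_elems / Q4_GROUP zero points, one per group
// w_out    : n_elems fp16 outputs
__global__ void dequant_w4_to_fp16(const uint8_t* w_packed,
                                   const half* scales,
                                   const uint8_t* zeros,
                                   half* w_out,
                                   int n_elems) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n_elems) return;

    // Unpack the 4-bit value: low nibble for even indices, high nibble for odd.
    uint8_t byte = w_packed[i / 2];
    int q = (i % 2 == 0) ? (byte & 0x0F) : (byte >> 4);

    int g = i / Q4_GROUP;                  // group this element belongs to
    float s = __half2float(scales[g]);     // per-group scale
    int z = zeros[g];                      // per-group zero point
    w_out[i] = __float2half((q - z) * s);  // w = (q - zero) * scale
}
```

The "a16" half of the scheme means the dequantized fp16 weights are then multiplied with fp16 activations in an ordinary half-precision GEMM/GEMV; production kernels typically fuse the dequantization into the matmul (unpacking in registers or shared memory) rather than materializing the full fp16 weight matrix as done here.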