mit-han-lab/TinyChatEngine

TinyChatEngine: On-Device LLM Inference Library
https://mit-han-lab.github.io/TinyChatEngine/
MIT License

Support w4a16 with CUDA GPU #7

Closed: meenchen closed this issue 1 year ago
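
For context, "w4a16" denotes weights quantized to 4 bits with activations kept in 16-bit floating point (fp16). The sketch below is a minimal illustration of the general idea on a CUDA GPU, not TinyChatEngine's actual kernel: the function name `dequant_w4a16_dot`, the packing layout (two 4-bit values per byte, zero-point 8), and the per-group fp16 scales are all assumptions chosen for illustration.

```cuda
// Illustrative sketch of a w4a16 matrix-vector product: 4-bit weights are
// unpacked and dequantized on the fly, activations stay in fp16.
// NOT TinyChatEngine's kernel; layout and names are hypothetical.
#include <cuda_fp16.h>
#include <cstdint>

// Each thread computes one output element y[row] = dot(W[row, :], x).
// Weights: two 4-bit values packed per byte, zero-point 8, and one fp16
// scale per `group_size` weights (a common int4 grouping scheme).
__global__ void dequant_w4a16_dot(const uint8_t* __restrict__ w_packed,
                                  const half* __restrict__ scales,
                                  const half* __restrict__ x,
                                  half* __restrict__ y,
                                  int out_features, int in_features,
                                  int group_size) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= out_features) return;

    const uint8_t* w_row = w_packed + (size_t)row * (in_features / 2);
    const half* s_row = scales + (size_t)row * (in_features / group_size);

    float acc = 0.0f;  // accumulate in fp32 for accuracy
    for (int i = 0; i < in_features; i += 2) {
        uint8_t packed = w_row[i / 2];
        float s = __half2float(s_row[i / group_size]);
        // Unpack two 4-bit weights, subtract zero-point, apply group scale.
        float w0 = (float)((int)(packed & 0x0F) - 8) * s;
        float w1 = (float)((int)(packed >> 4) - 8) * s;
        acc += w0 * __half2float(x[i]);
        acc += w1 * __half2float(x[i + 1]);
    }
    y[row] = __float2half(acc);
}
```

The fp32 accumulator reflects a standard design choice for low-bit inference: memory traffic is dominated by the 4-bit weights, so dequantizing in registers and accumulating at higher precision costs little while preserving accuracy.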