mit-han-lab / TinyChatEngine

TinyChatEngine: On-Device LLM Inference Library
https://mit-han-lab.github.io/TinyChatEngine/
MIT License

Quantize model #118

Open · ztachip opened this issue 1 month ago

ztachip commented 1 month ago

The repo provides some pre-quantized models ready for download, but how do I quantize a model myself? What would be the procedure?
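
For context, TinyChatEngine's pre-quantized INT4 checkpoints are produced with AWQ-style group-wise weight quantization (the scales come from the same lab's llm-awq project). The sketch below is only a rough, hypothetical illustration of the core step — symmetric group-wise INT4 quantization of a weight matrix — and the function names and group size are my own, not TinyChatEngine's actual tooling:

```python
import numpy as np

def quantize_int4_groupwise(w: np.ndarray, group_size: int = 128):
    """Hypothetical sketch: symmetric group-wise INT4 quantization.

    Each row of the 2-D weight matrix is split into groups of
    `group_size` values; each group gets one FP scale mapping its
    values into the signed 4-bit range [-8, 7].
    """
    rows, cols = w.shape
    assert cols % group_size == 0, "columns must be divisible by group_size"
    groups = w.reshape(rows, cols // group_size, group_size)

    # One scale per group: the largest magnitude maps to the INT4 extreme.
    scales = np.abs(groups).max(axis=-1, keepdims=True) / 7.0
    scales = np.where(scales == 0, 1.0, scales)  # avoid divide-by-zero

    q = np.clip(np.round(groups / scales), -8, 7).astype(np.int8)
    return q.reshape(rows, cols), scales.squeeze(-1)

def dequantize(q: np.ndarray, scales: np.ndarray, group_size: int = 128):
    """Recover approximate FP weights from INT4 codes and per-group scales."""
    rows, cols = q.shape
    groups = q.reshape(rows, cols // group_size, group_size).astype(np.float32)
    return (groups * scales[..., None]).reshape(rows, cols)

if __name__ == "__main__":
    w = np.random.randn(4, 256).astype(np.float32)
    q, s = quantize_int4_groupwise(w)
    w_hat = dequantize(q, s)
    print("max abs reconstruction error:", np.abs(w - w_hat).max())
```

AWQ additionally rescales salient weight channels using activation statistics before this step, which is why the full procedure runs the model over calibration data rather than quantizing weights in isolation; the packed on-disk layout TinyChatEngine expects is a separate conversion step not shown here.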