octoml / mlc-llm

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
https://mlc.ai/mlc-llm
Apache License 2.0

[SmoothQuant] Initial implementation for FP8/Int8 #257

Closed. ibsidorenko closed this 7 months ago

ibsidorenko commented 7 months ago

Deelvin/smoothquant integration. This PR adds SmoothQuant support to MLC-LLM.
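
For readers unfamiliar with the technique, here is a minimal sketch of the smoothing step from the SmoothQuant paper. It is not the code in this PR. The function names (`smoothing_scales`, `smooth`, `matmul`), the `alpha=0.5` default, and the `eps` floor are all assumptions made for illustration. The idea is to divide each activation channel by a per-channel scale `s_j = max|X_j|**alpha / max|W_j|**(1 - alpha)` and multiply the matching weight row by the same scale, so the matmul result is unchanged while activation outliers shrink before quantization.

```python
def smoothing_scales(act_absmax, weight_absmax, alpha=0.5, eps=1e-5):
    """Per-input-channel scales s_j = max|X_j|**alpha / max|W_j|**(1 - alpha).

    eps is a hypothetical floor that avoids division by zero on dead channels.
    """
    return [
        max(a, eps) ** alpha / max(w, eps) ** (1.0 - alpha)
        for a, w in zip(act_absmax, weight_absmax)
    ]


def matmul(x, w):
    """Plain-Python matrix product: x is (tokens, c_in), w is (c_in, c_out)."""
    return [
        [sum(row[k] * w[k][j] for k in range(len(w))) for j in range(len(w[0]))]
        for row in x
    ]


def smooth(x, w, alpha=0.5):
    """Return (x_smoothed, w_smoothed, scales) with x_s @ w_s == x @ w."""
    c_in = len(w)
    # Per-channel absolute maxima; in practice the activation statistics
    # would come from a calibration dataset rather than a single batch.
    act_absmax = [max(abs(row[k]) for row in x) for k in range(c_in)]
    weight_absmax = [max(abs(v) for v in w[k]) for k in range(c_in)]
    s = smoothing_scales(act_absmax, weight_absmax, alpha)
    x_s = [[row[k] / s[k] for k in range(c_in)] for row in x]  # X * diag(s)^-1
    w_s = [[v * s[k] for v in w[k]] for k in range(c_in)]      # diag(s) * W
    return x_s, w_s, s
```

With `alpha=0.5` the smoothed activation and weight ranges of each channel become equal, which is why the paper uses that value as a starting point. The smoothed tensors would then go through ordinary Int8 or FP8 quantization, which this sketch does not cover.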