PygmalionAI / aphrodite-engine

Large-scale LLM inference engine
https://aphrodite.pygmalion.chat
GNU Affero General Public License v3.0

[Bug]: exl2 is not auto detected #331

Closed: nivibilla closed this issue 6 months ago

nivibilla commented 6 months ago

Your current environment

N/A

🐛 Describe the bug

Loading an exl2 model without specifying `--quantization exl2` tries to load it with quantization mode None. Manually specifying that it is an exl2 quant works (see the sketch below).
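
For reference, a minimal sketch of the workaround via the Python API, assuming aphrodite's vLLM-style `LLM` class with a `quantization` parameter; the model path here is hypothetical:

```python
from aphrodite import LLM, SamplingParams

# Hypothetical local path to an exl2-quantized model; swap in your own.
MODEL_PATH = "/models/my-model-exl2"

# Without quantization="exl2", the engine falls back to quantization mode None
# and fails to load the exl2 weights; passing it explicitly works.
llm = LLM(model=MODEL_PATH, quantization="exl2")

outputs = llm.generate(["Hello, world!"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```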

AlpinDale commented 6 months ago

Unfortunately, it's a bit difficult to auto-detect exl2: it's the only quant format that doesn't ship a quantization config file, so any guesswork won't always be reliable (the sketch below shows what other formats ship). I've asked turboderp to have exl2 quants ship the config too, and it seems he's doing that now, so future exl2 quants hopefully won't have this problem.
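
For context, a rough illustration of what auto-detection can rely on for other formats, assuming a HuggingFace-style `config.json` that carries a `quantization_config` block; the `detect_quant_method` helper and model path are hypothetical:

```python
import json
from pathlib import Path

def detect_quant_method(model_dir: str) -> str | None:
    """Guess the quantization method from a HF-style config.json, if present.

    Formats like GPTQ and AWQ ship a `quantization_config` block with a
    `quant_method` field; exl2 quants historically shipped no such block,
    so this kind of check returns None for them.
    """
    config_path = Path(model_dir) / "config.json"
    if not config_path.is_file():
        return None
    config = json.loads(config_path.read_text())
    quant_cfg = config.get("quantization_config")
    if quant_cfg is None:
        return None  # nothing to detect: the exl2 case in this issue
    return quant_cfg.get("quant_method")

# Prints e.g. "gptq" or "awq" for those quants, None for an old-style exl2 quant.
print(detect_quant_method("/models/my-model-exl2"))
```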

nivibilla commented 6 months ago

Ah okay np. Thanks for the quick reply!