mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
MIT License

Adding Mistral 7B #95

Open uprokevin opened 11 months ago

uprokevin commented 11 months ago

Checking the constraints for adding Mistral 7B to the list of supported models.

It seems it has already been benchmarked with AutoAWQ:

https://github.com/casper-hansen/AutoAWQ

tonylins commented 11 months ago

Thanks for the question. I do not think it would be hard to support Mistral. We will take a look when there is bandwidth.

You may also try modifying the code to support Mistral yourself, following the logic here: https://github.com/mit-han-lab/llm-awq/blob/main/awq/quantize/pre_quant.py
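
For reference, here is a minimal sketch of the kind of change being suggested. It assumes the per-architecture dispatch style used in `awq/quantize/pre_quant.py` (branches that return a model's decoder blocks and move its embeddings for calibration); the function names `get_blocks` and `move_embed` and the exact branches below are illustrative, not the repository's actual code. Mistral's module layout mirrors Llama's (`model.model.layers`, `model.model.embed_tokens`), which is why adding support is expected to be straightforward.

```python
# Illustrative sketch: per-architecture accessors extended with a Mistral branch.
import torch
from transformers import AutoModelForCausalLM


def get_blocks(model):
    """Return the list of transformer decoder blocks to be quantized."""
    cls_name = model.__class__.__name__
    if cls_name in ("LlamaForCausalLM", "MistralForCausalLM"):
        # Mistral exposes its decoder layers the same way Llama does.
        return model.model.layers
    if cls_name == "OPTForCausalLM":
        return model.model.decoder.layers
    raise NotImplementedError(f"Unsupported model type: {cls_name}")


def move_embed(model, device):
    """Move the token embeddings to `device` before running calibration."""
    cls_name = model.__class__.__name__
    if cls_name in ("LlamaForCausalLM", "MistralForCausalLM"):
        model.model.embed_tokens = model.model.embed_tokens.to(device)
    elif cls_name == "OPTForCausalLM":
        model.model.decoder.embed_tokens = model.model.decoder.embed_tokens.to(device)
    else:
        raise NotImplementedError(f"Unsupported model type: {cls_name}")


if __name__ == "__main__":
    # Usage example (downloads the checkpoint): confirm the accessor finds
    # Mistral 7B's decoder blocks before wiring it into the quantization flow.
    model = AutoModelForCausalLM.from_pretrained(
        "mistralai/Mistral-7B-v0.1", torch_dtype=torch.float16
    )
    print(f"Found {len(get_blocks(model))} decoder blocks")
```

Since the per-layer scaling and clipping operate on the decoder blocks returned above, reusing the Llama branches for Mistral would likely cover most of the work; any remaining differences (e.g. sliding-window attention) do not change the weight layout being quantized.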

cassianlewis commented 10 months ago

Any update on this?

xinyual commented 8 months ago

Any update?