mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
MIT License

Update pyproject.toml #156

Closed · Louym closed this 3 months ago

Louym commented 3 months ago

When running the AWQ search for Llama-2 with transformers>=4.38.0, I hit the bug below:

```
File "/×××/llm-awq/awq/quantize/auto_scale.py", line 134, in _search_module_scale
RuntimeError: The expanded size of the tensor (4608) must match the existing size (4096) at non-singleton dimension 3. Target sizes: [65, 32, 512, 4608]. Tensor sizes: [65, 1, 512, 4096]
```

This does not happen with earlier versions of transformers, so I pinned transformers==4.36.2.
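For context, a minimal sketch of the kind of pin this PR applies in pyproject.toml, assuming a PEP 621-style `[project]` table; the actual layout of llm-awq's dependency list may differ:

```toml
[project]
dependencies = [
    # Pin transformers below 4.38.0: that release triggers the tensor
    # shape mismatch in _search_module_scale (see the traceback above).
    # Assumption: PEP 621 layout; the real file may use a different table.
    "transformers==4.36.2",
]
```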

ys-2020 commented 3 months ago

I will merge this PR. cc @Louym @kentang-mit @tonylins @Sakits

vince62s commented 3 months ago

It's fine as a temporary fix, but did you find the root cause of this?