casper-hansen / AutoAWQ

AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:
https://casper-hansen.github.io/AutoAWQ/
MIT License

Error when using AutoAWQ to quantize Qwen2-72B-Instruct #577

Open ving666 opened 3 months ago

ving666 commented 3 months ago

  File "/home/qx/.local/lib/python3.10/site-packages/awq/models/base.py", line 231, in quantize
    self.quantizer.quantize()
  File "/home/qx/.local/lib/python3.10/site-packages/awq/quantize/quantizer.py", line 166, in quantize
    scales_list = [
  File "/home/qx/.local/lib/python3.10/site-packages/awq/quantize/quantizer.py", line 167, in &lt;listcomp&gt;
    self._search_best_scale(self.modules[i], **layer)
  File "/home/qx/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/qx/.local/lib/python3.10/site-packages/awq/quantize/quantizer.py", line 330, in _search_best_scale
    best_scales = self._compute_best_scale(
  File "/home/qx/.local/lib/python3.10/site-packages/awq/quantize/quantizer.py", line 391, in _compute_best_scale
    self.pseudo_quantize_tensor(fc.weight.data)[0] / scales_view
  File "/home/qx/.local/lib/python3.10/site-packages/awq/quantize/quantizer.py", line 79, in pseudo_quantize_tensor
    assert torch.isnan(w).sum() == 0
AssertionError
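For context: the assertion that fires is AutoAWQ's NaN guard in `pseudo_quantize_tensor` (`assert torch.isnan(w).sum() == 0`), which refuses to quantize a weight tensor containing NaN. This typically means some weights were already NaN before quantization (e.g. a corrupted checkpoint or an fp16 overflow during loading). A minimal sketch of the same check, in pure Python on nested lists for illustration (the real code calls `torch.isnan` on each layer's weight tensor):

```python
import math

def count_nans(weights):
    """Count NaN entries in a (possibly nested) list of floats.

    Illustrative stand-in for AutoAWQ's guard
    `assert torch.isnan(w).sum() == 0` in pseudo_quantize_tensor.
    """
    total = 0
    for item in weights:
        if isinstance(item, list):
            total += count_nans(item)  # recurse into nested rows
        elif math.isnan(item):
            total += 1
    return total

# A toy "weight matrix" with one corrupted entry.
w = [[0.1, float("nan")], [0.3, 0.4]]
print(count_nans(w))  # → 1
```

On the actual model, the equivalent diagnostic would be to loop over `model.named_parameters()` before quantizing and print any parameter where `torch.isnan(p).sum() > 0`, to confirm whether the NaNs exist in the loaded checkpoint itself.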

casper-hansen commented 3 months ago

Can you provide more details and your setup and the code used?

ving666 commented 3 months ago

> Can you provide more details and your setup and the code used?

Release: v0.2.6. My code: 1

Soulscb commented 2 months ago

Did you solve your problem?