Open langxinspieder opened 4 weeks ago
Hi, I got a similar issue when generating real quantized weights with w_bit 3:
File "/home/username/llm-awq/awq/quantize/qmodule.py", line 83, in __init__
raise NotImplementedError("Only 4-bit are supported for now.")
Any solutions?
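For context, judging from the traceback, the real-quantization module rejects any bit width other than 4 when the layer is constructed. A paraphrased sketch of the kind of guard that would produce exactly this message (inferred from the error, not the actual llm-awq source):

```python
# Paraphrased from the traceback above, not the actual llm-awq source:
# the real-weight path only implements 4-bit packing and kernels,
# so any other w_bit is rejected when the quantized linear layer is built.
class WQLinear:
    def __init__(self, w_bit, group_size, in_features, out_features, bias, dev):
        if w_bit != 4:
            raise NotImplementedError("Only 4-bit are supported for now.")
        self.w_bit = w_bit
        self.group_size = group_size
        # ... packed INT4 weight buffers would be allocated here ...
```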
Very nice work! I have run into a problem: I can quantize with w_bit 4 and group size 128, but were the w_bit 3, group size 128 results in the paper obtained with real quantized weights, or are they the PPL from simulated quantization? I ran the command that quantizes successfully at 4 bits, but changing w_bit 4 to w_bit 3 produced the error shown in the attached screenshot. What should I do?
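If the paper's 3-bit numbers come from simulated (fake) quantization, no 3-bit kernel is needed: the weights are quantized and immediately dequantized in floating point, and PPL is measured on the dequantized model. A minimal sketch of group-wise asymmetric fake quantization, using a hypothetical helper of my own rather than the llm-awq API:

```python
import torch

def pseudo_quantize(w: torch.Tensor, n_bit: int = 3, group_size: int = 128) -> torch.Tensor:
    """Quantize w to n_bit integers group-wise, then dequantize back to float.
    Illustrative stand-in for simulated quantization; not the llm-awq API."""
    assert w.numel() % group_size == 0, "weight size must be divisible by group_size"
    orig_shape = w.shape
    w = w.reshape(-1, group_size)
    # Per-group asymmetric range -> scale and zero point on a (2^n - 1) grid.
    max_val = w.amax(dim=1, keepdim=True)
    min_val = w.amin(dim=1, keepdim=True)
    qmax = 2 ** n_bit - 1
    scale = (max_val - min_val).clamp(min=1e-5) / qmax
    zero = (-min_val / scale).round()
    w_q = (w / scale + zero).round().clamp(0, qmax)  # integer grid
    return ((w_q - zero) * scale).reshape(orig_shape)  # back to float

# Example: 3-bit, group size 128; weights stay fp32, so no custom kernel is needed.
w3 = pseudo_quantize(torch.randn(4096, 4096), n_bit=3, group_size=128)
```

Because the fake path keeps weights in floating point, it can report 3-bit PPL even though the real-weight path (the WQLinear guard above) only supports 4-bit.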