for fc in linears2scale:
    fc.weight.mul_(scales.view(1, -1).to(fc.weight.device))
    fc.weight.data = w_quantize_func(fc.weight.data) / (scales.view(1, -1))
In the function _search_module_scale (https://github.com/mit-han-lab/llm-awq/blob/79019832efd37e4c24a695442880190858aa605e/awq/quantize/auto_scale.py#L131), the FC weights are updated in place for each scale used in the grid search. But shouldn't the weights be reset to their original values before the next iteration? Otherwise, wouldn't the scale values be compounded?
Or am I missing something here?
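
To make the concern concrete, here is a minimal pure-Python sketch; it is not the repository's code. The names fake_quantize, apply_scale_and_quantize, and grid_search are hypothetical, and fake_quantize is only an assumed stand-in for w_quantize_func (simple rounding to a uniform grid). The sketch compares a grid search that restores a snapshot of the weights after every step with one that does not, and records which weights each step starts from.

```python
def fake_quantize(row, step=0.25):
    """Hypothetical stand-in for w_quantize_func: round to a uniform grid."""
    return [round(v / step) * step for v in row]


def apply_scale_and_quantize(weight, scales):
    """Mirror the two lines in the loop: scale columns, quantize, unscale."""
    result = []
    for row in weight:
        scaled = [w * s for w, s in zip(row, scales)]
        quantized = fake_quantize(scaled)
        result.append([q / s for q, s in zip(quantized, scales)])
    return result


def grid_search(weight, scale_grid, restore):
    """Return the weights that each grid step starts from."""
    original = [row[:] for row in weight]  # snapshot taken before the search
    current = [row[:] for row in weight]
    step_inputs = []
    for scales in scale_grid:
        step_inputs.append([row[:] for row in current])
        current = apply_scale_and_quantize(current, scales)
        if restore:
            # Reset so the next candidate scale sees the original weights.
            current = [row[:] for row in original]
    return step_inputs


if __name__ == "__main__":
    w = [[0.3, -0.7], [1.1, 0.2]]
    grid = [[1.0, 1.0], [2.0, 0.5]]
    print("with restore:   ", grid_search(w, grid, restore=True)[1])
    print("without restore:", grid_search(w, grid, restore=False)[1])
```

In this toy setup the second grid step starts from already-quantized weights unless the snapshot is restored, which is the behavior I am asking about. Whether the surrounding loop in auto_scale.py restores the weights somewhere outside the quoted lines is not visible from the snippet alone.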