OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License
663 stars, 50 forks

Enforce minimum CLIPMIN value for the scale. #33

Closed by radi-cho 9 months ago

radi-cho commented 9 months ago

This is minor, but it causes NaN values in my experiments.

As far as I understand, the CLIPMIN bound in quantizer.py is meant to keep the scale from becoming too small and causing a division by zero. However, right after the scale is clamped to the valid range, it is set back to its original, unclamped value, so the clamp has no effect. In the current implementation, it makes the most sense to remove line 145; anyone who wants no lower bound can simply set CLIPMIN to 0.
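To make the reported pattern concrete, here is a minimal plain-Python sketch of it. It is not the code from quantizer.py: the function names, the `clamp` helper, the `CLIPMAX` bound, and the value `1e-5` for `CLIPMIN` are all assumptions chosen for illustration, and scalars stand in for tensors.

```python
# Illustrative sketch only; not the actual code from quantizer.py.
CLIPMIN = 1e-5  # assumed lower bound for the scale (hypothetical value)
CLIPMAX = 1e4   # assumed upper bound for the scale (hypothetical value)


def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))


def scale_overwritten(xmin, xmax, n_bits=4):
    """Pattern described in the issue: the clamped result is discarded."""
    raw = (xmax - xmin) / (2 ** n_bits - 1)
    scale = clamp(raw, CLIPMIN, CLIPMAX)
    scale = raw  # overwrites the clamped value, so CLIPMIN has no effect
    return scale


def scale_clamped(xmin, xmax, n_bits=4):
    """Proposed behaviour: keep the clamped value."""
    raw = (xmax - xmin) / (2 ** n_bits - 1)
    return clamp(raw, CLIPMIN, CLIPMAX)


# A constant input has zero range, so the raw scale is exactly 0.
bad = scale_overwritten(0.5, 0.5)
good = scale_clamped(0.5, 0.5)
print(bad, good)
```

With a zero-range input, `bad` is exactly 0 while `good` equals `CLIPMIN`. Dividing by a zero scale is what would produce NaN or inf values in a tensor library, which matches the symptom reported here. For inputs with an ordinary range, both functions return the same scale.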

Issue #31 might also be related.

ChenMnZ commented 9 months ago

Thanks for your proposal.