OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Questions about quantization #82

mxjmtxrm closed this issue 1 month ago

mxjmtxrm commented 1 month ago

Hi, great work! I ran into some problems during 4-bit weight-only quantization (--lwc).

  1. Is it a problem if the norm is NaN?
  2. What are the best LWC hyperparameters (such as lwc-lr and epochs) for Llama 2 at different model sizes?
  3. Does more calibration data give better results?