OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Questions about quantization #81

Open · mxjmtxrm opened this issue 1 month ago

mxjmtxrm commented 1 month ago

Hi, great work! I ran into some problems during 4-bit weight-only quantization (--lwc).

  1. Is it a problem if the norm reported in the training log becomes NaN? (See the short sketch after this list for the check I mean.)
  2. What are the best LWC hyper-parameters (e.g., lwc-lr and number of epochs) for LLaMA-2 at different model sizes?
  3. Does more calibration data lead to better results?

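To make question 1 concrete: the norm I mean is the gradient norm reported during the LWC epochs. Below is a rough sketch of the check I added locally to catch it early; the function and the `lwc_params` name are mine for illustration, not taken from OmniQuant's actual training code.

```python
import math

import torch


def check_finite(loss, lwc_params, step):
    """Call after loss.backward(); raise if the loss or gradient norm is NaN/Inf.

    `lwc_params` stands for the learnable weight-clipping parameters being
    optimized; the name is illustrative, not OmniQuant's actual variable.
    """
    # With max_norm=inf, clip_grad_norm_ only computes the total norm and does not rescale.
    grad_norm = torch.nn.utils.clip_grad_norm_(lwc_params, max_norm=float("inf"))
    if not math.isfinite(loss.item()) or not math.isfinite(grad_norm.item()):
        raise RuntimeError(
            f"non-finite value at step {step}: "
            f"loss={loss.item()}, grad_norm={grad_norm.item()}"
        )
    return grad_norm
```
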
I quantized a LLaMA model with different LWC hyper-parameters and got very different results (perplexity measured as in the evaluation sketch at the end of this post):

  1. With nsamples=1000, batch_size=1, epochs=2, the perplexity is reasonable.
  2. With nsamples=2000, batch_size=8, epochs=10, the perplexity is extremely large (40,000+). What could be causing this?
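For reference, this is roughly how I measure the perplexity numbers above: the usual WikiText-2 test-set evaluation over non-overlapping 2048-token windows, using transformers and datasets. The checkpoint path is a placeholder for my fake-quantized model, not a path produced by any specific OmniQuant flag.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/path/to/quantized-llama"  # placeholder for the fake-quantized checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

# Concatenate the WikiText-2 test split and evaluate in non-overlapping 2048-token windows.
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
enc = tokenizer("\n\n".join(test["text"]), return_tensors="pt")
seqlen = 2048
n_windows = enc.input_ids.shape[1] // seqlen

nlls = []
with torch.no_grad():
    for i in range(n_windows):
        ids = enc.input_ids[:, i * seqlen:(i + 1) * seqlen].to(model.device)
        # The model shifts labels internally, so .loss is the mean next-token NLL.
        loss = model(ids, labels=ids).loss
        nlls.append(loss.float() * seqlen)

ppl = torch.exp(torch.stack(nlls).sum() / (n_windows * seqlen))
print(f"WikiText-2 perplexity: {ppl.item():.2f}")
```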