OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License

Update omniquant.py #36

Closed brisker closed 10 months ago

brisker commented 10 months ago

During the test run right after training, the scales are float32, so the scale-merging step runs in fp32; in a resume-test, the scales and the merging step are both float16. This difference causes a slight accuracy difference. So we need to cast the scales to float16 immediately after training finishes, before merging the scales into the weights.

I have validated this in one case: the accuracy of "test right after training" and "resume-test" is now exactly the same.
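The mismatch described above can be reproduced in isolation. The sketch below is illustrative only (the function names are hypothetical, not OmniQuant's actual code): it emulates half precision with the standard library's `struct` 'e' format and shows that merging a full-precision scale into a weight, then rounding, can give a different result than rounding the scale to float16 first, which is what happens after a checkpoint resume. The proposed fix is to always round the scale first so both paths agree.

```python
import struct

def to_fp16(x: float) -> float:
    # Round a Python float to the nearest IEEE 754 binary16 value,
    # via struct's half-precision 'e' format.
    return struct.unpack('<e', struct.pack('<e', x))[0]

def merge_after_training(weight: float, scale: float) -> float:
    # "Test right after training": the learned scale is still full
    # precision, so the product is computed at full precision and only
    # the merged result is stored in half precision.
    return to_fp16(weight * scale)

def merge_after_resume(weight: float, scale: float) -> float:
    # "Resume-test": the checkpoint stores scales in float16, so the
    # scale is rounded to half precision *before* the merge.
    # Casting scales to fp16 right after training makes the first path
    # identical to this one.
    return to_fp16(weight * to_fp16(scale))

# Example of divergence: 1.0004 rounds to 1.0 in fp16, so the two
# merge orders land on different fp16 weights.
print(merge_after_training(10.0, 1.0004))  # 10.0078125
print(merge_after_resume(10.0, 1.0004))   # 10.0
```

Because both test paths now round the scale before the merge, the merged fp16 weights are bit-identical, which matches the observation that the accuracies become exactly the same.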

ChenMnZ commented 10 months ago

Good work, thanks for your contribution.