OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
MIT License

Some questions about the results of weight-only quantization in the paper #78

Closed: everloom closed this issue 2 months ago

everloom commented 2 months ago

Thank you for your outstanding work. I have a question about the weight-only quantization results in Table 1 of the paper. My understanding is that per-channel quantization (W3A16) is more fine-grained than per-group quantization (W3A16g128), so I would expect the W3A16 results to be better than the W3A16g128 results. However, in the table every method performs worse with W3A16 than with W3A16g128. Could you explain why? Thank you.
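
For context on the two settings being compared: W3A16 denotes 3-bit weights with 16-bit activations, and the g128 suffix denotes a group size of 128. In the usual group-wise scheme, each output channel's weight row is split into groups of 128 consecutive weights, and each group gets its own scale and zero-point, so g128 stores more quantization parameters per row than a single per-channel scale does. The sketch below compares the reconstruction error of the two granularities on one synthetic weight row. It assumes plain min-max round-to-nearest quantization, not OmniQuant's learnable weight clipping, and the names `quantize_rtn` and `quant_mse` are hypothetical helpers written for this illustration only.

```python
import random


def quantize_rtn(values, n_bits=3):
    """Asymmetric min-max round-to-nearest quantization of one group.

    Returns the dequantized values (assumption: plain RTN, no learned clipping).
    """
    lo, hi = min(values), max(values)
    levels = 2 ** n_bits - 1
    # Guard against a constant group, where the range is zero.
    scale = (hi - lo) / levels if hi > lo else 1.0
    return [round((v - lo) / scale) * scale + lo for v in values]


def quant_mse(row, group_size=None, n_bits=3):
    """Mean squared quantization error for one weight row (one output channel).

    group_size=None uses a single scale/zero-point for the whole row
    (per-channel); otherwise each block of `group_size` weights gets its own.
    """
    g = len(row) if group_size is None else group_size
    dequantized = []
    for start in range(0, len(row), g):
        dequantized.extend(quantize_rtn(row[start:start + g], n_bits))
    return sum((w - d) ** 2 for w, d in zip(row, dequantized)) / len(row)


# Synthetic weight row: 4096 Gaussian samples (an assumption, not real model weights).
random.seed(0)
row = [random.gauss(0.0, 1.0) for _ in range(4096)]

mse_per_channel = quant_mse(row)                # one scale for all 4096 weights
mse_group128 = quant_mse(row, group_size=128)   # one scale per 128 weights

print(f"per-channel MSE: {mse_per_channel:.4f}")
print(f"group-128 MSE:   {mse_group128:.4f}")
```

In this toy setting the group-wise error comes out lower, because each group's min-max range is no wider than the range of the full row and the 3-bit grid is therefore no coarser within a group. This only illustrates the granularity effect and is not a reproduction of the numbers in Table 1.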