Thank you for your outstanding work.
I have a question about the weight-only quantization results in Table 1 of the paper. My understanding is that per-channel quantization (W3A16) is more fine-grained than per-group quantization (W3A16g128), so W3A16 should give better results than W3A16g128. However, in the table, every method performs worse with W3A16 than with W3A16g128. Could you explain why? Thank you.