mit-han-lab / lmquant

Apache License 2.0

evaluate accuracy #4

Closed: cyLi-Tiger closed this issue 3 months ago

cyLi-Tiger commented 3 months ago

Thanks for your great work!

I used lmquant to generate the checkpoints for Qwen1.5-72b-chat with

python -m lmquant.llm.run configs/llm.yaml configs/qoq/gchn.yaml --model-name path_to_my_checkpoints --smooth-xw-alpha 0.3 --smooth-xw-beta 0.7

and then used run_e2e.sh in qserve to generate tokens, but the result seems wrong.

Is there a way to run end-to-end inference without qserve, so that I can evaluate the accuracy of the w4a8kv4 algorithm on its own?

synxlin commented 3 months ago

Hi @cyLi-Tiger, if you would like to evaluate accuracy alone, you can use lmquant to evaluate the fake-quant model. To evaluate WikiText-2 perplexity, use the command

python -m lmquant.llm.run configs/llm.yaml configs/qoq/gchn.yaml --model-name path_to_my_checkpoints --smooth-xw-alpha 0.3 --smooth-xw-beta 0.7

To evaluate zero-shot accuracy, use the command

python -m lmquant.llm.run configs/llm.yaml configs/qoq/gchn.yaml --model-name path_to_my_checkpoints --smooth-xw-alpha 0.3 --smooth-xw-beta 0.7 --eval-evaluator lm_eval --eval-tasks zero-shot

which will automatically add wikitext, hellaswag, piqa, winogrande, arc_easy, and arc_challenge to the evaluation tasks run through lm_eval.
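For readers unfamiliar with the term, "fake-quant" evaluation generally means that values are rounded to the low-bit grid and immediately converted back to floating point, so the model runs with ordinary float kernels while carrying the quantization error. The sketch below illustrates that idea with symmetric per-tensor quantization in plain Python. The function name `fake_quantize` is hypothetical and this is not lmquant's implementation, which applies its own QoQ settings.

```python
def fake_quantize(values, num_bits=4):
    """Quantize floats to a signed integer grid, then dequantize back to floats.

    Hypothetical helper for illustration only; not part of lmquant.
    """
    qmax = 2 ** (num_bits - 1) - 1       # 7 for 4 bits
    qmin = -(2 ** (num_bits - 1))        # -8 for 4 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0

    dequantized = []
    for v in values:
        q = round(v / scale)             # nearest integer code
        q = max(qmin, min(qmax, q))      # clamp to the representable range
        dequantized.append(q * scale)    # back to float, now carrying the error
    return dequantized, scale


if __name__ == "__main__":
    weights = [1.75, -0.5, 0.25, 0.0, 0.3]
    approx, scale = fake_quantize(weights)
    print("scale:", scale)
    print("fake-quantized:", approx)
```

Because the output is still floating point, perplexity or zero-shot accuracy measured this way reflects the quantization error alone, independent of any low-bit inference kernels.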

cyLi-Tiger commented 3 months ago

Thanks for your prompt reply! @synxlin

Take the ppl evaluation in QoQ as an example: here, the kv cache is unused, right? If so, the scripts you provided can't evaluate the accuracy under kv4. Please correct me if I missed something in your code. Thank you again!

cyLi-Tiger commented 3 months ago

Another question: per-group a8w4 with progressive quantization has scales from int8 to int4. What if I want to use per-channel a8w4? How do I dequantize the weights from int4 to int8 before the GEMM? In that case we convert the weights from bf16 to int4 directly, so there is no such scale between int4 and int8.
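The two-level scaling this question refers to can be sketched roughly as follows: a float per-channel scale maps weights to int8 codes, and an integer per-group scale and zero point map those int8 codes to 4-bit codes, so that dequantizing from int4 back to int8 needs only integer arithmetic. All function names are hypothetical, the 119 clamp is an assumption chosen here so that the reconstructed value stays inside the int8 range, and the rounding choices are illustrative; this is not taken from the lmquant source.

```python
import math


def quantize_channel_int8(weights, qmax=119):
    """Level 1: symmetric per-channel quantization of floats to int8 codes.

    qmax=119 is an assumed protective range, not a confirmed lmquant setting.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def quantize_group_uint4(codes_int8):
    """Level 2: asymmetric quantization of one group of int8 codes to 4 bits."""
    lo, hi = min(codes_int8), max(codes_int8)
    scale = max(1, math.ceil((hi - lo) / 15))        # integer int8-to-int4 scale
    zero = max(0, min(15, round(-lo / scale)))       # 4-bit zero point
    codes = [max(0, min(15, round(c / scale) + zero)) for c in codes_int8]
    return codes, scale, zero


def dequantize_group_to_int8(codes_uint4, scale, zero):
    """Integer-only reconstruction of int8 values from 4-bit codes."""
    return [(c - zero) * scale for c in codes_uint4]


if __name__ == "__main__":
    channel = [119.0, -119.0, 32.0, 0.0]             # one channel, one group
    codes8, scale_fp = quantize_channel_int8(channel)
    codes4, scale_int, zero = quantize_group_uint4(codes8)
    restored8 = dequantize_group_to_int8(codes4, scale_int, zero)
    restored_fp = [v * scale_fp for v in restored8]
    print(codes8, codes4, restored8, restored_fp)
```

In this sketch the int4-to-int8 step exists only because the second level stores its own scale and zero point per group, which is exactly what a direct bf16-to-int4 per-channel conversion does not provide.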

synxlin commented 3 months ago

Hi @cyLi-Tiger.

cyLi-Tiger commented 3 months ago

All my questions are well answered, thanks!