OpenGVLab / OmniQuant

[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Obtained different PPL for Wikitext and C4 compared to results reported in the paper #95

yc2367 opened this issue 1 month ago

yc2367 commented 1 month ago

Hi, thank you so much for the amazing paper and repo.

I am trying to reproduce the WikiText-2 and C4 perplexity results from the OmniQuant paper. I downloaded the repo and ran the following experiment:

CUDA_VISIBLE_DEVICES=0 python main.py --model meta-llama/Llama-2-7b-hf --epochs 20 --output_dir ./log/llama-7b-w3a16g128 --eval_ppl --wbits 3 --abits 16 --group_size 128 --lwc

According to the paper, the perplexities on WikiText-2 and C4 for Llama-2-7B at W3A16g128 should be 6.03 and 7.75, respectively, but I obtained the following results in my log:

[2024-09-12 03:20:31 root] (main.py 144): INFO wikitext2 : 6.098666191101074
[2024-09-12 03:23:30 root] (main.py 144): INFO c4 : 7.8100385665893555
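For context, the logged numbers are standard sliding-window perplexities: the test split is concatenated into one token stream, cut into fixed-length segments, and the mean token-level negative log-likelihood is exponentiated. Below is a minimal sketch of that metric using the Hugging Face `datasets`/`transformers` APIs, assuming a 2048-token context; it illustrates what the log reports, not the repo's exact evaluation code.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # the model from the command above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="cuda"
)
model.eval()

# Concatenate the WikiText-2 test split and tokenize it as one long stream.
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
ids = tokenizer("\n\n".join(test["text"]), return_tensors="pt").input_ids

seqlen = 2048  # assumed evaluation context length
losses = []
for i in range(ids.size(1) // seqlen):
    batch = ids[:, i * seqlen : (i + 1) * seqlen].cuda()
    with torch.no_grad():
        # Passing labels=batch makes HF compute the shifted-token mean NLL.
        losses.append(model(batch, labels=batch).loss.float())

# Perplexity is the exponential of the mean negative log-likelihood.
ppl = torch.exp(torch.stack(losses).mean())
print(f"wikitext2 ppl: {ppl.item():.4f}")
```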

Did I set the hyperparameters incorrectly? I hope you can help me clarify. Thanks!

ChenMnZ commented 1 week ago

The released checkpoints have some mismatches with the current code.

Retraining with the current code should successfully reproduce the reported results.
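For reference, retraining from scratch uses the same command as in the original post. If you instead want to evaluate previously saved OmniQuant parameters without retraining, the repo's README describes a --resume flag used together with --epochs 0; the command below is an illustrative sketch (the checkpoint path is a placeholder, and the flag usage is assumed from the README rather than verified against the latest code):

CUDA_VISIBLE_DEVICES=0 python main.py --model meta-llama/Llama-2-7b-hf --epochs 0 --output_dir ./log/test --eval_ppl --wbits 3 --abits 16 --group_size 128 --lwc --resume /PATH/TO/omni_parameters.pth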

yc2367 commented 1 week ago

Thank you for the quick response!