quic / aimet-model-zoo


Reproduce the PPL accuracy anomaly of GPT2 W8A8 (PPL=17590.9778) #46

Status: Open · opened by FlyingPotatoZ 12 months ago

FlyingPotatoZ commented 12 months ago

I am using the gpt2 model and testing its quantization accuracy.
Model download: https://github.com/quic/aimet-model-zoo/releases/download/torch_gpt2/gpt2_wikitext_finetune.tar.gz
Test data: wikitext-2-raw-v1

| Item | Description |
| -- | -- |
| AIMET | 1.28.0 |
| Linux kernel | 20.04 |
| cuda | 11.6 |
| torch | torch1.13.1-cu116 |
| python | 3.8.10 |
| aimet-zoo-torch | 1.5.0 |

The fp32 accuracy is correct, but the W8A8 perplexity is extremely large. The results are as follows:

```
aimet_zoo_torch/gpt2/evaluators# python gpt2_quanteval.py --model_config gpt2_w8a8 --per_device_eval_batch_size 8
2023-10-19 02:52:23,612 - root - INFO - AIMET
2023-10-19 02:52:39,262 - datasets.builder - WARNING - Reusing dataset wikitext (/root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0)
100%|████████████████████████████████████████| 3/3 [00:00<00:00, 712.27it/s]
2023-10-19 02:52:39,374 - datasets.arrow_dataset - WARNING - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/cache-957e58d88e4ab49c.arrow
2023-10-19 02:52:39,407 - datasets.arrow_dataset - WARNING - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/cache-10932a0976197214.arrow
2023-10-19 02:52:39,440 - datasets.arrow_dataset - WARNING - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/cache-ba370f2b62ba6d71.arrow
2023-10-19 02:52:39,452 - datasets.arrow_dataset - WARNING - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/cache-1252412874756be5.arrow
2023-10-19 02:52:39,464 - datasets.arrow_dataset - WARNING - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/cache-cfab500129fdf76e.arrow
2023-10-19 02:52:39,476 - datasets.arrow_dataset - WARNING - Loading cached processed dataset at /root/.cache/huggingface/datasets/wikitext/wikitext-2-raw-v1/1.0.0/cache-af35feebb8a10af8.arrow
orig model fp32 inference loss: 3.320616739840547 , ppl: 27.67741506034785
/usr/local/lib/python3.8/dist-packages/aimet_zoo_torch/gpt2/model/huggingface/baseline_models/gpt2/modeling_gpt2.py:188: TracerWarning: Converting a tensor to a Python float might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  w = w / (float(v.size(-1)) ** 0.5)
2023-10-19 02:52:51,367 - Quant - INFO - Unsupported op type Squeeze
2023-10-19 02:52:51,368 - Quant - INFO - Unsupported op type Mean
2023-10-19 02:52:51,542 - Quant - INFO - Selecting DefaultOpInstanceConfigGenerator to compute the specialized config. hw_version:default
loss: 3.1809085607528687 , ppl: 24.06861141667116
sim_orig model int8 inference loss: 9.775141424384 , ppl: 17590.977796391602
2023-10-19 02:53:10,600 - main - INFO - Original model performances
2023-10-19 02:53:10,601 - main - INFO - ===========================
2023-10-19 02:53:10,601 - main - INFO - Original Model | 32-bit Environment | perplexity : 27.6774
2023-10-19 02:53:10,601 - main - INFO - Original Model | 8-bit Environment | perplexity: 17590.9778
```
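For context, the perplexities in the log are consistent with the reported losses: perplexity is just exp of the mean cross-entropy loss, so the anomaly is in the W8A8 loss itself (9.78 vs 3.32 for fp32), not in how the script converts loss to PPL. A minimal standalone check (loss values copied from the log above, labels are my reading of the log):

```python
import math

# (label, loss from log, ppl from log)
runs = [
    ("orig model fp32", 3.320616739840547, 27.67741506034785),
    ("run after QuantSim setup", 3.1809085607528687, 24.06861141667116),
    ("sim_orig model int8 (W8A8)", 9.775141424384, 17590.977796391602),
]

for name, loss, reported_ppl in runs:
    # perplexity = exp(mean cross-entropy loss)
    computed = math.exp(loss)
    print(f"{name}: exp({loss}) = {computed:.4f} (log reported {reported_ppl:.4f})")
```

So the W8A8 loss corresponds to a roughly 6.5-nat increase over fp32, which is why the perplexity blows up by three orders of magnitude.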

Are there any issues with my usage?