Vahe1994 / SpQR


[ptb perplexity is different from paper] #16

Open Amshaker opened 1 year ago

Amshaker commented 1 year ago

Hi,

Thank you for sharing your work.

The reproduced perplexity for the PTB dataset using your code does not match the paper. I get 27.8, while the paper reports around 9. Please clarify.

[attached screenshot with the reproduced perplexity results]
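For context, PTB perplexity is usually measured by concatenating the test split and scoring fixed-length windows with the causal LM. The sketch below shows that standard protocol; the checkpoint name, window length, and joining convention are illustrative assumptions, not the repo's exact evaluation code:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "huggyllama/llama-7b"  # assumption: one of the LLaMA-7B checkpoints discussed in this thread
seqlen = 2048                       # window length typically used for this benchmark

tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)
model.eval()

# Concatenate the PTB test split and score non-overlapping windows.
test = load_dataset("ptb_text_only", "penn_treebank", split="test")
ids = tokenizer(" ".join(test["sentence"]), return_tensors="pt").input_ids

nlls = []
for i in range(ids.shape[1] // seqlen):
    window = ids[:, i * seqlen : (i + 1) * seqlen].to(model.device)
    with torch.no_grad():
        # labels=window gives the mean next-token NLL over this window
        loss = model(window, labels=window).loss
    nlls.append(loss.float() * seqlen)

ppl = torch.exp(torch.stack(nlls).sum() / (len(nlls) * seqlen))
print(f"PTB perplexity: {ppl.item():.2f}")
```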
Vahe1994 commented 1 year ago

Hi, thank you for reporting the problem. It is pretty strange: 27 ppl on PTB is way too high, while the results for wikitext2 and c4 look reasonable (based on c4 and wiki2 we guess you are running the llama-7b configuration with an average of 4.63 bits).

Could you please provide the exact command you used to launch quantization (the input parameter configuration)? It would also be great to have information about your setup: the versions of the transformers and torch libraries, your CUDA version, and your GPU.
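For anyone gathering these details, a quick way to print them is something like the following (standard torch/transformers/datasets APIs only, nothing specific to SpQR):

```python
import torch, transformers, datasets

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("datasets:", datasets.__version__)
print("CUDA (torch build):", torch.version.cuda)
print("GPU:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "no CUDA device")
```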

Amshaker commented 1 year ago

Thank you for your reply. Yes, I am running the configuration of llama-7b.

This is the command I used:

```bash
python main.py $MODEL_PATH custom \
    --load_from_saved=$PAJAMAS_PATH \
    --wbits 4 \
    --groupsize 16 \
    --perchannel \
    --qq_scale_bits 3 \
    --qq_zero_bits 3 \
    --qq_groupsize 16 \
    --outlier_threshold=0.2 \
    --permutation_order act_order \
    --percdamp 1e0 \
    --nsamples 128
```

The GPU is an A100 40GB, with pytorch==1.12.1, cudatoolkit==11.3.1, and transformers==4.30.2.

ChenMnZ commented 1 year ago

I also ran into this problem.

Vahe1994 commented 1 year ago

Hey @Amshaker, thank you for providing your setup. I attempted to replicate the unusual PTB result using the parameters you provided, but in my own environment, and I obtained results close to 9 ppl, which align with the findings in the paper. To double-check, I asked my colleagues to independently run the script in their own environments, and they obtained the same results as I did (please see the attachment).

I suspect that the issue with the PTB evaluation stems from outdated versions of some libraries in your setup. Specifically, your torch version might be too old and causing the problem (but this is just a guess). Could you please update your setup to the library versions listed in the requirement.txt file and rerun the script? Kindly share your findings so we can determine whether the problem persists or whether updating the libraries resolves it.

My setup: torch 1.13.1, transformers 4.28.dev (as in requirement.txt), datasets 2.10.1, cuda-toolkit 11.6.1, running on an A100 80GB with CUDA 11.6. My colleagues' setup: torch 1.13.1, transformers 4.29.2, datasets 2.10.1.

[attached screenshot with evaluation results close to the paper]
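For anyone comparing against these working setups, a quick sanity check along these lines can flag an older environment; the version bounds below are taken from the setups reported in this thread, not from an authoritative requirements file:

```python
from packaging import version
import torch, transformers, datasets

# Bounds are assumptions based on the working setups reported in this thread.
assert version.parse(torch.__version__) >= version.parse("1.13.1"), torch.__version__
assert version.parse(transformers.__version__) >= version.parse("4.28.0.dev0"), transformers.__version__
assert version.parse(datasets.__version__) >= version.parse("2.10.1"), datasets.__version__
print("library versions match or exceed the setups that reproduced ~9 ppl on PTB")
```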

Magic-lem commented 1 year ago

Hi, thank you for sharing your work. I also ran into this problem. My setup: torch 1.13.1, transformers 4.92.2, datasets 2.10.1, 3090 GPU.

Vahe1994 commented 1 year ago

Hey, sorry for the late response. Could you please specify which LLaMA model you are using? Is it the decapoda or huggyllama variant, or perhaps another one? Is the issue you mentioned limited to the PTB dataset, or does it affect the results on Wiki2 and C4 as well (are they the same as in the paper)? It appears there might be a problem with the tokenization process, and a potential fix for the decapoda model is proposed in this pull request: PR #20.
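To probe the tokenization hypothesis without rerunning the full evaluation, one option is to print how your checkpoint encodes a short PTB sentence and compare the output across environments and checkpoints. The sketch below uses a huggyllama checkpoint as a placeholder and is only a diagnostic, not the fix from PR #20:

```python
from transformers import AutoTokenizer

MODEL_PATH = "huggyllama/llama-7b"  # placeholder: use whatever checkpoint you pass as $MODEL_PATH
tok = AutoTokenizer.from_pretrained(MODEL_PATH, use_fast=False)

sample = "no it was n't black monday"  # first sentence of the PTB test split
ids = tok(sample).input_ids

print("tokenizer class:", type(tok).__name__)
print("bos/eos/unk ids:", tok.bos_token_id, tok.eos_token_id, tok.unk_token_id)
print("encoded:", ids)
print("decoded:", tok.decode(ids))
```

If two setups print different token counts or different special-token ids for the same checkpoint, that would point to the tokenization issue rather than the quantization itself.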


DavidePaglieri commented 7 months ago

Encountered the perplexity problem on PTB using huggyllama.

torch==1.13.1, transformers==4.29.2, datasets>=2.10.1, NVIDIA L4 GPU

brianchmiel commented 5 months ago

I encountered the same ppl problem on PTB. Any update on how to resolve it?