Amshaker opened 1 year ago
Hi, thank you for reporting the problem. It is pretty strange: 27 PPL on PTB is way too high, while the results for WikiText2 and C4 look reasonable (based on the C4 and WikiText2 numbers, we guess you are running the configuration for LLaMA-7B with an average of 4.63 bits).
Could you please provide the details of the command that you used to launch quantization (the input parameter configuration)? It would also be great to have information about your setup: the versions of the transformers and torch libraries, the CUDA version, and your GPU.
Thank you for your reply. Yes, I am running the configuration of llama-7b.
This is the command I used for running:

```bash
python main.py $MODEL_PATH custom \
    --load_from_saved=$PAJAMAS_PATH \
    --wbits 4 \
    --groupsize 16 \
    --perchannel \
    --qq_scale_bits 3 \
    --qq_zero_bits 3 \
    --qq_groupsize 16 \
    --outlier_threshold=0.2 \
    --permutation_order act_order \
    --percdamp 1e0 \
    --nsamples 128
```
The GPU is an A100 40GB, with pytorch==1.12.1, cudatoolkit==11.3.1, transformers==4.30.2.
I also met this problem.
Hey @Amshaker, thank you for providing your setup. I attempted to replicate the unusual PTB outcome using the parameters you provided, but in my own environment. However, I obtained results close to 9 PPL, which align with the findings in the paper. To double-check, I asked my colleagues to independently execute the script in their own environments. They obtained the same results as me (please see the attachment).
I suspect that the issue with the PTB evaluation stems from outdated versions of some of the libraries in your setup. Specifically, your torch version might be too old (but this is just a guess). Could you please update your setup to the library versions mentioned in the requirement.txt file and then rerun the script? Kindly share your findings so we can determine whether the problem persists or updating the libraries resolves it.
My setup: torch 1.13.1, transformers 4.28.dev (as in requirement.txt), datasets 2.10.1, cuda-toolkit 11.6.1, run on an A100 80GB with CUDA 11.6. My colleagues' setup: torch 1.13.1, transformers 4.29.2, datasets 2.10.1.
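If it helps, here is a quick way to check what is actually installed before rerunning (a minimal sketch; compare the output against the pins in requirement.txt):

```python
# Print the library versions relevant to this issue; compare against
# the pins in the repo's requirement.txt before rerunning the script.
import torch
import transformers
import datasets

print("torch:", torch.__version__)                # ~1.13.1 in our setups
print("transformers:", transformers.__version__)  # 4.28.dev / 4.29.2
print("datasets:", datasets.__version__)          # 2.10.1
print("CUDA available:", torch.cuda.is_available(), "| CUDA:", torch.version.cuda)
```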
Hi, thank you for sharing your work. I also met this problem. My setup: torch 1.13.1, transformers 4.29.2, datasets 2.10.1, 3090 GPU.
Hey, sorry for the late response. Could you please specify which LLaMA model you are using? Is it decapoda or huggyllama, or perhaps another variant? Is the issue you mentioned limited to the PTB dataset, or does it affect the results on Wiki2 and C4 as well (are they the same as in the paper)? It appears that there might be a problem with the tokenization process; a potential fix for the decapoda model is proposed in this pull request: PR #20.
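For illustration, the commonly reported workaround for the decapoda tokenizer looks roughly like this (a hedged sketch, not necessarily what PR #20 does; the token ids below follow LLaMA's standard SentencePiece vocabulary):

```python
# Sketch of the usual workaround for the decapoda-research/llama-7b-hf
# tokenizer mismatch (illustration only; see PR #20 for the actual fix).
from transformers import LlamaTokenizer

# The repo's tokenizer_config.json names the class "LLaMATokenizer", which
# newer transformers versions do not recognize; loading LlamaTokenizer
# explicitly sidesteps the class lookup.
tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")

# The shipped config also stores wrong special-token ids; LLaMA's
# SentencePiece model uses 0 = <unk>, 1 = <s>, 2 = </s>.
tokenizer.unk_token_id = 0
tokenizer.bos_token_id = 1
tokenizer.eos_token_id = 2
```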
Encountered the perplexity problem on PTB using huggyllama.
torch==1.13.1, transformers==4.29.2, datasets>=2.10.1, NVIDIA L4 GPU
I encountered the same PPL problem on PTB. Any update on how to resolve it?
I encountered the same problem too. How can it be solved?
Hi,
Thank you for sharing your work.
The reproduced perplexity for the PTB dataset using your code does not match the paper: I get 27.8, while the paper reports around 9. Please clarify.
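For context, by perplexity I mean the standard GPTQ-style evaluation; a minimal sketch of that measurement (assuming the `ptb_text_only` dataset and a 2048-token sequence length — not necessarily this repo's exact evaluation code):

```python
# Sketch of the standard GPTQ-style perplexity evaluation on PTB:
# tokenize the test split as one stream, slice it into 2048-token
# chunks, and report exp(mean negative log-likelihood per token).
import torch
from datasets import load_dataset

def ptb_perplexity(model, tokenizer, seqlen=2048, device="cuda"):
    test = load_dataset("ptb_text_only", "penn_treebank", split="test")
    ids = tokenizer(" ".join(test["sentence"]), return_tensors="pt").input_ids.to(device)
    nsamples = ids.shape[1] // seqlen
    nlls = []
    for i in range(nsamples):
        batch = ids[:, i * seqlen : (i + 1) * seqlen]
        with torch.no_grad():
            # passing labels makes the model return the mean cross-entropy
            loss = model(batch, labels=batch).loss
        nlls.append(loss.float() * seqlen)
    return torch.exp(torch.stack(nlls).sum() / (nsamples * seqlen))
```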