Closed NewDriverLee closed 2 weeks ago
Hi, I ran into the same issue. I am wondering whether the Llama-2-13b-w4a4.pth checkpoint is corrupted?
Hi, I also hit this issue. Have you solved this problem? :blush:
@linloong @FelixMessi @NewDriverLee
Sorry for the late response.
The Llama-2-13B W4A4 checkpoint was corrupted due to some training instability.
We have retrained Llama-2-13B on the latest code and updated the checkpoint on [Hugging Face](https://huggingface.co/ChenMnZ/OmniQuant/blob/main/Llama-2-13b-w4a4.pth).
The results you can obtain from this checkpoint are:
@ChenMnZ
Thanks for your reply!
I also trained Llama-2-13B myself, and the results are below:
which is close to the 12.3 reported in the paper. Your results are better.
By the way, when I ran the code I found the version of the `datasets` library to be a problem: version 2.20.0 seems to be required to download the wikitext2 dataset, while version 2.0.0 is required for the C4 dataset. Which `datasets` version did you use, or is there another way to avoid this conflict?
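For what it's worth, a common cause of this conflict is that older calibration code loads C4 with a config name (`'allenai--c4'`) that newer `datasets` releases removed, while passing `data_files` directly works on recent versions for both datasets. The sketch below (my own helper, not the repo's loader; the shard filename is one real shard of `allenai/c4`) assumes a recent `datasets` release:

```python
def get_calib_data(name, split=None):
    """Load a calibration split with one recent `datasets` release.

    A sketch, not the repo's own loader: instead of the removed
    'allenai--c4' config, C4 is loaded by pointing `data_files` at a
    single shard, which a recent `datasets` version accepts.
    """
    if name not in ("wikitext2", "c4"):
        raise ValueError(f"unknown dataset: {name}")

    from datasets import load_dataset  # lazy import; assumed recent version

    if name == "wikitext2":
        return load_dataset("wikitext", "wikitext-2-raw-v1",
                            split=split or "test")
    # One validation shard is enough for calibration/evaluation samples.
    return load_dataset(
        "allenai/c4",
        data_files={"validation": "en/c4-validation.00000-of-00008.json.gz"},
        split=split or "validation",
    )
```

With this shape, a single pinned `datasets` version can serve both loaders instead of switching between 2.0.0 and 2.20.0.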
Hi, when I tried to reproduce the evaluation results for Llama-2-13b w4a4, I got NaN for both WikiText2 and C4. However, the reproduction results are good for Llama-2-13b w6a6 and Llama-2-7b w4a4, so I assume the experimental settings are OK.
I noticed that you have updated the pretrained OmniQuant parameters of Llama-2-13b w4a4 about 6 months ago.
My script is:
What do you think could be the possible reasons causing this issue?
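One thing worth ruling out first is the checkpoint file itself: if the downloaded `Llama-2-13b-w4a4.pth` contains NaN or Inf values, every perplexity will come out as NaN regardless of the settings. A minimal sketch of such a check (my own helper, not part of OmniQuant; for a real 13B state dict you'd want a vectorized `torch.isfinite` instead of `tolist()`):

```python
import math

def find_bad_tensors(state_dict):
    """Return the names of entries containing NaN or Inf values.

    Accepts anything tensor-like with .flatten().tolist() (e.g. torch
    tensors) or plain lists of floats, so the sketch stays framework-free.
    """
    bad = []
    for name, tensor in state_dict.items():
        values = (tensor.flatten().tolist()
                  if hasattr(tensor, "flatten") else tensor)
        if any(isinstance(v, float) and not math.isfinite(v) for v in values):
            bad.append(name)
    return bad

# usage sketch (torch assumed installed, path from the thread):
# ckpt = torch.load("Llama-2-13b-w4a4.pth", map_location="cpu")
# print(find_bad_tensors(ckpt))  # an empty list means all values are finite
```

If the list is empty, the NaN more likely comes from the quantization settings or an activation overflow during evaluation rather than from the file.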