xvyaward / owq

Code for the AAAI 2024 Oral paper "OWQ: Outlier-Aware Weight Quantization for Efficient Fine-Tuning and Inference of Large Language Models".
https://arxiv.org/abs/2306.02272

Failed to reproduce the LLaMA-7B perplexity on the Penn Treebank (PTB) dataset. #1

Closed. hopef closed this issue 6 months ago.

hopef commented 6 months ago

Thank you for your excellent work. Why am I unable to reproduce your perplexity metric on the Penn Treebank (PTB) dataset?
In the OWQ paper: 12.46. My reproduction: 56.033756256103516.

Here is the output after I execute the command "python main.py huggyllama/llama-7b c4 --wbits 3 --target_bit 3.01". Thank you, and I look forward to your reply.

wikitext2
Evaluating ...
6.676162242889404
Token indices sequence length is longer than the specified maximum sequence length for this model (106527 > 2048). Running this sequence through the model will result in indexing errors
ptb
Evaluating ...
56.033756256103516
Generating validation split: 45576 examples [00:00, 212472.03 examples/s]
Token indices sequence length is longer than the specified maximum sequence length for this model (612151 > 2048). Running this sequence through the model will result in indexing errors
c4
Evaluating ...
8.551858901977539
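
For context on how these numbers are typically produced (and why the "longer than the specified maximum sequence length" warning is usually benign): GPTQ-style evaluation scripts tokenize the whole test corpus at once and then split it into fixed 2048-token chunks before computing perplexity. Below is a minimal sketch of that procedure, assuming the HuggingFace `transformers` API; the function name `eval_perplexity` and the exact chunking convention are illustrative, not the repository's actual code.

```python
# Hypothetical sketch of chunked perplexity evaluation (not the repo's actual code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def eval_perplexity(model, input_ids, seqlen=2048):
    """Split the tokenized corpus into fixed-length chunks and average the NLL."""
    model.eval()
    n_chunks = input_ids.numel() // seqlen
    nlls = []
    for i in range(n_chunks):
        batch = input_ids[:, i * seqlen:(i + 1) * seqlen].to(model.device)
        with torch.no_grad():
            # Passing labels makes the causal LM return the mean cross-entropy loss.
            loss = model(batch, labels=batch).loss
        nlls.append(loss.float() * seqlen)
    return torch.exp(torch.stack(nlls).sum() / (n_chunks * seqlen))

# Usage sketch (checkpoint name and text joining are assumptions):
# tok = AutoTokenizer.from_pretrained("huggyllama/llama-7b", use_fast=False)
# model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b",
#                                              torch_dtype=torch.float16, device_map="auto")
# ids = tok("\n\n".join(test_texts), return_tensors="pt").input_ids  # emits the >2048 warning
# print(eval_perplexity(model, ids))
```

Because the corpus is tokenized as one long sequence and only ever fed to the model in 2048-token slices, the tokenizer's indexing-error warning does not affect the reported perplexity.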
jinjungyu commented 6 months ago

Hi, thank you for your interest in our work! We have updated the code to address the issue of high PPL on the PTB dataset with the LLaMA model. Could you pull the main branch and try again? Thank you :)
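
For readers who hit the same gap: in GPTQ-style pipelines, the PTB perplexity for LLaMA is sensitive to which PTB split is loaded and how the individual sentences are concatenated before tokenization. A minimal sketch of loading the PTB test split with the HuggingFace `datasets` library is shown below; the helper name `get_ptb_test_ids` and the joining convention are illustrative assumptions, not necessarily the exact change made in the updated code.

```python
# Hypothetical sketch of PTB loading for perplexity evaluation; the actual fix
# in the repository's commit may differ.
from datasets import load_dataset
from transformers import AutoTokenizer

def get_ptb_test_ids(model_name="huggyllama/llama-7b"):
    # "ptb_text_only" stores one sentence per row in the "sentence" column.
    testdata = load_dataset("ptb_text_only", "penn_treebank", split="test")
    tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
    # How the sentences are joined (" " vs "\n\n") noticeably changes LLaMA PPL.
    text = " ".join(testdata["sentence"])
    return tokenizer(text, return_tensors="pt").input_ids
```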

hopef commented 6 months ago

Oh, I got the correct PPL with your latest commit. Thanks a lot.