grigorn opened 10 months ago
Hi. Can I check which LLaMA-7B checkpoint you are using? decapoda-research/llama-7b-hf is currently unavailable for my code, and I'm not sure whether that is what causes this difference.
I am using 'yahma/llama-7b-hf'.
Have you tried the copied version of decapoda-research/llama-7b-hf, e.g., https://huggingface.co/baffo32/decapoda-research-llama-7B-hf? We will try that kind of checkpoint in the coming days to see whether the results are reproducible with the checkpoints that are still available.
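For reference, here is a minimal sketch of loading that mirrored checkpoint; the repo ID is the one linked above, and the rest is standard transformers usage rather than LLM-Pruner's exact loading code:

```python
# Minimal sketch: load the mirrored LLaMA-7B checkpoint mentioned above.
# Standard transformers usage; not necessarily how LLM-Pruner loads it.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "baffo32/decapoda-research-llama-7B-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit on a single GPU
    device_map="auto",          # requires the accelerate package
)
print(model.config.model_type, model.num_parameters())
```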
With the checkpoint you specified, I could replicate the metrics. Do you know what the difference between those two is? I thought there was only one LLaMA, so the checkpoints should be the same.
I have no idea about this 😢. I guess the possible reasons may be: (1) an EOS-token issue, or (2) the weights of the two checkpoints differ slightly.
I checked both the model and the tokenizer. The model weights and tokenizer.get_vocab() are the same, but the special tokens differ: for baffo32, all three special tokens are empty strings. Can this be the reason for these differences? If yes, do you know which one is the "true" LLaMA?
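In case it helps anyone reproduce this comparison, here is a minimal sketch of how the two checkpoints can be diffed; the repo IDs are the ones from this thread, and it only uses standard transformers/PyTorch calls:

```python
# Minimal sketch: compare special tokens, vocab, and weights between
# the two LLaMA-7B checkpoints discussed in this thread.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

a_id = "yahma/llama-7b-hf"
b_id = "baffo32/decapoda-research-llama-7B-hf"

tok_a = AutoTokenizer.from_pretrained(a_id)
tok_b = AutoTokenizer.from_pretrained(b_id)

# Vocabularies can match even when the special-token strings differ.
print("vocab equal:", tok_a.get_vocab() == tok_b.get_vocab())
print("a special tokens:", tok_a.special_tokens_map)
print("b special tokens:", tok_b.special_tokens_map)

# Load both models on CPU (needs roughly 2 x 13 GB of RAM in fp16).
model_a = AutoModelForCausalLM.from_pretrained(a_id, torch_dtype=torch.float16)
model_b = AutoModelForCausalLM.from_pretrained(b_id, torch_dtype=torch.float16)

# Compare the weights tensor by tensor.
sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
assert sd_a.keys() == sd_b.keys()
mismatched = [k for k in sd_a if not torch.equal(sd_a[k], sd_b[k])]
print("mismatched tensors:", mismatched or "none")
```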
I ran LLM-Pruner with the command specified in the README to prune LLaMA-7B and got the following results.
Perplexities reported in Table 1 of the paper are WikiText2 - 19.09 and PTB - 34.21. Is there any reason for the difference in these perplexities, especially PTB? Thanks
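For anyone comparing numbers: small perplexity gaps often come from differences in the evaluation protocol, not just the checkpoint. Below is a minimal sketch of the common sliding-window perplexity measurement on WikiText2; it is not necessarily the exact script the paper used, so absolute numbers may differ slightly:

```python
# Minimal sketch: non-overlapping sliding-window perplexity on WikiText2.
# This is a standard protocol, not necessarily the paper's exact script.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yahma/llama-7b-hf"  # or the pruned checkpoint
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16
).to(device).eval()

# PTB is available on the Hub as "ptb_text_only" (text in the "sentence" field).
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
ids = tokenizer("\n\n".join(test["text"]), return_tensors="pt").input_ids

seq_len, nlls = 2048, []
for i in range(0, ids.size(1) - seq_len, seq_len):
    batch = ids[:, i : i + seq_len].to(device)
    with torch.no_grad():
        # labels == input_ids makes the model return the mean NLL per window
        out = model(batch, labels=batch)
    nlls.append(out.loss)

ppl = torch.exp(torch.stack(nlls).mean())
print(f"WikiText2 perplexity: {ppl.item():.2f}")
```

Things like the window length, stride, tokenizer special tokens (see the discussion above), and how the test set is concatenated can all shift PTB perplexity noticeably.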