Hi, thank you so much for finding the issue! I figured out that models such as GPT-2 and OPT do not have a padding token by default, so I add one when loading the model (https://github.com/asahi417/lmppl/blob/main/lmppl/ppl_recurrent_lm.py#L70). When a padding token is added post hoc like this, the logit on the newly added token can become high, which resulted in an explosive perplexity in the end. I fixed it by disregarding the newly added padding token when computing the negative log likelihood, and it now produces reliable scores.
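The idea of the fix, in a minimal sketch (this is not the actual lmppl code; the pad token string `<<PAD>>` and the example texts are arbitrary placeholders): add a pad token post hoc, then exclude padded positions from the negative log likelihood by setting their labels to -100, the ignore index used by the Hugging Face causal-LM loss.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "gpt2"  # any causal LM that ships without a pad token (e.g. GPT-2, OPT)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# add a pad token post hoc; "<<PAD>>" is an arbitrary placeholder string
if tokenizer.pad_token is None:
    tokenizer.add_special_tokens({"pad_token": "<<PAD>>"})
    model.resize_token_embeddings(len(tokenizer))

texts = ["a sentence to score", "another, somewhat longer sentence to score"]
enc = tokenizer(texts, return_tensors="pt", padding=True)

# disregard the newly added pad token when computing the loss:
# -100 is the ignore_index of torch.nn.CrossEntropyLoss, which the
# Hugging Face causal-LM head uses internally
labels = enc["input_ids"].clone()
labels[enc["attention_mask"] == 0] = -100

with torch.no_grad():
    loss = model(**enc, labels=labels).loss  # mean NLL over non-padded tokens
print(torch.exp(loss))  # perplexity, no longer blown up by the pad token
```

Without the `-100` masking, the model is asked to predict the artificial pad token it was never trained on, which is what produced the huge perplexities.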
I also double-checked against the perplexity computation in the Hugging Face guide (https://huggingface.co/docs/transformers/perplexity) and confirmed that the values from lmppl match those produced by the guide.
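For a single short input that fits in one forward pass, the check against the guide reduces to something like the sketch below (the guide additionally uses a sliding window over long texts, which is omitted here; the input string is a placeholder):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

text = "a sentence to score"  # placeholder input
enc = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    # labels are shifted internally, so .loss is the mean per-token NLL
    loss = model(**enc, labels=enc["input_ids"]).loss
print(float(torch.exp(loss)))  # perplexity, to compare with lmppl's output
```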
Hi, thank you for developing lmppl.
I have a question about an extremely large perplexity value.
I installed lmppl and executed the commands described in the README as follows (a sketch of the call is below), but get_perplexity() returns a very large value. Is there something wrong with my procedure?

Versions of some modules in my environment:
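The call was along the lines of this sketch (a minimal reconstruction based on the README; the exact snippet, input texts, and module versions from my run are not reproduced here):

```python
import lmppl

# lmppl.LM wraps a causal LM such as GPT-2, per the README
scorer = lmppl.LM('gpt2')
texts = [
    'sentence to evaluate.',
    'another sentence to evaluate.'
]
ppl = scorer.get_perplexity(texts)
print(ppl)  # these values come out unexpectedly large for me
```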
Thank you.