asahi417 / lmppl

Calculate perplexity on a text with pre-trained language models. Supports MLM (e.g. DeBERTa), recurrent LM (e.g. GPT3), and encoder-decoder LM (e.g. Flan-T5).
MIT License
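For context, a minimal usage sketch of the library, based on its documented `lmppl.LM` / `get_perplexity` interface (the model name and example texts here are illustrative):

```python
import lmppl

# Choose the scorer class matching the model family:
# lmppl.LM for recurrent/causal LMs, lmppl.MaskedLM for MLMs,
# lmppl.EncoderDecoderLM for encoder-decoder models.
scorer = lmppl.LM('gpt2')

texts = [
    'I dropped my laptop on my knee, and someone stole my coffee.',
    'I dropped my laptop on my knee, but at least the coffee was good.',
]
ppl = scorer.get_perplexity(texts)  # one perplexity score per input text
print(ppl)
```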

Handling of long input #4

Closed: UntotaufUrlaub closed this issue 1 year ago

UntotaufUrlaub commented 1 year ago

Hi,

How is input text that is longer than the selected LM allows handled: is it truncated, or is a sliding window applied?

kind regards

asahi417 commented 1 year ago

Sorry, but I didn't check the last part of the blog. To be precise, we don't use a sliding window; overflow tokens are simply disregarded.
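In other words, the input is truncated at the model's maximum length before scoring. A minimal sketch of that behaviour, assuming a standard Hugging Face tokenizer (the exact call inside lmppl may differ):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative model choice

long_text = "some text " * 2000  # far longer than GPT-2's 1024-token context

# With truncation enabled, tokens beyond max_length are dropped, so the
# perplexity would be computed on the first max_length tokens only.
encoding = tokenizer(
    long_text,
    truncation=True,
    max_length=tokenizer.model_max_length,
    return_tensors="pt",
)
print(encoding["input_ids"].shape)  # at most (1, tokenizer.model_max_length)
```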

asahi417 commented 1 year ago

This actually looks interesting. If you have any resource (a paper or blog post) verifying the use of a sliding window, such as tests on some benchmarks, please share it with us.

UntotaufUrlaub commented 1 year ago

Thanks for the information! I don't have any resources; the Hugging Face explanation just seems reasonable to me. Maybe consider adding it as an option for now, so the user can decide whether it is justified.
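For reference, the approach from Hugging Face's "Perplexity of fixed-length models" guide looks roughly like this (a sketch, not lmppl code; `max_length` and `stride` values are illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").to(device)

def sliding_window_perplexity(text: str, max_length: int = 1024, stride: int = 512) -> float:
    """Strided perplexity: each window scores only the tokens that no
    previous window has scored, using the rest of the window as context."""
    input_ids = tokenizer(text, return_tensors="pt").input_ids.to(device)
    seq_len = input_ids.size(1)
    nlls, prev_end = [], 0
    for begin in range(0, seq_len, stride):
        end = min(begin + max_length, seq_len)
        trg_len = end - prev_end  # tokens newly scored in this window
        window = input_ids[:, begin:end]
        targets = window.clone()
        targets[:, :-trg_len] = -100  # mask context-only tokens from the loss
        with torch.no_grad():
            loss = model(window, labels=targets).loss
        # loss is the mean NLL over the scored tokens; weight by their count
        # (a slight approximation, since the internal label shift drops one token)
        nlls.append(loss * trg_len)
        prev_end = end
        if end == seq_len:
            break
    return torch.exp(torch.stack(nlls).sum() / prev_end).item()

print(sliding_window_perplexity("some long document ... " * 500))
```

With `stride == max_length` this degenerates into scoring disjoint chunks; smaller strides trade extra compute for more context per scored token.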

kind regards