Hongao0611 opened this issue 3 weeks ago:
I tried loading a quantized model, but it failed:
> ValueError: `.to` is not supported for `4-bit` or `8-bit` bitsandbytes models. Please use the model as it is, since the model has already been set to the correct devices and casted to the correct `dtype`.
Hi, if I want to compute the perplexity of a quantized model, what parameters should I pass to the `.compute()` function?
For example, here is roughly what I have in mind, shown in the sketch below.
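(A rough sketch of what I am trying to do, assuming this is the `evaluate` perplexity metric; the model id, quantization config, and sample texts are just placeholders, not my real setup.)

```python
import evaluate
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Load the model quantized to 4-bit with bitsandbytes.
quant_config = BitsAndBytesConfig(load_in_4bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b",  # placeholder model id
    quantization_config=quant_config,
    device_map="auto",
)

# The perplexity metric only takes a model_id string; as far as I can tell it
# reloads the model itself and calls .to(device), which seems to be where the
# ValueError above comes from for 4-bit / 8-bit models.
perplexity = evaluate.load("perplexity", module_type="metric")
results = perplexity.compute(
    model_id="facebook/opt-1.3b",  # how do I make this use my quantized model?
    predictions=["some sample text", "another sample text"],
)
print(results)
```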
Much thx!