IINemo / lm-polygraph

MIT License
111 stars · 21 forks

Entropy calculation maybe wrong? #195

Closed athrvkk closed 4 months ago

athrvkk commented 4 months ago

Hello,

I am not 100% sure, but I believe the entropy calculation is wrong here: https://github.com/IINemo/lm-polygraph/blob/main/src/lm_polygraph/stat_calculators/entropy.py

On line 43, shouldn't you compute the sum instead of the mean?

Also, the entropy should be calculated with base 2. The log probabilities (logprobs) returned by the HuggingFace language models typically use the natural logarithm (base e).
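A minimal sketch of the computation being proposed, assuming `logprobs` is a vector of natural-log probabilities over the full vocabulary for a single next-token distribution, as HuggingFace models typically return (the function name and shape are illustrative, not taken from lm-polygraph):

```python
import numpy as np

def token_entropy_bits(logprobs: np.ndarray) -> float:
    """Shannon entropy (in bits) of one next-token distribution.

    `logprobs`: natural-log probabilities over the vocabulary,
    as typically returned by HuggingFace language models.
    """
    probs = np.exp(logprobs)
    # Sum over the vocabulary (not the mean), and convert ln -> log2
    # by dividing by ln(2): H = -sum(p * log2 p) = -sum(p * ln p) / ln 2.
    return float(-np.sum(probs * logprobs) / np.log(2))
```

For example, a uniform distribution over 8 tokens gives 3 bits, as expected.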

IINemo commented 4 months ago

Thank you for noticing this! We will fix it ASAP. Note, however, that this mistake does not affect the evaluation results.

athrvkk commented 4 months ago

Thanks for your response, @IINemo! I am not entirely sure why this error would not affect the final results. Could you elaborate?

IINemo commented 4 months ago

The metric results depend only on the ordering of the uncertainty scores. Since models have a fixed vocabulary, averaging instead of summing (i.e. scaling every score by the constant factor 1/|V|) does not change that ordering, and therefore does not change the final metric scores. Correct me if you note something else.
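The invariance claim above can be checked numerically. The sketch below uses made-up uncertainty scores and a hypothetical vocabulary size; it shows that dividing by a positive constant (mean vs sum) and changing the logarithm base (another constant factor, ln 2) leave the ranking of the scores, and hence any rank-based metric such as PRR, unchanged:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical per-sample uncertainty scores: sum-based, base-2 entropies.
entropy_sum_bits = rng.random(5) * 4.0

vocab_size = 32000  # illustrative fixed vocabulary size

# Mean instead of sum: divide by |V|.
# Natural log instead of log2: multiply by ln(2).
entropy_mean_nats = entropy_sum_bits * np.log(2) / vocab_size

# A positive constant factor never changes the ordering of the scores.
assert np.array_equal(np.argsort(entropy_sum_bits),
                      np.argsort(entropy_mean_nats))
```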

athrvkk commented 4 months ago

Ah, if you mean the final rank-based metrics (PRR), then that makes sense. Thanks for the clarification!

IINemo commented 4 months ago

@rvashurin could you check, please?

IINemo commented 4 months ago

Should be fixed now.