SocialComplexityLab / life2vec

MIT License
479 stars, 67 forks

Perplexity Calculation #8

Closed: LucFumag closed 1 month ago

LucFumag commented 1 month ago

Hello,

I noticed that in the code, perplexity is calculated as the square root of the loss. As far as I know, it can be calculated using the exponential transformation of the loss.

Can you explain why the metrics are calculated this way and provide any source materials that explain this approach?

Thank you in advance for your attention and answers.

carlomarxdk commented 1 month ago

Hey there, thanks for the question!

We initially recorded perplexity as the exponential transform of the loss, but switched to the square root when we ran the hyperparameter search. Our hyperparameter choice depends on two losses (one from the MLM task and one from the SOP task), and we wanted them on the same order of magnitude; the square root transform seemed to do the trick.

Subsequently, I forgot to rename the metric we are tracking, so it is still labeled perplexity. You are 100% correct that perplexity is the exponential transformation of the cross-entropy loss.
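To make the difference concrete, here is a minimal sketch (not the repository's code) comparing the two transforms. It assumes the loss is a mean cross-entropy measured in nats; the function names `perplexity` and `sqrt_loss` and the two loss values are hypothetical, chosen only for illustration.

```python
import math


def perplexity(cross_entropy_loss: float) -> float:
    """Standard perplexity: exp of the mean cross-entropy (assumed to be in nats)."""
    return math.exp(cross_entropy_loss)


def sqrt_loss(cross_entropy_loss: float) -> float:
    """Square-root transform of the loss; this is NOT perplexity."""
    return math.sqrt(cross_entropy_loss)


# Hypothetical loss values, for illustration only.
mlm_loss = 3.0  # stand-in for a masked-language-modelling loss
sop_loss = 0.5  # stand-in for a sequence-ordering loss

print("exp(MLM loss): ", perplexity(mlm_loss))  # grows quickly with the loss
print("sqrt(MLM loss):", sqrt_loss(mlm_loss))   # stays close to the raw loss scale
print("sqrt(SOP loss):", sqrt_loss(sop_loss))
```

With values like these, the exponential puts the MLM metric an order of magnitude above the SOP one, while the square root keeps both near 1, which is the scaling effect described above.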

Sorry for the confusion.

LucFumag commented 1 month ago

All clear, thank you very much!