Closed dmitrysarov closed 5 months ago
Hi! The code you pointed to is in fact Bradley-Terry: it fits the model by reweighted maximum likelihood estimation. The code is essentially the same as the one used to compute the Chatbot Arena rankings. Full details of the math can be found in the math section of our Chatbot Arena paper: https://arxiv.org/pdf/2403.04132
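To illustrate why a logistic regression coefficient can be a Bradley-Terry score, here is a minimal sketch; it is not the repository's code. It assumes each battle is a hypothetical `(model_a, model_b, a_won)` tuple with no ties, and it leaves out the reweighting mentioned above. A logistic regression with no intercept, +1 on the column for `model_a` and -1 on the column for `model_b`, models P(a beats b) = sigmoid(beta_a - beta_b), which is the Bradley-Terry likelihood with strengths exp(beta). The function names and the Elo-style rescaling (base 1000, scale 400) are illustrative assumptions.

```python
import math


def fit_bradley_terry(battles, n_iter=2000, lr=0.5):
    """Fit Bradley-Terry log-strengths by gradient ascent on the
    logistic-regression log-likelihood.

    battles: list of (model_a, model_b, a_won), with a_won in {0, 1}.
    Returns a dict mapping model name to its centered log-strength.
    """
    models = sorted({m for a, b, _ in battles for m in (a, b)})
    beta = {m: 0.0 for m in models}
    n = len(battles)
    for _ in range(n_iter):
        grad = {m: 0.0 for m in models}
        for a, b, a_won in battles:
            # Bradley-Terry win probability for model_a.
            p = 1.0 / (1.0 + math.exp(-(beta[a] - beta[b])))
            # Gradient of the log-likelihood: +residual for a, -residual for b.
            grad[a] += a_won - p
            grad[b] -= a_won - p
        for m in models:
            beta[m] += lr * grad[m] / n
        # Only differences are identifiable, so center the scores.
        mean = sum(beta.values()) / len(beta)
        beta = {m: v - mean for m, v in beta.items()}
    return beta


def to_elo_scale(beta, base=1000.0, scale=400.0):
    """Map log-strengths to an Elo-like scale (assumed convention)."""
    return {m: base + scale * v / math.log(10) for m, v in beta.items()}


# Toy data: A beats B in 3 of 4 battles.
battles = [("A", "B", 1), ("A", "B", 1), ("A", "B", 1), ("A", "B", 0)]
beta = fit_bradley_terry(battles)
ratings = to_elo_scale(beta)
```

In this toy case the fitted difference `beta["A"] - beta["B"]` approaches log(3), the log-odds of the observed 75% win rate, so the coefficients are the maximum likelihood Bradley-Terry scores up to an additive constant.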
First of all, thanks for your work. Maybe I have misunderstood, but I could not find any Bradley-Terry model usage/implementation in your code; instead, you are doing something interesting with the LogReg coefficients. Could you please point to the source of the idea behind this? And do you think that the Bradley-Terry model would perform worse than this LR trick?