details of the Elo rating algorithm

OpenGVLab / Multi-Modality-Arena

Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

details of the Elo rating algorithm #6

Open zfj1998 opened 1 year ago

zfj1998 commented 1 year ago

Nice work! I'm interested in the design of the 1 vs 1 battles between LVLMs. Could you share more details about the Elo rating algorithm, such as the choice of K-factor and the expected confidence intervals given the collected user ratings? I'd appreciate any details you can share.
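
For context, here is a minimal sketch of the textbook Elo update and one common way to estimate confidence intervals (bootstrap resampling of the battle log). It is not taken from the Multi-Modality Arena code, and the project may do something different. The K-factor of 32, the initial rating of 1000, the 400-point scale, and all function names are assumptions chosen for illustration.

```python
import random


def expected_score(r_a, r_b, scale=400.0):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / scale))


def update_elo(r_a, r_b, score_a, k=32.0):
    """Return updated (r_a, r_b); score_a is 1 for a win, 0.5 tie, 0 loss."""
    e_a = expected_score(r_a, r_b)
    delta = k * (score_a - e_a)  # the K-factor scales the size of each step
    return r_a + delta, r_b - delta


def compute_ratings(battles, k=32.0, init=1000.0):
    """Replay (model_a, model_b, score_a) tuples in order."""
    ratings = {}
    for a, b, score_a in battles:
        r_a = ratings.setdefault(a, init)
        r_b = ratings.setdefault(b, init)
        ratings[a], ratings[b] = update_elo(r_a, r_b, score_a, k)
    return ratings


def bootstrap_ci(battles, n_rounds=200, alpha=0.05, seed=0, k=32.0):
    """Percentile interval per model from battles resampled with replacement."""
    rng = random.Random(seed)
    samples = {}
    for _ in range(n_rounds):
        resampled = [rng.choice(battles) for _ in battles]
        for model, rating in compute_ratings(resampled, k=k).items():
            samples.setdefault(model, []).append(rating)
    intervals = {}
    for model, values in samples.items():
        values.sort()
        last = len(values) - 1
        low = values[int((alpha / 2) * last)]
        high = values[int((1 - alpha / 2) * last)]
        intervals[model] = (low, high)
    return intervals
```

With `battles` given as a list such as `[("model_a", "model_b", 1.0), ...]` (hypothetical model names), `compute_ratings(battles)` returns one rating per model and `bootstrap_ci(battles)` returns a `(low, high)` pair per model. A larger K makes ratings react faster to recent battles but also makes them noisier, which is why the choice of K and the width of the intervals are related.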