yanolja / arena

Apache License 2.0

Draw the results in matrix format #127

Open hist0613 opened 3 months ago

hist0613 commented 3 months ago
image

Arena currently provides only the Elo rating. For better visibility, it would be good to also show the results in a matrix format, as in Figure 3 of the LMSYS Chatbot Arena Leaderboard.

hist0613 commented 3 months ago

We could also provide the multi-agent evaluation results in this format, which would let us compare AI evaluation with human evaluation.
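
To make the request concrete, here is a minimal sketch of how a pairwise matrix could be computed from battle records. It assumes each record is a `(model_a, model_b, winner)` tuple with `winner` set to `"a"`, `"b"`, or `"tie"`; that record shape, the function name `win_fraction_matrix`, and the model names are all hypothetical and not taken from Arena's code. Each cell holds the fraction of non-tied battles that the row model won against the column model.

```python
from collections import defaultdict


def win_fraction_matrix(battles):
    """Return (models, matrix), where matrix[row][col] is the fraction of
    non-tied battles between row and col that row won, or None if the pair
    has no non-tied battles (or row == col).

    The (model_a, model_b, winner) record shape is an assumption made for
    this sketch, with winner in {"a", "b", "tie"}.
    """
    wins = defaultdict(int)    # (winner, loser) -> number of wins
    totals = defaultdict(int)  # (row, col) -> number of non-tied battles
    models = set()

    for model_a, model_b, winner in battles:
        if winner not in ("a", "b", "tie"):
            raise ValueError(f"unexpected winner value: {winner!r}")
        models.update((model_a, model_b))
        if winner == "tie":
            continue  # ties are excluded from the win fraction
        if winner == "a":
            won, lost = model_a, model_b
        else:
            won, lost = model_b, model_a
        wins[(won, lost)] += 1
        totals[(model_a, model_b)] += 1
        totals[(model_b, model_a)] += 1

    models = sorted(models)
    matrix = {
        row: {
            col: (wins[(row, col)] / totals[(row, col)]
                  if row != col and totals[(row, col)] else None)
            for col in models
        }
        for row in models
    }
    return models, matrix


# Hypothetical battle records, for illustration only.
battles = [
    ("model-x", "model-y", "a"),
    ("model-x", "model-y", "b"),
    ("model-y", "model-x", "b"),
    ("model-x", "model-z", "tie"),
    ("model-y", "model-z", "a"),
]

models, matrix = win_fraction_matrix(battles)

# Print the matrix as a plain-text table; "-" marks cells with no data.
print(" " * 10 + "".join(f"{m:>10}" for m in models))
for row in models:
    cells = []
    for col in models:
        value = matrix[row][col]
        cells.append(f"{'-' if value is None else format(value, '.2f'):>10}")
    print(f"{row:<10}" + "".join(cells))
```

The same function could be run once on human votes and once on the multi-agent (AI judge) verdicts, so the two resulting matrices can be compared cell by cell.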