vectara / hallucination-leaderboard

Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents
https://vectara.com
Apache License 2.0
1.25k stars, 50 forks

Claude 2.1 Benchmark Missing #16

Closed. lukestanley closed this issue 11 months ago

lukestanley commented 11 months ago

It's surprising that Llama 2 beats Claude 2.0, but anyway, how does Claude 2.1 perform?

mbae26 commented 11 months ago

More LLMs will soon be added to our leaderboard. Please stay tuned for the upcoming additions :)