vectara / hallucination-leaderboard

Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents
https://vectara.com
Apache License 2.0
1.25k stars, 50 forks

Any arxiv paper or report? #24

Closed zhimin-z closed 10 months ago

amin2718 commented 10 months ago

We have a detailed write-up on the methodology employed here: Cut the Bull…. Detecting Hallucinations in Large Language Models. Simon, the author, was also the lead researcher on the model's development.

TomLucidor commented 1 month ago

Here is the question: How do you get Qwen, Llama, DeepSeek, or Gemma to hallucinate less and be on par with API models?

P.S. GLM is what CodeGeeX is based on, which is very surprising considering it has a lower risk of hallucination.