Any arXiv paper or report?

vectara / hallucination-leaderboard

Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents

https://vectara.com

Apache License 2.0


zhimin-z closed this issue 10 months ago

amin2718 commented 10 months ago

We have a detailed write-up on the methodology employed here: Cut the Bull…. Detecting Hallucinations in Large Language Models. Simon, the author, was also the lead researcher who developed the model.

TomLucidor commented 1 month ago

Here is the question: how do you get Qwen, Llama, DeepSeek, or Gemma to hallucinate less and be on par with API models?

P.S. GLM is what CodeGeeX is based on, which is very surprising considering it has a lower risk of hallucination.