ParticleMedia / RAGTruth

Github repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models"
https://arxiv.org/abs/2401.00396
MIT License

Question about the external data in rag #7

Closed Lowlowlowlowlowlow closed 2 weeks ago

Lowlowlowlowlowlow commented 1 month ago

I am particularly interested in the external data references mentioned in the paper. Could you please provide more information about the sources of these data references? Additionally, I would appreciate it if you could share your thoughts on the accuracy and reliability of these sources.

thuwyh commented 1 month ago

Sorry, I don't quite understand your question. If you are referring to the original data sources, such as MS MARCO, CNN/DM, Yelp, etc., we provide the references in the paper. We did not check the reliability of these sources, since they are all commonly used by the community; their quality should be fine.

Lowlowlowlowlowlow commented 1 month ago

Yes, your response partially addressed my question. What I am asking about is the "original data sources." Are they strictly ensured to be accurate, especially in the context of the QA task?

thuwyh commented 1 month ago

> Yes, your response partially addressed my question. What I am asking about is the "original data sources." Are they strictly ensured to be accurate, especially in the context of QA task?

I think the answer is NO. RAG cannot guarantee the correctness of the answers every time. In this dataset, we are more concerned with the consistency between the answers and the references. Fact-checking can be studied as a separate issue.
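The distinction drawn above, between an answer being consistent with its references and being factually correct, can be illustrated with a toy check. This is purely a sketch, not the paper's annotation method; the token-overlap heuristic and the example strings are illustrative assumptions:

```python
import re


def consistency_score(answer: str, reference: str) -> float:
    """Fraction of answer tokens that also appear in the reference.

    A crude proxy for answer-reference consistency: an answer can score
    high here (fully grounded in the passage) and still be factually
    wrong if the passage itself is wrong, which is exactly the
    separation between consistency and fact-checking drawn above.
    """
    def tokenize(text: str) -> set[str]:
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    answer_tokens = tokenize(answer)
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & tokenize(reference)) / len(answer_tokens)


reference = "The Eiffel Tower is 330 metres tall."
consistent = "The tower is 330 metres tall."   # grounded in the passage
conflicting = "The tower is 500 metres tall."  # conflicts with the passage
```

Here `consistency_score(consistent, reference)` is higher than `consistency_score(conflicting, reference)`, regardless of what the tower's real height is.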

michaelcalvinwood commented 1 month ago

> RAG cannot guarantee the correctness of the answers every time.

100% accurate RAG is not only achievable; we have used the RAGTruth corpus to demonstrate it. We discovered that if you remove Noun-Phrase Route Collisions from passages, LLMs respond with accurate answers every single time. For example, we sent GPT-4 and GPT-3.5 Turbo the identical queries and passages for all of the Evident Conflict and Subtle Conflict RAGTruth hallucinations, but with the Noun-Phrase Route Collisions removed, and every response came back 100% accurate. In other words, the same models that had hallucinated returned a 100% accurate response every single time.

You can view the RAGTruth results here: https://hallucination-analyzer.ragfix.ai/. To understand how to remove Noun-Phrase Route Collisions, see the following video: https://youtu.be/K4Wg6QzPfyI.

We are now testing smaller models such as GPT-4o. So far, they too return 100% accurate responses every single time. Hallucinations in RAG-based implementations are always caused by Noun-Phrase Route Collisions (as demonstrated in the aforementioned video).
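Selecting the conflict cases described above from the corpus could look roughly like the following. This is a sketch only: the JSONL field names (`labels`, `label_type`) are assumptions about the released corpus format, and the collision-removal step itself is not shown:

```python
import json

# Label types for conflicting spans, as named in the thread above.
CONFLICT_TYPES = {"Evident Conflict", "Subtle Conflict"}


def conflict_records(jsonl_lines):
    """Yield annotated responses whose labels include a conflict span.

    Assumes each line is a JSON record with a "labels" list whose
    entries carry a "label_type" string (hypothetical field names).
    """
    for line in jsonl_lines:
        record = json.loads(line)
        types = {label["label_type"] for label in record.get("labels", [])}
        if types & CONFLICT_TYPES:
            yield record
```

Each yielded record's query and passage could then be re-sent to the model after modifying the passage, and the new response compared against the original annotation.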

I am very grateful to the RAGTruth team for putting together this corpus. Although it was intended for developing hallucination-detection technology, it provided us with a wonderful third-party dataset for demonstrating that the models were always capable of correct responses. Thank you for all the hard work that you have put into this.

thuwyh commented 1 month ago

@michaelcalvinwood interesting work