ParticleMedia / RAGTruth

Github repository for "RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models"
https://arxiv.org/abs/2401.00396
MIT License
106 stars · 8 forks

Incorrect Labels #2

Open michaelcalvinwood opened 6 months ago

michaelcalvinwood commented 6 months ago

First, thank you for the effort put into RAGTruth. There is a tremendous need for such a dataset.

Unfortunately, some of the labels are plainly inaccurate. Consider Response ID 11898 as one example. Its annotation lists three supposed hallucinations, all with implicit_true set to false.
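As a side note for anyone wanting to audit labels the same way, here is a minimal sketch of pulling out the spans an annotation claims are hallucinations without implicit support. The field names ("id", "labels", "label_type", "implicit_true") are assumptions based on this discussion, not a confirmed RAGTruth schema — check the repo's data files for the actual keys.

```python
import json

# Hypothetical record shaped like a RAGTruth response entry.
# Field names here are assumed from the thread, not the official schema.
sample = json.loads("""
{
  "id": "11898",
  "labels": [
    {"text": "span one",   "label_type": "Evident Conflict", "implicit_true": false},
    {"text": "span two",   "label_type": "Baseless Info",    "implicit_true": false},
    {"text": "span three", "label_type": "Evident Conflict", "implicit_true": true}
  ]
}
""")

def explicit_hallucinations(record):
    """Return annotated spans that are NOT marked implicit_true,
    i.e. the ones the annotators claim have no implicit support."""
    return [lab for lab in record["labels"] if not lab.get("implicit_true", False)]

spans = [lab["text"] for lab in explicit_hallucinations(sample)]
print(spans)
```

Filtering this way makes it easy to eyeball each disputed span against the source passage, which is exactly the check being done by hand in this issue.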

Consider the first:

In other words, the provided passage does state that there is a potential for those with graduate degrees to earn more than their undergraduate counterparts, which means there is a potential for undergrads to earn less than those with graduate degrees. Hence, the annotation is incorrect.

Consider the second:

Yet, "fresh out of grad school" is equivalent to "upon graduation." And the whole context is "earning a higher income" ("making a lot more than their undergraduate counterparts"). Hence, the annotation is incorrect.

Finally, consider the third:

Hence, this annotation is correct.

Naturally, the value of the dataset is directly proportional to the correctness of the annotations. While I recognize the immense effort that has gone into this dataset, there's still a need for additional annotators to fix errant labels (and there are a lot of errant labels).

Kindly consider fixing the errant labels to make RAGTruth the incredible resource that it can be.

sgfuiwshlkahr commented 5 months ago

Hello Michael, thank you for your detailed review. We acknowledge that the examples you pointed out did not meet the expected standards of accuracy, and we appreciate that you brought them to our attention. We want to highlight that we are committed to the quality of the dataset, and the version presented was the outcome of multiple rounds of review. Due to the size, it was challenging to maintain uniform accuracy among all annotators across all annotations. However, we will be conducting another round of thorough review, aiming to have the dataset reflect its true intent and utility in supporting the value of our research.

michaelcalvinwood commented 5 months ago

Thank you.

Now that I know that you are committed to this dataset, I'll gladly add examples here when I come across them in order to help out.

There truly is a great need for an accurate hallucination corpus. :-)

ogencoglu commented 2 months ago

> Hello Michael, thank you for your detailed review. We acknowledge that the examples you pointed out did not meet the expected standards of accuracy, and we appreciate that you brought them to our attention. We want to highlight that we are committed to the quality of the dataset and the version presented was the outcome developed through multiple rounds of review. Due to the size, it was challenging in maintaining uniform accuracy among all annotators across all annotations. However, we will be conducting another round of thorough review, aiming to have the dataset reflect its true intent and utility in supporting the value of our research.

Sounds LLM-generated to be honest.

Any updates?