[Closed] zthang closed this issue 1 year ago
Hi @zthang, thanks for your interest in our work.
To the best of my knowledge, I haven't seen another dataset where LLM outputs are annotated (at either the sentence or token level).
There is one dataset on hallucination detection, HaDes from Microsoft, but the texts they annotated were obtained by corrupting/perturbing factual texts. These are not LLM hallucinations, so it's not directly applicable to our work (we are trying to detect when the LLM is uncertain during generation, and whether that uncertainty leads to hallucination, via self-checking). Nevertheless, I thought the HaDes dataset could still be relevant for you.
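To make the self-checking idea above concrete, here is a toy sketch (not our actual implementation, which uses stronger scoring such as entailment or QA models): a sentence is compared against several independently resampled passages, and low agreement suggests a likely hallucination. The word-overlap heuristic and the example texts are placeholders so the snippet stays self-contained.

```python
# Toy sketch of sampling-based self-checking: a sentence that
# disagrees with most resampled passages is flagged as a likely
# hallucination. Word overlap stands in for a real entailment model.

def consistency_score(sentence: str, samples: list[str]) -> float:
    """Fraction of sampled passages that support the sentence,
    where "support" is approximated by content-word overlap."""
    words = {w.lower().strip(".,") for w in sentence.split() if len(w) > 3}
    if not words:
        return 1.0  # nothing checkable in this sentence
    hits = 0
    for s in samples:
        sample_words = {w.lower().strip(".,") for w in s.split()}
        if len(words & sample_words) / len(words) > 0.5:
            hits += 1
    return hits / len(samples)

# Hypothetical example: two of three samples agree with the sentence.
sentence = "She was born in 1946 in Boston."
samples = [
    "She was born in 1946 in Boston and studied law.",
    "Born in Boston in 1946, she later moved abroad.",
    "She grew up in Chicago during the 1950s.",
]
score = consistency_score(sentence, samples)  # low score -> likely hallucination
```

In practice the samples come from re-querying the same LLM with temperature > 0, and the overlap check is replaced by a learned consistency measure.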
Best wishes, Potsawee
Okay, many thanks!
Hello! Will there be a v4 in the future? Thanks!
Hi @zthang ,
There are currently no plans to extend the annotation of GPT-3-generated WikiBio passages.
Best, Potsawee
Good job! I also wonder whether you tried any other datasets to evaluate the proposed method. Or are there other datasets like wiki_bio_gpt3_hallucination that I can test on? Thanks!