Liyan06 / MiniCheck

MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024]

Apache License 2.0

107 stars 11 forks

Generation of "claims" #3

Closed: yiqingxyq closed this issue 4 months ago

yiqingxyq commented 4 months ago

Hi!

How did you generate the claims in the benchmark? Did you (1) directly prompt the model to generate a claim, (2) first generate model responses and then decompose them into claims, or (3) use some other method?

Thanks!

Liyan06 commented 4 months ago

Hi Yiqing,

The claims and documents in the benchmark are collected from previous work:

AggreFact-CNN (SOTA set, Tang et al., 2023)
AggreFact-XSum (SOTA set, Tang et al., 2023)
TofuEval-MediaSum (Tang et al., 2024)
TofuEval-MeetingBank (Tang et al., 2024)
Wice (Kamoi et al., 2023)
Reveal (Jacovi et al., 2024)
ClaimVerify (Liu et al., 2023)
FactCheck-GPT (Wang et al., 2023)
ExpertQA (Malaviya et al., 2024)
Lfqa (Chen et al., 2023)

You can find a more detailed description of each dataset in Appendix C of our paper.

We are planning to add one more dataset to the benchmark in a week or so. Stay tuned!