gomate-community / rageval

Evaluation tools for Retrieval-augmented Generation (RAG) methods.
Apache License 2.0

update answer nli groundedness metric #60

Closed QianHaosheng closed 4 months ago

QianHaosheng commented 4 months ago

#59 In the evaluation of ALCE's ELI5 dataset, claim recall is computed by decomposing gt_answer into claims and then using an NLI model to judge whether the answer generated by the LLM entails each claim. The score is the fraction of claims that are entailed.
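
For illustration, here is a minimal sketch of that calculation. The names `claim_recall` and `toy_entails` are hypothetical and are not taken from rageval or ALCE. The token-overlap check is only a stand-in so the example runs without downloading a model; in practice a real NLI model would be plugged in as the `entails` callable.

```python
from typing import Callable, Sequence


def claim_recall(
    answer: str,
    claims: Sequence[str],
    entails: Callable[[str, str], bool],
) -> float:
    """Return the fraction of ground-truth claims entailed by the answer.

    `entails(premise, hypothesis)` is assumed to wrap an NLI model and
    return True when the premise entails the hypothesis.
    """
    if not claims:
        # No claims to check; returning 0.0 here is a design choice.
        return 0.0
    supported = sum(1 for claim in claims if entails(answer, claim))
    return supported / len(claims)


def toy_entails(premise: str, hypothesis: str) -> bool:
    """Stand-in for an NLI model (assumption): every hypothesis token
    must also appear in the premise. Not a real entailment check."""
    premise_tokens = set(premise.lower().split())
    return all(tok in premise_tokens for tok in hypothesis.lower().split())


# Claims that would come from decomposing gt_answer.
gt_claims = [
    "the sky is blue",
    "air scatters sunlight",
    "the ocean reflects the sky",
]
answer = "the sky is blue because air scatters sunlight"

score = claim_recall(answer, gt_claims, toy_entails)
print(f"claim recall: {score:.2f}")
```

Here two of the three claims are supported by the answer, so the score is 2/3. Passing the entailment check in as a callable keeps the metric independent of which NLI model is used.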