Teddy-XiongGZ / MedRAG

Code for the MedRAG toolkit
https://teddy-xionggz.github.io/benchmark-medical-rag/

Request code on evaluation #15

Closed PeterGriffinJin closed 1 month ago

PeterGriffinJin commented 1 month ago

Hi Guangzhi,

Thank you for your great work!

Could you share the code you use to calculate generation accuracy? Do you use substring matching to check whether the ground-truth answer occurs in the generated answer, or do you use another evaluation method (e.g., NLI, a classifier, or an LLM evaluator)?

Best, Bowen

Teddy-XiongGZ commented 1 month ago

Hi Bowen,

We used substring matching to find the predicted answer. You can find the relevant code here:

Best, Guangzhi
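
For readers following along, here is a minimal sketch of what a substring-match accuracy check could look like. It follows the framing in the question (does the ground-truth answer occur in the generated answer?) and is not the MedRAG evaluation code. The function names `normalize`, `substring_match`, and `accuracy` are hypothetical, and the normalization (lowercasing, collapsing whitespace) is an assumption.

```python
import re
from typing import Sequence


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def substring_match(generated: str, ground_truth: str) -> bool:
    """Return True if the ground-truth answer occurs in the generated answer."""
    return normalize(ground_truth) in normalize(generated)


def accuracy(generations: Sequence[str], answers: Sequence[str]) -> float:
    """Fraction of generations that contain their ground-truth answer."""
    if len(generations) != len(answers):
        raise ValueError("generations and answers must have the same length")
    if not generations:
        return 0.0
    hits = sum(substring_match(g, a) for g, a in zip(generations, answers))
    return hits / len(generations)


if __name__ == "__main__":
    # Hypothetical examples, not taken from any benchmark.
    gens = ["The answer is  Aspirin.", "I think it is ibuprofen."]
    golds = ["aspirin", "metformin"]
    print(accuracy(gens, golds))
```

One caveat with this kind of matching: very short ground-truth strings (a single option letter such as "A", for instance) can match by accident anywhere in the generated text, so in practice the match is usually restricted to a specific answer field or pattern.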