Open athewsey opened 1 month ago
hey @athewsey, thanks for the kind words, and that is in fact a very good feature request. The reason we dropped it was that we thought you could easily split the ground truths into multiple rows and then compute it. Have you tried that approach?
The con with that approach is that it's now on the user's side to process the different ground truths for the same question-answer pairs. I can get on a call and set it up for you if you want, and then we can add it back as a doc update. What do you think of that?
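A minimal sketch of the row-splitting workaround described above, assuming the evaluation data is a list of records with `question`/`answer`/`ground_truth` fields (the column names and data shape here are illustrative, not a fixed Ragas schema):

```python
# Row-splitting workaround: duplicate each question/answer pair once
# per alternative ground truth, so every row has a single ground_truth
# string. Field names are illustrative, not a fixed Ragas schema.

def split_ground_truths(rows):
    """Expand rows whose 'ground_truth' is a list of alternatives
    into one row per single ground-truth string."""
    expanded = []
    for row in rows:
        gts = row["ground_truth"]
        if isinstance(gts, str):  # already a single string, keep as one row
            gts = [gts]
        for gt in gts:
            expanded.append({**row, "ground_truth": gt})
    return expanded

dataset = [
    {
        "question": "What year was the library released?",
        "answer": "It was released in 2023.",
        # Two equally valid reference answers for the same question
        "ground_truth": ["2023", "It came out in 2023."],
    },
]

for row in split_ground_truths(dataset):
    print(row["question"], "->", row["ground_truth"])
```

The downside, as noted, is that the per-question scores for the duplicated rows then have to be aggregated back together (e.g. taking the max) on the user's side.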
Describe the Feature
Per the deprecation message I receive in v0.1.7:
...It seems like providing multiple alternative ground-truth answers to a question used to be supported but is being removed?
I couldn't quite figure out the situation on this from the docs or issues (maybe I just missed something somewhere?), but I'd like for Ragas to support multiple reference answers per question, if possible.
Why is the feature important for you?
I have a document-based question answering dataset where we've seen a few cases of questions that could have multiple correct answers - sometimes very similar, but in other cases quite semantically distinct. We don't necessarily need to score the alternative GT answers separately, but it seems like we need to clarify to the evaluator LLM that any of the options is equally valid in order to receive reliable judgments. It's not clear to me how to do that going forward, if
ground_truth
needs to become a single string?
Additional context
Thanks for your great work on the library! Really appreciate having an evaluation tool that maintains some separation from the actual orchestration of the chain/RAG.