What is the purpose of returned context when using synthetic data generation?

explodinggradients / ragas

Supercharge Your LLM Application Evaluations 🚀

https://docs.ragas.io

Apache License 2.0

What is the purpose of returned context when using synthetic data generation? #1471

Open diegodebrito opened 1 month ago

diegodebrito commented 1 month ago

[X] I checked the documentation and related resources and couldn't find an answer to my question.

Your Question what is unclear to you? What would you like to know?

When generating synthetic test data, RAGAS also returns the pieces of context that were used to generate the QA pairs. When I evaluate a RAG system, though, my retriever pulls its own chunks, and those are what get used for evaluation, not the ones returned by the synthetic data generator. So what is the purpose of the returned context chunks? Are they just there for double-checking?

dosubot[bot] commented 1 month ago

Hello @diegodebrito! How can I help you today?

The purpose of the returned context when using synthetic data generation in RAGAS is to form an evaluation dataset. This dataset includes a question, generated answer, retrieved context, and ground-truth answer. Although the context chunks are not directly used for evaluation, they are part of the dataset structure that RAGAS uses to evaluate the LLM application data comprehensively. This structure allows for a thorough assessment of the RAG pipeline ^[1].
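To make the distinction in this thread concrete, here is a minimal sketch of one evaluation record that keeps the generator's context chunks next to the chunks a RAG system actually retrieves. It is plain Python, not the RAGAS API: `EvalRecord`, `build_eval_record`, the field names, and the keys of `synthetic_row` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EvalRecord:
    """One row of an evaluation dataset (hypothetical layout, not RAGAS's schema)."""
    question: str
    ground_truth: str
    generation_contexts: List[str]  # chunks the synthetic generator built the QA pair from
    retrieved_contexts: List[str]   # chunks the RAG system under test pulled
    answer: str                     # answer the RAG system produced


def build_eval_record(
    synthetic_row: Dict,
    retrieve: Callable[[str], List[str]],
    answer_with: Callable[[str, List[str]], str],
) -> EvalRecord:
    # Assumed keys on the synthetic row: "question", "ground_truth", "contexts".
    question = synthetic_row["question"]
    retrieved = retrieve(question)
    return EvalRecord(
        question=question,
        ground_truth=synthetic_row["ground_truth"],
        generation_contexts=list(synthetic_row["contexts"]),
        retrieved_contexts=retrieved,
        answer=answer_with(question, retrieved),
    )
```

In this layout the two context fields never overwrite each other, so the generator's chunks stay available for inspection even though the system's own retrieved chunks are the ones tied to the answer.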

To continue talking to Dosu, mention @dosu.

diegodebrito commented 1 month ago

@dosu That is exactly what I was asking. If it's not used in evaluations, why is it returned? Is there any use case for those pieces of context?

dosubot[bot] commented 1 month ago

The returned context chunks in RAGAS are used to ensure comprehensive coverage of various components within the pipeline. They include questions with different characteristics like reasoning, conditioning, and multi-context, crafted from the provided documents. These chunks help in understanding how well the retrieval mechanism works, even if they aren't directly used in evaluations. They can be useful for verifying the retrieval process and ensuring that the synthetic data generation covers diverse scenarios ^[1].
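One way the "verifying the retrieval process" idea could be read is as a check of how many generator chunks show up among the chunks a retriever returns. The sketch below is only an assumption about such a check, not something RAGAS is confirmed to do: `reference_chunk_recall` is a hypothetical helper, and it relies on exact string matches, which only works if the retriever indexes the same chunking the generator used.

```python
from typing import List


def reference_chunk_recall(reference_chunks: List[str], retrieved_chunks: List[str]) -> float:
    """Fraction of generator (reference) chunks found verbatim among retrieved chunks.

    Assumes both sides use identical chunk boundaries; with different chunking
    a fuzzy or overlap-based comparison would be needed instead.
    """
    if not reference_chunks:
        return 0.0
    # Normalize surrounding whitespace so trivial differences do not count as misses.
    retrieved = {chunk.strip() for chunk in retrieved_chunks}
    hits = sum(1 for chunk in reference_chunks if chunk.strip() in retrieved)
    return hits / len(reference_chunks)
```

A low value for a given question would suggest the retriever missed the passages the question was written from, which is the kind of signal the generated contexts could provide.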

diegodebrito commented 1 month ago

Can we leave this open so someone has a chance to comment with a use case or confirmation on that? I understand your point, @dosu, I'm just trying to get some information that might not be available in the sources you draw your responses from.

dosubot[bot] commented 1 month ago

This is a tough nut to crack, could really use your help on this, @jjmachan.

MariosKef commented 2 weeks ago

Is there any update on this, @jjmachan? I am also confused by this, as are several of my colleagues. Since the evaluation metrics emphasize the term "retrieved contexts", it would help to clarify this, because for the time being the generated contexts seem redundant.

MariosKef commented 1 week ago

Hi there, any update on this?