explodinggradients / ragas

Supercharge Your LLM Application Evaluations 🚀
https://docs.ragas.io
Apache License 2.0

[Question] why context is used in question generation prompt used in answer relevancy calculation #440

Closed praveenck06 closed 5 months ago

praveenck06 commented 10 months ago

Hello,

I've been exploring the methodology for calculating answer relevancy and noticed that the context is provided alongside the answer during the evaluation process.

As I understand it, the approach involves generating questions from the given answer using an LLM and then comparing these generated questions with the original question to assess relevancy.
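
For concreteness, here is a rough sketch of that comparison step (the generated questions and embedding model below are placeholders for illustration, not what RAGAS actually produces internally):

```python
# Sketch of the answer-relevancy idea: generate questions from the answer
# with an LLM (stubbed here), then score the mean cosine similarity between
# the original question and the generated ones.
import numpy as np
from sentence_transformers import SentenceTransformer  # illustrative embedder

original_question = "How do I deduct home office expenses?"
# In the real metric these come from an LLM prompted with the answer
# (and context); they are hard-coded here purely for illustration.
generated_questions = [
    "What expenses can I claim for a home office?",
    "How are home office costs deducted from taxes?",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
q_emb = model.encode([original_question])[0]
gen_embs = model.encode(generated_questions)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Answer relevancy ~ mean similarity of generated questions to the original.
score = np.mean([cosine(q_emb, g) for g in gen_embs])
print(round(score, 3))
```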

My question pertains to the necessity and impact of including context in this process. What is the rationale behind using context in the relevancy calculation, and how does it enhance the results?

To further investigate, I conducted an experiment where I modified the prompt to instruct the language model to disregard the provided context. The idea was to evaluate whether excluding context might be a more effective approach. Unfortunately, RAGAS doesn't currently allow excluding context without altering the prompt itself.
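
Roughly, the kind of prompt change I mean looks like this (paraphrased for illustration; this is not the exact RAGAS prompt text):

```python
# Illustrative only: the real prompt lives inside RAGAS' answer_relevancy
# metric; these strings just show the kind of change being tested.
PROMPT_WITH_CONTEXT = """Generate a question for the given answer.
answer: {answer}
context: {context}
question:"""

PROMPT_IGNORING_CONTEXT = """Generate a question for the given answer.
Ignore any retrieved context; rely only on the answer itself.
answer: {answer}
question:"""

example = {
    "answer": "You can deduct a home office if it is used exclusively for work.",
    "context": "IRS Publication 587 covers business use of your home ...",
}
print(PROMPT_IGNORING_CONTEXT.format(**example))
```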

Here are the results from my experiment on the FIQA dataset:

[image: comparison of generated-question similarity to the original question, with vs. without context]

From these findings, it appears that the questions generated without considering context are more closely aligned with the original questions than those generated with context. This raises the question: Is providing context to the language model truly beneficial for the purpose of calculating answer relevancy?

I would appreciate any insights or explanations regarding the use of context in this process. Also, if there is any possibility to exclude context without modifying the prompt in future iterations of RAGAS, that information would be valuable.

Thank you for your time and assistance.

shahules786 commented 10 months ago

Wow, your visualization looks great @praveenck06. Thanks for sharing your analysis; your results do indeed show a high correlation. The rationale behind using context in question generation is to ensure that it works for situations like keyphrase extraction. For example:

question: Extract the emotional keyphrases from the given context
answer: keyphrase1, keyphrase2
context: context containing keyphrase1 and keyphrase2

In situations similar to this, question generation can benefit from context. Does that make sense?

praveenck06 commented 10 months ago

@shahules786 Since we already have a separate metric for context relevancy, my worry is that if we pass the context for question generation, then answer relevancy becomes a combination of answer generation and context retrieval. Would it be better to have separate metrics for answer generation and context retrieval?

I also need help understanding the noncommittal part of the calculation.

shahules786 commented 10 months ago

@praveenck06 The noncommittal check is used to identify answers from the LLM like "I don't know about x", "My knowledge cutoff is y", etc. These answers should have zero relevancy.
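
Conceptually, the gating works like this (a sketch only; the field names are illustrative, not the exact ragas output schema):

```python
# Sketch of the noncommittal gate: if the LLM judge flags a generation as
# noncommittal ("I don't know ..."), the relevancy score drops to 0.
from dataclasses import dataclass

@dataclass
class GeneratedQuestion:
    question: str
    noncommittal: bool  # set by the LLM judge in the real metric

def answer_relevancy(similarities: list[float],
                     generations: list[GeneratedQuestion]) -> float:
    if any(g.noncommittal for g in generations):
        return 0.0  # evasive answers get zero relevancy
    return sum(similarities) / len(similarities)

gens = [GeneratedQuestion("What is x?", noncommittal=True)]
print(answer_relevancy([0.82], gens))  # -> 0.0
```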

shahules786 commented 10 months ago

Hi @praveenck06 feel free to close the issue if your concern was addressed.