Closed: gcy0926 closed this issue 3 months ago
@gcy0926 Hello there! I'm a bot designed to help you with bugs, questions, and contributions. I'm here to assist you while you wait for a human maintainer.
- Yes, you should index the documents and renew the contexts retrieved by your model when using the context recall metric and other retrieval metrics with your own embedding model [1] (see the sketch below).
- Yes, there are questions in the "explodinggradients/fiqa - ragas_eval" dataset whose answers depend on the context. Therefore, if you renew the context, you should also renew the ground truth [1].
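For a concrete picture, here is a minimal sketch of what renewing the contexts and re-scoring could look like. The HuggingFaceBgeEmbeddings model path and the retrieve_contexts helper are placeholders for your own fine-tuned model and retrieval/rerank stack, and the "baseline" split name follows the ragas quickstart, so verify both against your setup:

```python
from datasets import load_dataset
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from ragas import evaluate
from ragas.metrics import context_recall

# Your fine-tuned embedding model (the model path below is a placeholder)
embeddings = HuggingFaceBgeEmbeddings(model_name="path/to/your-finetuned-bge")

def retrieve_contexts(question: str, k: int = 3) -> list[str]:
    """Placeholder retriever: swap in your own index + bge-reranker here."""
    _ = embeddings.embed_query(question)  # where the custom embedding model plugs in
    return [f"<retrieved passage {i}>" for i in range(k)]

# Load the evaluation set and overwrite its contexts with freshly retrieved ones
fiqa_eval = load_dataset("explodinggradients/fiqa", "ragas_eval")["baseline"]
renewed = fiqa_eval.map(lambda row: {"contexts": retrieve_contexts(row["question"])})

# Retrieval metrics such as context recall are then computed on the renewed contexts
result = evaluate(renewed, metrics=[context_recall])
print(result)
```

Depending on your ragas version, evaluate also accepts llm and embeddings arguments if you want the judging itself to run on custom models rather than the defaults.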
To continue talking to Dosu, mention @dosu.
What's your suggestion on how to renew the ground truth answers? Using a larger LLM?
And another question: are the source docs for explodinggradients/fiqa ragas_eval in the corpus subset, which is 57.6k rows?
To renew ground truth answers, using a larger LLM is indeed a good approach. The ragas framework supports this by allowing you to replace the default OpenAI models with custom LLMs and embeddings: you can pass them when instantiating the TestsetGenerator and use them in the evaluation process. Here's an example:
```python
from langchain_together import Together
from langchain_together.embeddings import TogetherEmbeddings

from ragas.testset.generator import TestsetGenerator

# Initialize custom LLM and embeddings
together_key = "<your-key-here>"

embeddings = TogetherEmbeddings(model="togethercomputer/m2-bert-80M-8k-retrieval")

together_completion = Together(
    model="NousResearch/Nous-Hermes-2-Mixtral-8x7B-SFT",
    temperature=0.7,
    max_tokens=4000,
    top_k=1,
    together_api_key=together_key,
)

# Use custom LLM and embeddings in TestsetGenerator
generator = TestsetGenerator.from_langchain(
    generator_llm=together_completion,
    critic_llm=together_completion,
    embeddings=embeddings,
)
```
This setup allows you to leverage larger or more specialized LLMs for generating ground truth answers, ensuring the quality of your RAG pipeline [1].
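Once the generator is configured, one possible way to produce renewed question and ground-truth pairs from your own documents looks like the sketch below. The loader path, test_size, and distribution values are placeholders, and the exact API may differ slightly across ragas versions:

```python
from langchain_community.document_loaders import DirectoryLoader
from ragas.testset.evolutions import simple, reasoning, multi_context

# Load the source documents you indexed for retrieval (path is a placeholder)
documents = DirectoryLoader("path/to/your/docs").load()

# Generate questions and ground-truth answers with the custom LLM configured above
testset = generator.generate_with_langchain_docs(
    documents,
    test_size=10,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)
print(testset.to_pandas().head())
```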
Regarding your second question, to verify whether the source documents for the "explodinggradients/fiqa - ragas_eval" dataset are included in the corpus subset of 57.6k rows, you can compare document identifiers between the two datasets. Here is sample code to perform this check:
```python
import pandas as pd

# Load the corpus subset of 57.6k rows
corpus_df = pd.read_csv('path_to_corpus_subset.csv')

# Load the source documents from the "explodinggradients/fiqa - ragas_eval" dataset
source_documents = pd.read_csv('path_to_source_documents.csv')

# Check for inclusion
included = source_documents['document_id'].isin(corpus_df['document_id'])

# Verify if all source documents are included
all_included = included.all()
print(f"All source documents are included in the corpus subset: {all_included}")
```
Replace 'path_to_corpus_subset.csv' and 'path_to_source_documents.csv' with the actual paths to your datasets, and replace document_id with the column that uniquely identifies each document in your datasets [2].
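If you would rather check this directly against the Hugging Face Hub version of the dataset instead of local CSV files, a rough sketch follows. The config names come from the dataset card, but the split and column names ("doc", "contexts") are assumptions you should verify by printing the loaded datasets first:

```python
from datasets import load_dataset

# Load the ~57.6k-row corpus config and the ragas_eval config
corpus_dd = load_dataset("explodinggradients/fiqa", "corpus")
eval_dd = load_dataset("explodinggradients/fiqa", "ragas_eval")

# Take whichever split each config exposes (split names vary, so don't hard-code them)
corpus = corpus_dd[next(iter(corpus_dd))]
fiqa_eval = eval_dd[next(iter(eval_dd))]
print(corpus.column_names, fiqa_eval.column_names)

# Assumed column names: "doc" for corpus passages, "contexts" for evaluation contexts.
# Exact string matching is only an approximation if the contexts were re-chunked.
corpus_docs = set(corpus["doc"])
eval_contexts = {c for row in fiqa_eval["contexts"] for c in row}
missing = [c for c in eval_contexts if c not in corpus_docs]
print(f"{len(missing)} of {len(eval_contexts)} evaluation contexts were not found in the corpus")
```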
To continue talking to Dosu, mention @dosu.
It seems the issue was answered, closing this now.
- [ ] I checked the documentation and related resources and couldn't find an answer to my question.
Context
I will use my own retrieval and rerank models, such as a fine-tuned bge-embedding and bge-reranker. I want to test them on "explodinggradients/fiqa - ragas_eval", and maybe I will expand to my own dataset too.
Your Question