Open princepride opened 5 months ago
wow that is an embarrassing bug @princepride - thanks a lot for reporting that 😅
will fix it shortly
I tried to fix this bug in a pull request that is ready to be merged: https://github.com/explodinggradients/ragas/pull/880
The code in testset/generator.py confuses me:
Your Question
In the code, the current_nodes indices generated for each distribution probability always start from the beginning of the current_nodes list. Does this mean the accessed indices are concentrated in the front part of the list and the later parts are never accessed?
Code Examples
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

generator_llm = ChatOpenAI(model="gpt-3.5-turbo-16k")
critic_llm = ChatOpenAI(model="gpt-4")
embeddings = OpenAIEmbeddings()

generator = TestsetGenerator.from_langchain(
    generator_llm,
    critic_llm,
    embeddings
)

testset = generator.generate_with_langchain_docs(
    documents,
    test_size=10,
    distributions={simple: 0.5, reasoning: 0.3, multi_context: 0.2}
)
Using the sample code from the Ragas documentation, the simple test set will use randomly selected documents No. 1-5, the reasoning test set will use documents No. 1-3, and the multi_context test set will use documents No. 1-2. Documents No. 6-10 will never be used.
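To make the concern concrete, here is a minimal standalone sketch of the indexing pattern as I understand it. It does not call Ragas at all: `front_loaded_indices` and `offset_indices` are hypothetical helper names, the evolutions are stood in for by plain strings, and the running-offset variant is only one possible way to spread the selections, not necessarily what the library or the pull request does.

```python
def front_loaded_indices(test_size, distributions):
    """Indices touched when every evolution's loop restarts at index 0
    (an assumption about the behaviour described in this issue)."""
    return {
        name: list(range(round(probability * test_size)))
        for name, probability in distributions.items()
    }


def offset_indices(test_size, distributions):
    """Hypothetical alternative: keep a running offset so that each
    evolution gets its own, non-overlapping slice of the node list."""
    result = {}
    start = 0
    for name, probability in distributions.items():
        count = round(probability * test_size)
        # Clamp to test_size in case rounding pushes the total past the end.
        result[name] = list(range(start, min(start + count, test_size)))
        start += count
    return result


test_size = 10
# Plain strings stand in for the simple / reasoning / multi_context evolutions.
distributions = {"simple": 0.5, "reasoning": 0.3, "multi_context": 0.2}

front = front_loaded_indices(test_size, distributions)
used = set().union(*front.values())
unused = sorted(set(range(test_size)) - used)

spread = offset_indices(test_size, distributions)
covered = sorted(i for indices in spread.values() for i in indices)

print("front-loaded:", front)
print("never used (0-based):", unused)
print("with offset:", spread)
```

With the front-loaded pattern, 0-based indices 5 to 9 (documents No. 6-10 above) are never selected, while the offset variant touches every index exactly once.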