Open sarahzouinina opened 1 month ago
Hey @sarahzouinina! I'm here to help you with any bugs, questions, or contributions you have. Let's work on getting this issue sorted out together.
The error you're encountering, `ValueError: Node 5219d5c4-2785-4cee-96a0-75c5cd73b3a5 has no summary_embedding`, indicates that a node in your knowledge graph is missing the required `summary_embedding` property. In the Ragas framework, each node of type `NodeType.DOCUMENT` must have a `summary_embedding` so that cosine similarity between nodes can be calculated [1].
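To illustrate why a single missing embedding stops the whole run, here is a minimal sketch of a similarity step that fails in the same way. The `Node` class, `cosine_similarity`, and `summary_similarity` below are hypothetical stand-ins written for this example; they are not the Ragas implementation, only an approximation of the behavior described above.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a knowledge-graph node (not the Ragas class)."""
    id: str
    properties: dict = field(default_factory=dict)


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def summary_similarity(node_a, node_b):
    """Compare two nodes by their 'summary_embedding' property.

    Raises ValueError when either node lacks the property, mirroring
    the message reported in this issue.
    """
    vectors = []
    for node in (node_a, node_b):
        embedding = node.properties.get("summary_embedding")
        if embedding is None:
            raise ValueError(f"Node {node.id} has no summary_embedding")
        vectors.append(embedding)
    return cosine_similarity(vectors[0], vectors[1])
```

In this sketch, one node without the property is enough to raise the error, which is consistent with the failure appearing only after the embedding step was skipped or did not run for every document node.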
To resolve this issue, ensure that your documents are processed with the `EmbeddingExtractor` class configured as `summary_embedder` to generate the necessary embeddings. This step is crucial in the transformation pipeline to create the `summary_embedding` for each document node [1].
Regarding version compatibility, I could not find information about a specific Ragas version that addresses this issue directly. However, you should make sure you are using the latest version of Ragas, as it includes changes that affect the `TestsetGenerator` functionality, such as the introduction of a new Knowledge Graph and Transforms [2].
Make sure your setup aligns with these updates, and verify that all necessary transformations are applied to your documents before generating the test set.
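One way to verify this before generating the test set is to scan the nodes for the missing property. The helper below is a hedged sketch that assumes each node can be viewed as a plain dict with `id` and `properties` keys; that dict shape and the name `nodes_missing_property` are assumptions made for illustration, not part of the Ragas API.

```python
def nodes_missing_property(nodes, property_name="summary_embedding"):
    """Return the ids of nodes whose properties lack `property_name`.

    `nodes` is assumed to be an iterable of dicts shaped like
    {"id": ..., "properties": {...}} (a hypothetical layout).
    """
    missing = []
    for node in nodes:
        properties = node.get("properties", {})
        if properties.get(property_name) is None:
            missing.append(node["id"])
    return missing


# Example with made-up nodes: only the second one lacks the embedding.
example_nodes = [
    {"id": "doc-1", "properties": {"summary_embedding": [0.1, 0.2]}},
    {"id": "doc-2", "properties": {"summary": "text only"}},
]
missing_ids = nodes_missing_property(example_nodes)
```

If the returned list is non-empty, the embedding transformation did not run for those nodes, and generation would be expected to fail with the error above.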
To continue talking to Dosu, mention @dosu.
What specific version of Ragas are you using in the code you provided? Also, in the code you provided it looks like you never defined an embedding client.
[ ] I have checked the documentation and related resources and couldn't resolve my bug.
I was using the old version of Ragas to generate the evaluation data, but it is not working anymore:

```python
generator_llm = AzureChatOpenAI(model="gpt-4o", api_version="2024-02-01")
critic_llm = AzureChatOpenAI(model="gpt-4o", api_version="2024-02-01")

generator = TestsetGenerator.from_langchain(
    generator_llm,
    critic_llm,
    aoai_embeddings
)

# generate testset
testset = generator.generate_with_langchain_docs(documents, test_size=10, distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25})
testset_df = testset.to_pandas()
```
The issue is that even when I follow the new documentation and the example in it, it doesn't work: following the test generation steps in the README, I encounter this error: `ValueError: Node 5219d5c4-2785-4cee-96a0-75c5cd73b3a5 has no summary_embedding`
I need to deliver an evaluation set but I can't, because neither of these approaches is working. Can you tell me which version I should install to solve my issue?