explodinggradients / ragas

Supercharge Your LLM Application Evaluations 🚀
https://docs.ragas.io
Apache License 2.0

When using ragas 0.2, unable to apply transformation: Connection error #1496

Open Z-oo883 opened 1 month ago

Z-oo883 commented 1 month ago

When I try to use ragas 0.2, I encounter a problem: "unable to apply transformation: Connection error". What should I do to deal with it? Environment: ragas=0.2, python=3.9. Code:

from langchain.document_loaders import PyPDFLoader
from dotenv import load_dotenv
from langchain.embeddings import HuggingFaceEmbeddings

load_dotenv('.env')
loader = PyPDFLoader("xx.pdf")
docs = loader.load_and_split()

from ragas.llms import LangchainLLMWrapper
from langchain_openai import ChatOpenAI
chat = ChatOpenAI(
    model="Qwen2",
    temperature=0.3,
    openai_api_key="xxx",
    openai_api_base='xxx',
    stop=['<|im_end|>']
)
evaluator_llm = LangchainLLMWrapper(chat)
generator_llm = LangchainLLMWrapper(chat)

from ragas.testset import TestsetGenerator

generator = TestsetGenerator(llm=generator_llm)
dataset = generator.generate_with_langchain_docs(docs, testset_size=1)

dataset.to_pandas()

error:

Applying [SummaryExtractor, HeadlinesExtractor]:   0%|          | 0/2 [00:00<?, ?it/s]unable to apply transformation: Connection error.
unable to apply transformation: Connection error.
Applying EmbeddingExtractor:   0%|          | 0/1 [00:00<?, ?it/s]unable to apply transformation: node.property('summary') must be a string, found '<class 'NoneType'>'
Applying HeadlineSplitter:   0%|          | 0/1 [00:00<?, ?it/s]unable to apply transformation: 'headlines' property not found in this node
Applying [EmbeddingExtractor, KeyphrasesExtractor, TitleExtractor]:   0%|          | 0/3 [00:00<?, ?it/s]unable to apply transformation: Connection error.
Applying [EmbeddingExtractor, KeyphrasesExtractor, TitleExtractor]:  33%|███▎      | 1/3 [00:01<00:02,  1.43s/it]unable to apply transformation: Connection error.
unable to apply transformation: Connection error.
Applying CosineSimilarityBuilder:   0%|          | 0/1 [00:00<?, ?it/s]unable to apply transformation: Node 03722fc8-bc34-470b-8b5d-bcdebc192a37 has no embedding
Traceback (most recent call last):
  File "D:\RA_LLM\pythonProject\generate_test_data.py", line 129, in <module>
    dataset = generator.generate_with_langchain_docs(docs, testset_size=1)
  File "D:\anaconda\envs\ragas\lib\site-packages\ragas\testset\synthesizers\generate.py", line 89, in generate_with_langchain_docs
    apply_transforms(kg, transforms)
  File "D:\anaconda\envs\ragas\lib\site-packages\ragas\testset\transforms\engine.py", line 102, in apply_transforms
    transform.generate_execution_plan(kg),
  File "D:\anaconda\envs\ragas\lib\site-packages\ragas\testset\transforms\base.py", line 325, in generate_execution_plan
    filtered_kg = self.filter(kg)
  File "D:\anaconda\envs\ragas\lib\site-packages\ragas\testset\transforms\relationship_builders\cosine.py", line 122, in filter
    raise ValueError(f"Node {node.id} has no {self.property_name}")
ValueError: Node 03722fc8-bc34-470b-8b5d-bcdebc192a37 has no summary_embedding

dosubot[bot] commented 1 month ago

The "Connection error" you're encountering when applying transformations in the ragas library could be due to several reasons, such as network issues, incorrect API keys, or misconfigured endpoints. Here are some steps you can take to troubleshoot and resolve the issue:

  1. Check API Key and Endpoint: Ensure that your OpenAI API key and base URL are correctly set. Double-check the values in your .env file and make sure they are loaded properly.

  2. Network Connectivity: Verify that your network connection is stable and that there are no firewall or proxy settings blocking the connection to the OpenAI API.

  3. Debugging Mode: Enable debugging mode in ragas to get more detailed logs. Set the environment variable __RAGAS_DEBUG_TRACKING to True to enable detailed logging, which might provide more insights into the connection issues.

  4. Error Handling: The ragas library uses a decorator to handle errors silently. You might want to modify the code to raise exceptions for debugging purposes. This can help you identify the exact point of failure.

  5. Review Logs: Check the logs for any additional error messages or stack traces that might indicate the source of the problem.

  6. Test with Simple Setup: Try running a minimal setup with a simple document and transformation to see if the issue persists. This can help isolate the problem to specific parts of your code or configuration.

If the issue continues, consider testing with a different API or service to rule out specific service-related issues.
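
Steps 1 and 2 above can be checked before running ragas at all. The following is a minimal sketch using only the Python standard library; `endpoint_reachable` is a hypothetical helper name, not part of ragas or LangChain. It only tests whether a TCP connection to the host in the base URL can be opened, so it says nothing about whether the API key is valid or whether the server speaks the OpenAI protocol.

```python
import socket
from urllib.parse import urlparse


def endpoint_reachable(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the host in base_url succeeds.

    Hypothetical helper for illustration; not part of ragas or LangChain.
    """
    parsed = urlparse(base_url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        # A placeholder such as 'xxx' has no scheme or host and can never connect.
        return False
    try:
        # Fall back to the scheme's standard port when none is given.
        port = parsed.port or (443 if parsed.scheme == "https" else 80)
        with socket.create_connection((parsed.hostname, port), timeout=timeout):
            return True
    except (OSError, ValueError):
        # OSError: refused, timed out, or DNS failure. ValueError: malformed port.
        return False
```

Calling it with the same value that is passed as `openai_api_base` is the intended use. A result of `False` points to a wrong URL, a firewall, or a proxy rather than to ragas itself.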


hunter-walden2113 commented 1 month ago

@Z-oo883 I was dealing with the same problem. It seems that in version 0.2.0 they did some refactoring and removed the embedding functionality that used to be part of the TestsetGenerator class in previous versions. I don't see a way to pass an embedding model to TestsetGenerator in 0.2.0. I had to downgrade to version 0.1.21 to get it to work. With the new 0.2.0 version, I am not sure how you are supposed to create a test set with TestsetGenerator if you can't either actively embed your text or pass in embeddings somewhere earlier in the pipeline.

nbarnett19 commented 1 month ago

@Z-oo883 I have the same issue.

ERROR:ragas.testset.transforms.engine:unable to apply transformation: Connection error.

Also facing other errors in the process:

ERROR:ragas.prompt.pydantic_prompt:Prompt fix_output_format failed to parse output: The output parser failed to parse the output after 0 retries.
ERROR:ragas.prompt.pydantic_prompt:Prompt headlines_extractor_prompt failed to parse output: The output parser failed to parse the output after 0 retries.
ERROR:ragas.testset.transforms.engine:unable to apply transformation: The output parser failed to parse the output after 0 retries.
ERROR:ragas.prompt.pydantic_prompt:Prompt fix_output_format failed to parse output: The output parser failed to parse the output after 0 retries.
ERROR:ragas.prompt.pydantic_prompt:Prompt headlines_extractor_prompt failed to parse output: The output parser failed to parse the output after 0 retries.
ERROR:ragas.testset.transforms.engine:unable to apply transformation: The output parser failed to parse the output after 0 retries.
ERROR:ragas.testset.transforms.engine:unable to apply transformation: 'headlines' property not found in this node
ERROR:ragas.testset.transforms.engine:unable to apply transformation: 'headlines' property not found in this node

I am following the updated how-to tutorial for test set generation, without luck. I will try downgrading as @hunter-walden2113 suggested.

hunter-walden2113 commented 1 month ago

This issue is caused by part of the code behind the generate_with_langchain_docs method. That function invokes a default LLM in the background, for which there is no API key in your environment. There are many open issues about this problem, and a fix is being worked on. In the meantime, you can downgrade to the previous Ragas version to generate your test set.
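
One way to check the explanation above is to look at what a default OpenAI client would find in the environment. The sketch below assumes the background default reads the standard `OPENAI_API_KEY` and `OPENAI_BASE_URL` variables, which the OpenAI Python client uses when no explicit values are given; whether ragas 0.2.0 takes exactly this path is an assumption based on the comment above. `default_openai_env` is a hypothetical helper name. Note that values passed directly to `ChatOpenAI(openai_api_key=..., openai_api_base=...)` do not set these variables, so a client created elsewhere would not see them.

```python
import os
from typing import Dict, Mapping, Optional


def default_openai_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, bool]:
    """Report which variables a default OpenAI client would read are set.

    Hypothetical helper for illustration; not part of ragas.
    """
    env = os.environ if env is None else env
    names = ("OPENAI_API_KEY", "OPENAI_BASE_URL")
    # Treat empty or whitespace-only values as missing.
    return {name: bool(env.get(name, "").strip()) for name in names}


if __name__ == "__main__":
    for name, is_set in default_openai_env().items():
        print(f"{name}: {'set' if is_set else 'missing'}")
```

If `OPENAI_API_KEY` is reported as missing while the wrapped `ChatOpenAI` instance works on its own, that is consistent with a default client being created behind the scenes.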