explodinggradients / ragas

Supercharge Your LLM Application Evaluations 🚀
https://docs.ragas.io
Apache License 2.0

LangChain v0.3 not supported. For example, TestsetGenerator raises ExceptionInRunner with LangChain v0.3. #1328

Closed os1ma closed 2 months ago

os1ma commented 2 months ago

Describe the bug

LangChain recently released v0.3. When using LangChain v0.3, TestsetGenerator raises an ExceptionInRunner.

Since v0.3, LangChain internally uses pydantic v2, while ragas still uses langchain_core.pydantic_v1. This mismatch might be the cause of the error.

The LangChain v0.3 migration guide is here: https://python.langchain.com/docs/versions/v0_3/

Using LangChain v0.3 will likely surface numerous other errors as well, so ragas needs to be updated to be compatible with LangChain v0.3.

Versions

I encountered this error on Google Colab.

!python --version
Python 3.10.12
!pip install langchain-core==0.3.0 langchain-openai==0.2.0 \
    langchain-community==0.3.0 ragas==0.1.14 nest-asyncio==1.6.0

ragas 0.1.18 (latest) also raises the same error.

Code to Reproduce

import os
from google.colab import userdata

os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")
import nest_asyncio
from langchain_core.documents import Document
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from ragas.testset.evolutions import multi_context, reasoning, simple
from ragas.testset.generator import TestsetGenerator

documents = [Document(page_content="sample", metadata={"source": "sample"})]

for document in documents:
    document.metadata["filename"] = document.metadata["source"]

nest_asyncio.apply()

generator = TestsetGenerator.from_langchain(
    generator_llm=ChatOpenAI(model="gpt-4o-mini"),
    critic_llm=ChatOpenAI(model="gpt-4o-mini"),
    embeddings=OpenAIEmbeddings(),
)

testset = generator.generate_with_langchain_docs(
    documents,
    test_size=4,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)

Error trace

/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:132: UserWarning: Field "model_name" in _VertexAIBase has conflict with protected namespace "model_".

You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:132: UserWarning: Field "model_name" in _VertexAICommon has conflict with protected namespace "model_".

You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/ragas/metrics/__init__.py:1: LangChainDeprecationWarning: As of langchain-core 0.3.0, LangChain uses pydantic v2 internally. The langchain_core.pydantic_v1 module was a compatibility shim for pydantic v1, and should no longer be used. Please update the code to import from Pydantic directly.

For example, replace imports like: `from langchain_core.pydantic_v1 import BaseModel`
with: `from pydantic import BaseModel`
or the v1 compatibility namespace if you are working in a code base that has not been fully upgraded to pydantic 2 yet.     from pydantic.v1 import BaseModel

  from ragas.metrics._answer_correctness import AnswerCorrectness, answer_correctness
/usr/local/lib/python3.10/dist-packages/ragas/metrics/__init__.py:4: LangChainDeprecationWarning: As of langchain-core 0.3.0, LangChain uses pydantic v2 internally. The langchain.pydantic_v1 module was a compatibility shim for pydantic v1, and should no longer be used. Please update the code to import from Pydantic directly.

For example, replace imports like: `from langchain.pydantic_v1 import BaseModel`
with: `from pydantic import BaseModel`
or the v1 compatibility namespace if you are working in a code base that has not been fully upgraded to pydantic 2 yet.     from pydantic.v1 import BaseModel

  from ragas.metrics._context_entities_recall import (
---------------------------------------------------------------------------
ExceptionInRunner                         Traceback (most recent call last)
<ipython-input-3-e09f8b7c4199> in <cell line: 14>()
     12 )
     13 
---> 14 testset = generator.generate_with_langchain_docs(
     15     documents,
     16     test_size=4,

2 frames
/usr/local/lib/python3.10/dist-packages/ragas/testset/docstore.py in add_nodes(self, nodes, show_progress)
    251         results = executor.results()
    252         if not results:
--> 253             raise ExceptionInRunner()
    254 
    255         for i, n in enumerate(nodes):

ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass `raise_exceptions=False` incase you want to show only a warning message instead.

Expected behavior

Testset generation succeeds without any error.

Additional context

Based on my investigation, at least the default values of the Document and Node classes in ragas/testset/docstore.py behave incorrectly.

https://github.com/explodinggradients/ragas/blob/c40891bf168de3124c845b75af31ed557eb79709/src/ragas/testset/docstore.py#L31-L33

https://github.com/explodinggradients/ragas/blob/c40891bf168de3124c845b75af31ed557eb79709/src/ragas/testset/docstore.py#L82-L86

For example, Document.embedding is supposed to default to None. However, for some reason an instance of a Field object is set instead, so the condition "n.embedding is None" in the following section evaluates to False, leading to incorrect behavior.

https://github.com/explodinggradients/ragas/blob/c40891bf168de3124c845b75af31ed557eb79709/src/ragas/testset/docstore.py#L233

It seems that using langchain_core.pydantic_v1 also breaks the default values of Fields in classes that inherit from BaseModel.
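
A minimal standalone sketch of the mechanism I suspect (hypothetical code, not ragas itself): assigning a pydantic v1 Field on a pydantic v2 model means v2 does not recognize it as a field specification, so the FieldInfo object itself becomes the default value.

# Hypothetical sketch of the suspected v1/v2 mix-up (not ragas code).
import typing as t

from pydantic import BaseModel            # pydantic v2, as used by langchain-core>=0.3
from pydantic.v1 import Field as FieldV1  # what langchain_core.pydantic_v1 now shims to

class Node(BaseModel):
    # pydantic v2 does not recognize the v1 FieldInfo as a field spec,
    # so it is treated as a plain default value.
    embedding: t.Any = FieldV1(default=None)

n = Node()
print(n.embedding)          # FieldInfo(default=None, ...) -- not None
print(n.embedding is None)  # False: the docstore thinks an embedding already exists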

Related issue

The same error is described in the following issue.

https://github.com/explodinggradients/ragas/issues/1319

dosubot[bot] commented 2 months ago

Hello, @os1ma! I'm a bot designed to help you solve bugs, answer questions, and become a contributor. I'm here to assist you while waiting for a human maintainer.

I found similar issues related to the ExceptionInRunner error when using LangChain v0.3 with Ragas. Here are the relevant discussions and solutions:

  1. Open Issue: The runner thread which was running the jobs raised an exception

    • Solution: Pass the parameter raise_exceptions=False to show only a warning message instead of raising an exception. This can help you debug by surfacing a warning without stopping execution (a sketch follows this list).
  2. Closed Issue: ExceptionInRunner: The runner thread which was running the jobs raised an exception

    • Solution: Install and apply nest-asyncio before your Ragas calls:
      pip install nest-asyncio
      import nest_asyncio
      nest_asyncio.apply()
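
A sketch of the first option applied to the reproduction code above (raise_exceptions is the parameter named in the error message itself):

# Sketch: show a warning instead of raising ExceptionInRunner, so the
# underlying traceback from the runner thread can be inspected.
testset = generator.generate_with_langchain_docs(
    documents,
    test_size=4,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
    raise_exceptions=False,
)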

To resolve the ExceptionInRunner error in TestsetGenerator when using LangChain v0.3, you need to update the imports to use Pydantic v2 directly. Here are the steps:

  1. Update Imports in docstore.py:

    from pydantic import Field
  2. Update TestsetGenerator Class: Ensure that all Pydantic models and their usage are updated to be compatible with Pydantic v2. Here is an example:

    from pydantic import BaseModel

    class TestsetGenerator(BaseModel):
        generator_llm: BaseRagasLLM
        critic_llm: BaseRagasLLM
        embeddings: BaseRagasEmbeddings
        docstore: DocumentStore

        @classmethod
        def from_langchain(
            cls,
            generator_llm: LangchainLLM,
            critic_llm: LangchainLLM,
            embeddings: LangchainEmbeddings,
            docstore: t.Optional[DocumentStore] = None,
            run_config: RunConfig = RunConfig(),
            chunk_size: int = 1024,
        ) -> "TestsetGenerator":
            generator_llm_model = LangchainLLMWrapper(generator_llm)
            critic_llm_model = LangchainLLMWrapper(critic_llm)
            embeddings_model = LangchainEmbeddingsWrapper(embeddings)

            keyphrase_extractor = KeyphraseExtractor(llm=generator_llm_model)
            if docstore is None:
                from langchain.text_splitter import TokenTextSplitter

                splitter = TokenTextSplitter(chunk_size=chunk_size, chunk_overlap=0)
                docstore = InMemoryDocumentStore(
                    splitter=splitter,
                    embeddings=embeddings_model,
                    extractor=keyphrase_extractor,
                    run_config=run_config,
                )
            return cls(
                generator_llm=generator_llm_model,
                critic_llm=critic_llm_model,
                embeddings=embeddings_model,
                docstore=docstore,
            )

        @classmethod
        def from_llama_index(
            cls,
            generator_llm: LlamaindexLLM,
            critic_llm: LlamaindexLLM,
            embeddings: LlamaIndexEmbeddings,
            docstore: t.Optional[DocumentStore] = None,
            run_config: RunConfig = RunConfig(),
        ) -> "TestsetGenerator":
            generator_llm_model = LlamaIndexLLMWrapper(generator_llm)
            critic_llm_model = LlamaIndexLLMWrapper(critic_llm)
            embeddings_model = LlamaIndexEmbeddingsWrapper(embeddings)
            keyphrase_extractor = KeyphraseExtractor(llm=generator_llm_model)
            if docstore is None:
                from langchain.text_splitter import TokenTextSplitter

                splitter = TokenTextSplitter(chunk_size=1024, chunk_overlap=0)
                docstore = InMemoryDocumentStore(
                    splitter=splitter,
                    embeddings=embeddings_model,
                    extractor=keyphrase_extractor,
                    run_config=run_config,
                )
            return cls(
                generator_llm=generator_llm_model,
                critic_llm=critic_llm_model,
                embeddings=embeddings_model,
                docstore=docstore,
            )
  3. Update Other Imports: Ensure that other parts of the codebase also import from Pydantic directly. For example, in prompts.py and _answer_relevance.py:

    from pydantic import BaseModel

By making these updates, you should be able to resolve the ExceptionInRunner error and ensure compatibility with LangChain v0.3 and Pydantic v2 [1][2][3][4].

To continue talking to Dosu, mention @dosu.

frieda-huang commented 2 months ago

I'm having the same issue as well!

wanjeakshay commented 2 months ago

@frieda-huang Which file type are you using for test data generation? Is it PDF?

shahules786 commented 2 months ago

Hey, we will take a look at this soon. The CI is also failing for the same reason. For now, please keep using langchain<0.3 with ragas. We are also preparing to release a new and improved version of test data generation in the v0.2 release (in 2 weeks): https://github.com/explodinggradients/ragas/pull/1321
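
For example, one possible set of pins (version bounds are illustrative, not an official requirements list):

pip install "ragas<0.2" "langchain-core<0.3" "langchain-community<0.3" "langchain-openai<0.2"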

jjmachan commented 2 months ago

Thanks a lot for bringing this up 🙌🏽. LangChain v0.3 will break the current usage of metrics, so the plan of action is as follows:

  • for ragas<0.2 we will pin langchain_core to <0.3
  • for ragas>0.2 we will depend directly on pydantic>=2

We will also cut a new release with the updated dependency.

frieda-huang commented 2 months ago

Yes.


wanjeakshay commented 2 months ago

@frieda-huang I have also been stuck on the same thing for two days. I don't know whether ragas supports PDF ingestion for test data generation. Convert the PDF to a .txt file and try that; it works. Something like the sketch below is what I mean.
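
# Sketch of the PDF -> .txt workaround. pypdf is one option; any text
# extractor would do, and the file names here are placeholders.
from pypdf import PdfReader

reader = PdfReader("paper.pdf")
text = "\n".join(page.extract_text() or "" for page in reader.pages)
with open("paper.txt", "w") as f:
    f.write(text)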

frieda-huang commented 2 months ago

@frieda-huang I have also been stuck on the same thing for two days. I don't know whether ragas supports PDF ingestion for test data generation. Convert the PDF to a .txt file and try that; it works.

Test data generation on PDFs works for me, but it's super slow. I'm generating test data from 3 papers out of CVPR 2019 Papers. I'm using open-source models, so the output is not ideal but somewhat useful.

My code looks something like this:

from langchain_community.document_loaders import DirectoryLoader
from langchain_ollama import OllamaLLM
from ragas.testset.evolutions import multi_context, reasoning, simple
from ragas.testset.generator import TestsetGenerator
from langchain_community.embeddings import HuggingFaceInferenceAPIEmbeddings

loader = DirectoryLoader(DATA_DIR)
docs = loader.load()

llm_model = "llama3.1:8b-instruct-fp16"
llm = OllamaLLM(model=llm_model)

embed_model = "sentence-transformers/all-MiniLM-l6-v2"
embeddings = HuggingFaceInferenceAPIEmbeddings(
    api_key=HF_API_KEY,
    model_name=embed_model,
)

generator = TestsetGenerator.from_langchain(
    generator_llm=llm, critic_llm=llm, embeddings=embeddings
)

testset = generator.generate_with_langchain_docs(
    docs,
    test_size=20,
    distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25},
)

df = testset.to_pandas()
df.to_csv("testset_output.csv", index=False)

The output looks like the following:

learning rule defined as a function of the perceptual prediction error defined in Section 3.2 and is defined as\n\nλlearn = \uf8f1 \uf8f4\uf8f2\n\n− t λinit, EP (t) > µe t λinit, EP (t) < µe otherwise\n\n∆ ∆+ λinit,\n\n\uf8f4\uf8f3\n\n− t , ∆+\n\nt and λinit refer to the scaling of the learning where ∆ rate in the negative direction, positive direction and the ini- t2 tial learning rate respectively and µe = 1 t1 EP dEP . The learning rate is adjusted based on the quality of the predictions characterized by the perceptual prediction er- ror between a temporal sequence between times t1 and t2, typically defined by the gating signal.. The impact of the adaptive changes to the learning rate is shown in the quan- titative evaluation Section 4.4, where the adaptive learn- ing scheme shows improvement of up to 20% compared to training without the learning scheme.\n\nt2−t1\n\nR\n\n3.5.

For context, I also downgraded langchain, langchain-ollama, and langchain-huggingface:

langchain = "0.2.11"
langchain-ollama = "0.1.3"
langchain-huggingface = "0.0.3"

wanjeakshay commented 2 months ago

@frieda-huang Did you face a connection timeout issue while running the models?

ConnectError: All connection attempts failed

The above exception was the direct cause of the following exception:

ConnectError                              Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py in map_httpcore_exceptions()
     87
     88     message = str(exc)
---> 89     raise mapped_exc(message) from exc
     90
     91

ConnectError: All connection attempts failed

frieda-huang commented 2 months ago

@frieda-huang Did you face a connection timeout issue while running the models?

ConnectError: All connection attempts failed

The above exception was the direct cause of the following exception:

ConnectError                              Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py in map_httpcore_exceptions()
     87
     88     message = str(exc)
---> 89     raise mapped_exc(message) from exc
     90
     91

ConnectError: All connection attempts failed

I don't have that issue, but it's behaving very weirdly. I'm now generating 50 tests based on 10 papers. I let my laptop (Apple M2) run throughout the night, and it's still at 32% of generation. It sometimes tells me generation failed and returns None, then makes more progress. Now I just hope nothing in my code leads to a crash. I'm also caching the embeddings, but it doesn't seem to improve performance.
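
For reference, one way to set up embedding caching with LangChain (a sketch reusing the embeddings and embed_model names from the snippet above; not necessarily the exact setup used here):

# Sketch of embedding caching (illustrative; reuses `embeddings` and
# `embed_model` from the earlier snippet in this thread).
from langchain.embeddings import CacheBackedEmbeddings
from langchain.storage import LocalFileStore

store = LocalFileStore("./embedding_cache/")
cached_embeddings = CacheBackedEmbeddings.from_bytes_store(
    embeddings, store, namespace=embed_model
)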

frieda-huang commented 2 months ago

@frieda-huang Did you face a connection timeout issue while running the models?

ConnectError: All connection attempts failed

The above exception was the direct cause of the following exception:

ConnectError                              Traceback (most recent call last)
/usr/local/lib/python3.10/dist-packages/httpx/_transports/default.py in map_httpcore_exceptions()
     87
     88     message = str(exc)
---> 89     raise mapped_exc(message) from exc
     90
     91

ConnectError: All connection attempts failed

New Update:

I got the following error after 10 hours of generation :(

    result = await callable(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/friedahuang/Documents/csye7230/.venv/lib/python3.12/site-packages/ragas/testset/evolutions.py", line 143, in evolve
    ) = await self._aevolve(current_tries, current_nodes)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/friedahuang/Documents/csye7230/.venv/lib/python3.12/site-packages/ragas/testset/evolutions.py", line 554, in _aevolve
    result = await self._acomplex_evolution(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/friedahuang/Documents/csye7230/.venv/lib/python3.12/site-packages/ragas/testset/evolutions.py", line 411, in _acomplex_evolution
    return await self.aretry_evolve(current_tries, current_nodes)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/friedahuang/Documents/csye7230/.venv/lib/python3.12/site-packages/ragas/testset/evolutions.py", line 121, in aretry_evolve
    return await self._aevolve(current_tries, current_nodes)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/friedahuang/Documents/csye7230/.venv/lib/python3.12/site-packages/ragas/testset/evolutions.py", line 554, in _aevolve
    result = await self._acomplex_evolution(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/friedahuang/Documents/csye7230/.venv/lib/python3.12/site-packages/ragas/testset/evolutions.py", line 382, in _acomplex_evolution
    simple_question, current_nodes, _ = await self.se._aevolve(
                                        ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/friedahuang/Documents/csye7230/.venv/lib/python3.12/site-packages/ragas/testset/evolutions.py", line 298, in _aevolve
    passed = await self.node_filter.filter(merged_node)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/friedahuang/Documents/csye7230/.venv/lib/python3.12/site-packages/ragas/testset/filters.py", line 60, in filter
    output["score"] = sum(output.values()) / len(output.values())
                      ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
ZeroDivisionError: division by zero
sys:1: RuntimeWarning: coroutine 'Executor.wrap_callable_with_index.<locals>.wrapped_callable_async' was never awaited
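
The failing line divides by len(output.values()), which is zero when the filter's LLM output parses to an empty dict. A defensive guard would look roughly like this (sketch only, not the actual ragas code):

# Sketch of a guard for the failing division (illustrative, not the ragas fix).
def safe_score(output: dict) -> float:
    values = list(output.values())
    # Avoid ZeroDivisionError when the filter output is empty.
    return sum(values) / len(values) if values else 0.0

print(safe_score({}))                              # 0.0 instead of a crash
print(safe_score({"clarity": 1.0, "depth": 0.0}))  # 0.5
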
jjmachan commented 2 months ago

Hey @frieda-huang @wanjeakshay, we are really sorry about this, but it looks like something broke in the testset generation part.

We have a new version of this coming soon (#1321), and sadly I would suggest you wait for that 🙁

frieda-huang commented 2 months ago

Hey @frieda-huang @wanjeakshay, we are really sorry about this, but it looks like something broke in the testset generation part.

We have a new version of this coming soon (#1321), and sadly I would suggest you wait for that 🙁

Thank you, @jjmachan! That would be very much appreciated! Do we know when the new version will be released?

jjmachan commented 2 months ago

We are aiming for the end of this month 🤞🏽