explodinggradients / ragas

Evaluation framework for your Retrieval Augmented Generation (RAG) pipelines
https://docs.ragas.io
Apache License 2.0

Error in Testset Generation - ExceptionInRunner: The runner thread which was running the jobs raised an exeception. #735

Open jaymon0703 opened 3 months ago

jaymon0703 commented 3 months ago

[X] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug

test_generator.generate_with_langchain_docs() raises a RuntimeError.

Please note I am using VertexAI models.

RuntimeError: Task <Task pending name='Task-13277' coro=<BaseChatModel._agenerate_with_cache() running at c:\Users\MASKED\.venv\lib\site-packages\langchain_core\language_models\chat_models.py:617> cb=[gather.<locals>._done_callback() at C:\Users\MASKED\AppData\Local\Programs\Python\Python310\lib\asyncio\tasks.py:718]> got Future <Task pending name='Task-13278' coro=<UnaryUnaryCall._invoke() running at c:\Users\MASKED\.venv\lib\site-packages\grpc\aio\_call.py:566>> attached to a different loop
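
For context, the "attached to a different loop" part of this message is a generic asyncio failure rather than something specific to Ragas or VertexAI: it is raised when a task running on one event loop awaits a future that belongs to another loop. The sketch below reproduces that condition with the standard library only. It is a minimal illustration of the error class, not a diagnosis of where the second loop comes from in this setup; the `reproduce` helper is a hypothetical name.

```python
import asyncio


def reproduce() -> str:
    """Await a future owned by loop_a from a task running on loop_b."""
    loop_a = asyncio.new_event_loop()
    loop_b = asyncio.new_event_loop()
    try:
        # This future is bound to loop_a at creation time.
        foreign_future = loop_a.create_future()

        async def wait_on_foreign_future():
            # The task executing this coroutine runs on loop_b, so asyncio
            # rejects the await with a RuntimeError.
            await foreign_future

        try:
            loop_b.run_until_complete(wait_on_foreign_future())
        except RuntimeError as exc:
            return str(exc)
        return "no error"
    finally:
        loop_a.close()
        loop_b.close()


if __name__ == "__main__":
    print(reproduce())
```

The returned message has the same shape as the traceback above: a task on one loop "got Future ... attached to a different loop".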

Ragas version: 0.1.3
Python version: 3.10.0

Code to Reproduce

from langchain.document_loaders import DirectoryLoader
loader = DirectoryLoader("<your_directory_here>")
documents = loader.load()

from google.cloud import aiplatform
from langchain.llms import VertexAI
from langchain.chat_models import ChatVertexAI
from langchain.chains import APIChain
from langchain.embeddings import VertexAIEmbeddings

aiplatform.init(
    # your Google Cloud Project ID or number
    # environment default used if not set
    project='<your_project_id_here>',
    location="us-central1"
)

# create Langchain LLM and Embeddings
ragas_vertexai_llm = ChatVertexAI()
vertexai_embeddings = VertexAIEmbeddings()

from ragas.llms import LangchainLLMWrapper
from ragas.embeddings.base import LangchainEmbeddingsWrapper

ragas_vertexai_llm = LangchainLLMWrapper(ragas_vertexai_llm)
vertexai_embeddings = LangchainEmbeddingsWrapper(vertexai_embeddings)

from ragas.testset.extractor import KeyphraseExtractor
from langchain.text_splitter import TokenTextSplitter
from ragas.testset.docstore import InMemoryDocumentStore

splitter = TokenTextSplitter(chunk_size=1000, chunk_overlap=100)
keyphrase_extractor = KeyphraseExtractor(llm=ragas_vertexai_llm)

docstore = InMemoryDocumentStore(
    splitter=splitter,
    embeddings=vertexai_embeddings,
    extractor=keyphrase_extractor,
)

from ragas.testset import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context

test_generator = TestsetGenerator(
    generator_llm=ragas_vertexai_llm,
    critic_llm=ragas_vertexai_llm,
    embeddings=vertexai_embeddings,
    docstore=docstore,
)

distributions = {simple: 0.5, reasoning: 0.25, multi_context: 0.25}

# use generator.generate_with_llamaindex_docs if you use llama-index as document loader
testset = test_generator.generate_with_langchain_docs(
    documents=documents, test_size=10, distributions=distributions
)

shahules786 commented 3 months ago

Can you share the OS info too, @jaymon0703? Are you facing this issue every time?

jaymon0703 commented 3 months ago

Thank you, @shahules786.

Error manifests every time.

{'platform': 'Windows', 'platform-release': '10', 'platform-version': '10.0.19045', 'architecture': 'AMD64', 'processor': 'Intel64 Family 6 Model 142 Stepping 12, GenuineIntel', 'ram': '16 GB'}
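
For anyone else asked for the same details, a dictionary like the one above can be assembled with the standard `platform` module. This is a minimal sketch: `collect_os_info` is a hypothetical helper name, and the `ram` field is left out because the standard library has no portable way to read it.

```python
import platform


def collect_os_info() -> dict:
    """Gather basic OS details using only the standard library."""
    uname = platform.uname()
    return {
        "platform": uname.system,
        "platform-release": uname.release,
        "platform-version": uname.version,
        "architecture": uname.machine,
        "processor": uname.processor,
    }


if __name__ == "__main__":
    print(collect_os_info())
```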

HRUSHI1212 commented 3 months ago

I am also facing the same issue, and I am using GPT models with Python 3.10 on Colab.

jaymon0703 commented 3 months ago

If anyone has VertexAI examples for testset generation and evaluation, it would be great to see them. I am currently getting errors on both, with the error above for testset generation.

bdeck8317 commented 3 months ago

I am also getting this same error. Any resolution? @jaymon0703

HRUSHI1212 commented 3 months ago

When using GPT, use ragas==0.1.4 and langchain==0.1.5. That worked for me.

jaymon0703 commented 3 months ago

Sorry, what is the "GPT" you are referring to, @HRUSHI1212?

HRUSHI1212 commented 3 months ago

> Sorry, what is the "GPT" you are referring to, @HRUSHI1212?

While generating the synthetic dataset, I am using GPT-4 and GPT-3.5 Turbo (the default one), with the versions ragas==0.1.4 and langchain==0.1.5.
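
To confirm that an environment actually has those versions installed, something like the following can help. It is a rough sketch using `importlib.metadata` from the standard library; `check_pins` is a hypothetical helper, and the pinned versions are simply the ones mentioned in this comment, not a verified fix.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pins(pins: dict) -> dict:
    """Map each package to (wanted, installed, matches)."""
    report = {}
    for name, wanted in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            installed = None
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    # Versions reported as working in this thread.
    for name, row in check_pins({"ragas": "0.1.4", "langchain": "0.1.5"}).items():
        print(name, row)
```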

jaymon0703 commented 2 months ago

Hey @shahules786, is it possible to update the VertexAI docs with working code for generating a testset? Thanks!

wahidur028 commented 1 month ago

Hi there,

I am also facing the same error: "The runner thread which was running the jobs raised an exception."

When I use two different models for generator_llm and critic_llm and set test_size=2, the code works. However, in other cases, such as using the same model for both generator_llm and critic_llm or setting test_size to 3, 5, or 10, the code breaks and generates the error.
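
To narrow down which settings fail, a small harness like this can record the outcome per `test_size`. It is only a sketch: `sweep_test_sizes` and `fake_generate` are hypothetical names, and `fake_generate` is a stand-in that mimics the behaviour described above. In practice, it would be replaced by a function that calls `generate_with_langchain_docs` with the given `test_size`.

```python
from typing import Callable, Dict, Iterable


def sweep_test_sizes(
    generate: Callable[[int], object], sizes: Iterable[int]
) -> Dict[int, str]:
    """Call generate(size) for each size and record success or failure."""
    results = {}
    for size in sizes:
        try:
            generate(size)
        except Exception as exc:
            results[size] = f"failed: {type(exc).__name__}: {exc}"
        else:
            results[size] = "ok"
    return results


def fake_generate(test_size: int) -> list:
    """Stand-in generator (assumption): fails for sizes above 2."""
    if test_size > 2:
        raise RuntimeError("runner thread raised an exception")
    return ["sample"] * test_size


if __name__ == "__main__":
    for size, outcome in sweep_test_sizes(fake_generate, [2, 3, 5, 10]).items():
        print(size, outcome)
```

Posting the resulting table together with the model combination used makes the failure pattern easier to compare across setups.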