explodinggradients / ragas

Evaluation library for your LLM Applications
https://docs.ragas.io
Apache License 2.0

Cannot generate test set with Ollama #1456

Open CAICCU opened 4 days ago

CAICCU commented 4 days ago
from ragas.llms.prompt import Prompt
from ragas import evaluate
from ragas.metrics import (
    answer_relevancy,
    faithfulness,
    context_recall,
    context_precision,
)
from datasets import Dataset
import requests
import os
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.document_loaders import DirectoryLoader
from ragas.testset import TestsetGenerator

# load models by ollama
ollama_llm = ChatOllama(model="qwen2:7b")
ollama_embeddings = OllamaEmbeddings(model="bge-m3")

loader = DirectoryLoader("/ragas/src/data")
documents = loader.load()
for document in documents:
    document.metadata['filename'] = document.metadata['source']

# generator
generator = TestsetGenerator.from_langchain(
    ollama_llm
)

# generate testset
testset = generator.generate_with_langchain_docs(documents, test_size=10)

testset_df = testset.to_pandas()
testset_df.head()

and then I get this error:

openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable

in testset = generator.generate_with_langchain_docs(documents, test_size=10)

How can I solve it?

dosubot[bot] commented 4 days ago

Hey there, @CAICCU! 👋 I'm here to help you with any bugs, questions, or contributions you need assistance with. Let's tackle this challenge together!

To resolve the OpenAIError related to the missing API key, you need to set the OPENAI_API_KEY environment variable. You can do this by adding the following code before using the TestsetGenerator.from_langchain method:

import os

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY_HERE"

Replace "YOUR_OPENAI_API_KEY_HERE" with your actual OpenAI API key. This should resolve the error you're encountering.

To continue talking to Dosu, mention @dosu.

CAICCU commented 4 days ago

I want to use a local model through Ollama, not OpenAI.

dosubot[bot] commented 4 days ago

To use a local model with the TestsetGenerator in the Ragas framework, you should ensure that the model is properly wrapped and does not require an OpenAI API key. The TestsetGenerator can interact with local models through the LangchainLLMWrapper. Here's a general approach:

  1. Wrap Your Local Model: Ensure your local model is wrapped using LangchainLLMWrapper.

  2. Initialize the TestsetGenerator: Use the from_langchain method with your wrapped local model.

  3. Remove OpenAI Dependencies: Make sure your code does not include any OpenAI-specific configurations or API key settings.

Here's a simplified example:

from ragas.testset import TestsetGenerator
from langchain_community.chat_models import ChatOllama

# Initialize your local model
ollama_llm = ChatOllama(model="your_local_model")

# Initialize the TestsetGenerator with the local model
generator = TestsetGenerator.from_langchain(ollama_llm)

# Proceed with generating the test set
testset = generator.generate_with_langchain_docs(documents, test_size=10)

Ensure that your local model is compatible with the framework and does not rely on OpenAI's API.
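
For completeness, here is a fuller sketch that wraps both the LLM and the embeddings explicitly, so that neither default factory is triggered. This assumes a ragas 0.2-style TestsetGenerator constructor that accepts llm and embedding_model directly; the API differs between 0.1.x and 0.2.x, so check it against your installed release:

from ragas.testset import TestsetGenerator
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings

# wrap the local Ollama models so ragas does not fall back to its OpenAI defaults
generator_llm = LangchainLLMWrapper(ChatOllama(model="qwen2:7b"))
generator_embeddings = LangchainEmbeddingsWrapper(OllamaEmbeddings(model="bge-m3"))

# assumption: a ragas 0.2-style constructor; 0.1.x instead expects
# generator/critic LLMs plus embeddings via from_langchain
generator = TestsetGenerator(llm=generator_llm, embedding_model=generator_embeddings)

# the size parameter is named test_size in some releases and testset_size in others
testset = generator.generate_with_langchain_docs(documents, testset_size=10)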

To continue talking to Dosu, mention @dosu.

almariscal commented 1 day ago

@dosu it still fails with the same error asking for the API key

almariscal commented 1 day ago

@CAICCU I have been testing this, and I think you need the wrapper but also need to change the llm_factory and embedding_factory functions. I commented out the original code and changed it as follows, and it at least seems to start running.

Even with the Ollama LLM wrapped inside LangchainLLMWrapper, the code still calls llm_factory and embedding_factory, which try to reach OpenAI.

File ragas/llms/base.py

""" def llm_factory(
    model: str = "gpt-4o-mini",
    run_config: t.Optional[RunConfig] = None,
    default_headers: t.Optional[t.Dict[str, str]] = None,
    base_url: t.Optional[str] = None,
) -> BaseRagasLLM:
    timeout = None
    if run_config is not None:
        timeout = run_config.timeout

    # if helicone is enabled, use the helicone
    if helicone_config.is_enabled:
        default_headers = helicone_config.default_headers()
        base_url = helicone_config.base_url

    openai_model = ChatOpenAI(
        model=model, timeout=timeout, default_headers=default_headers, base_url=base_url
    )
    return LangchainLLMWrapper(openai_model, run_config)
 """

def llm_factory(
    model: str = "llama3",
    run_config: t.Optional[RunConfig] = None,
    default_headers: t.Optional[t.Dict[str, str]] = None,
    base_url: t.Optional[str] = None,
) -> BaseRagasLLM:
    timeout = 1500
    if run_config is not None:
        timeout = run_config.timeout

    # if helicone is enabled, use the helicone
    if helicone_config.is_enabled:
        default_headers = helicone_config.default_headers()
        base_url = helicone_config.base_url

    # reuse the OpenAI client, but point it at Ollama's OpenAI-compatible
    # endpoint; Ollama ignores the api_key value, it just has to be non-empty
    openai_model = ChatOpenAI(
        model=model, timeout=timeout, default_headers=default_headers,
        base_url="http://localhost:11434/v1", api_key="ollama"
    )
    return LangchainLLMWrapper(openai_model, run_config)
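
As a quick sanity check (assuming Ollama is serving on its default port 11434), the same ChatOpenAI configuration can be exercised directly before touching ragas; if this round trip fails, the problem is the endpoint rather than the library:

from langchain_openai import ChatOpenAI

# Ollama exposes an OpenAI-compatible API; the api_key only needs to be non-empty
llm = ChatOpenAI(model="llama3", base_url="http://localhost:11434/v1", api_key="ollama")
print(llm.invoke("Reply with the single word: pong").content)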

File ragas/embeddings/base.py

""" def embedding_factory(
    model: str = "text-embedding-ada-002", run_config: t.Optional[RunConfig] = None
) -> BaseRagasEmbeddings:
    openai_embeddings = OpenAIEmbeddings(model=model)
    if run_config is not None:
        openai_embeddings.request_timeout = run_config.timeout
    else:
        run_config = RunConfig()
    return LangchainEmbeddingsWrapper(openai_embeddings, run_config=run_config) """

from langchain_huggingface.embeddings import HuggingFaceEmbeddings

def embedding_factory(
    model: str = "BAAI/bge-m3", run_config: t.Optional[RunConfig] = None
) -> BaseRagasEmbeddings:
    # swap the OpenAI embeddings for a local HuggingFace model
    hf_embeddings = HuggingFaceEmbeddings(model_name=model)
    if run_config is None:
        run_config = RunConfig()
    # note: the original request_timeout assignment is dropped here; it is an
    # OpenAI client option and HuggingFaceEmbeddings runs locally without it
    return LangchainEmbeddingsWrapper(hf_embeddings, run_config=run_config)
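
If you would rather not edit the installed package, a monkey-patch at the top of your own script can achieve the same effect. This is only a sketch, and it assumes the factories are looked up through their modules at call time; any code that imported llm_factory directly (from ragas.llms.base import llm_factory) still holds the original reference, so the patch must run before those imports happen:

from langchain_openai import ChatOpenAI
from langchain_huggingface.embeddings import HuggingFaceEmbeddings

import ragas.llms.base as ragas_llm_base
import ragas.embeddings.base as ragas_embeddings_base
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas.run_config import RunConfig

def ollama_llm_factory(model="llama3", run_config=None, default_headers=None, base_url=None):
    # route the OpenAI client to Ollama's OpenAI-compatible endpoint
    chat = ChatOpenAI(model=model, base_url="http://localhost:11434/v1", api_key="ollama")
    return LangchainLLMWrapper(chat, run_config)

def local_embedding_factory(model="BAAI/bge-m3", run_config=None):
    # serve embeddings from a local HuggingFace model instead of OpenAI
    if run_config is None:
        run_config = RunConfig()
    return LangchainEmbeddingsWrapper(HuggingFaceEmbeddings(model_name=model), run_config=run_config)

# replace the module-level factories before any ragas component uses them
ragas_llm_base.llm_factory = ollama_llm_factory
ragas_embeddings_base.embedding_factory = local_embedding_factory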