CAICCU opened this issue 1 month ago
Hey there, @CAICCU! 👋 I'm here to help you with any bugs, questions, or contributions you need assistance with. Let's tackle this challenge together!
To resolve the `OpenAIError` related to the missing API key, you need to set the `OPENAI_API_KEY` environment variable. You can do this by adding the following code before using the `TestsetGenerator.from_langchain` method:
```python
import os

os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY_HERE"
```
Replace "YOUR_OPENAI_API_KEY_HERE"
with your actual OpenAI API key. This should resolve the error you're encountering [1].
I want to use a local model through Ollama, not OpenAI.
To use a local model with the `TestsetGenerator` in the Ragas framework, you should ensure that the model is properly wrapped and does not require an OpenAI API key. The `TestsetGenerator` can interact with local models through the `LangchainLLMWrapper`. Here's a general approach:
1. **Wrap your local model:** Ensure your local model is wrapped using `LangchainLLMWrapper`.
2. **Initialize the `TestsetGenerator`:** Use the `from_langchain` method with your wrapped local model.
3. **Remove OpenAI dependencies:** Make sure your code does not include any OpenAI-specific configurations or API key settings.
Here's a simplified example:
```python
from ragas.testset import TestsetGenerator
from langchain_community.chat_models import ChatOllama

# Initialize your local model
ollama_llm = ChatOllama(model="your_local_model")

# Initialize the TestsetGenerator with the local model
generator = TestsetGenerator.from_langchain(ollama_llm)

# Proceed with generating the test set
testset = generator.generate_with_langchain_docs(documents, test_size=10)
```
Ensure that your local model is compatible with the framework and does not rely on OpenAI's API.
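A fuller sketch of the same idea, with a local embedding model as well so that nothing falls back to OpenAI defaults (the model names are illustrative, and it assumes a ragas 0.2.x release whose `from_langchain` also accepts an embedding model and whose generate method uses the `testset_size` argument name):

```python
from ragas.testset import TestsetGenerator
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings

# Local chat and embedding models served by Ollama (names are illustrative)
ollama_llm = ChatOllama(model="llama3")
ollama_embeddings = OllamaEmbeddings(model="nomic-embed-text")

# from_langchain wraps both LangChain objects internally
generator = TestsetGenerator.from_langchain(ollama_llm, ollama_embeddings)
testset = generator.generate_with_langchain_docs(documents, testset_size=10)
```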
@dosu it still errors out asking for the necessary API key
@CAICCU I have been testing this and I think you need the wrapper, but you also have to change the `llm_factory` and `embedding_factory` functions. Even after wrapping the Ollama LLM inside `LangchainLLMWrapper`, the code still calls `llm_factory` and `embedding_factory` internally and tries to reach OpenAI. I commented out the original code and changed it as follows, and it at least seems to start running.
In `ragas/llms/base.py`:
""" def llm_factory(
model: str = "gpt-4o-mini",
run_config: t.Optional[RunConfig] = None,
default_headers: t.Optional[t.Dict[str, str]] = None,
base_url: t.Optional[str] = None,
) -> BaseRagasLLM:
timeout = None
if run_config is not None:
timeout = run_config.timeout
# if helicone is enabled, use the helicone
if helicone_config.is_enabled:
default_headers = helicone_config.default_headers()
base_url = helicone_config.base_url
openai_model = ChatOpenAI(
model=model, timeout=timeout, default_headers=default_headers, base_url=base_url
)
return LangchainLLMWrapper(openai_model, run_config)
"""
```python
def llm_factory(
    model: str = "llama3",
    run_config: t.Optional[RunConfig] = None,
    default_headers: t.Optional[t.Dict[str, str]] = None,
    base_url: t.Optional[str] = None,
) -> BaseRagasLLM:
    timeout = 1500  # generous default timeout for slow local inference
    if run_config is not None:
        timeout = run_config.timeout

    # if helicone is enabled, use the helicone
    if helicone_config.is_enabled:
        default_headers = helicone_config.default_headers()
        base_url = helicone_config.base_url

    # Point the OpenAI client at Ollama's OpenAI-compatible endpoint
    openai_model = ChatOpenAI(
        model=model,
        timeout=timeout,
        default_headers=default_headers,
        base_url="http://localhost:11434/v1",
        api_key="ollama",
    )
    return LangchainLLMWrapper(openai_model, run_config)
```
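This works because Ollama serves an OpenAI-compatible API under `/v1`; the server ignores the API key, but the client requires a non-empty value, hence `api_key="ollama"`.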
In `ragas/embeddings/base.py`:
""" def embedding_factory(
model: str = "text-embedding-ada-002", run_config: t.Optional[RunConfig] = None
) -> BaseRagasEmbeddings:
openai_embeddings = OpenAIEmbeddings(model=model)
if run_config is not None:
openai_embeddings.request_timeout = run_config.timeout
else:
run_config = RunConfig()
return LangchainEmbeddingsWrapper(openai_embeddings, run_config=run_config) """
```python
from langchain_huggingface.embeddings import HuggingFaceEmbeddings

def embedding_factory(
    model: str = "BAAI/bge-m3", run_config: t.Optional[RunConfig] = None
) -> BaseRagasEmbeddings:
    # HuggingFaceEmbeddings runs locally and has no request_timeout field,
    # so we only make sure a RunConfig exists
    hf_embeddings = HuggingFaceEmbeddings(model_name=model)
    if run_config is None:
        run_config = RunConfig()
    return LangchainEmbeddingsWrapper(hf_embeddings, run_config=run_config)
```
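A lighter-weight alternative to editing the installed package might be the environment variables the openai v1 client reads; a minimal sketch, assuming `ChatOpenAI` falls back to the client defaults when no `base_url` is passed (note the embedding factory would still request `text-embedding-ada-002`, which Ollama does not serve, so the embedding change above is still needed):

```python
import os

# Point the unmodified OpenAI client at Ollama's OpenAI-compatible server
os.environ["OPENAI_API_KEY"] = "ollama"  # Ollama ignores the key, but it must be non-empty
os.environ["OPENAI_BASE_URL"] = "http://localhost:11434/v1"
```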
I have a similar issue as well with `AzureChatOpenAI` from langchain, even though I followed what is said here: https://github.com/explodinggradients/ragas/issues/1456#issuecomment-2402812688. I am still getting: `openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable`. Is there a fix?
I am getting this error too while running the code below (`ragas==0.2.0`):
```python
generator_llm = ChatOllama(model="llama3")
generator = TestsetGenerator(llm=generator_llm)
dataset = generator.generate_with_langchain_docs(chunked_documents, testset_size=10)
```
Here are the logs:
```
OpenAIError                               Traceback (most recent call last)
Cell In[11], line 2
      1 generator = TestsetGenerator(llm=generator_llm)
----> 2 dataset = generator.generate_with_langchain_docs(chunked_documents, testset_size=10)
...
File ~/RAG_VENV/lib/python3.10/site-packages/ragas/testset/transforms/__init__.py:43, in default_transforms()
     40 from ragas.testset.graph import NodeType
     42 # define the transforms
---> 43 summary_extractor = SummaryExtractor()
     44 keyphrase_extractor = KeyphrasesExtractor()
     45 title_extractor = TitleExtractor()
...
OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
```
Somehow, the `SummaryExtractor` is not using the provided Ollama LLM and is instead trying to connect to OpenAI.
hey @DarshanAdiga could you update to the latest version? What is happening is that the transforms are using a different model in your case. They should default to `generator_llm`, but I'm guessing there was a bug with v0.2.0. Could you update and try again?
@MariosKef The error is the same for you too. Check https://docs.ragas.io/en/stable/references/generate/#ragas.testset.synthesizers.generate.TestsetGenerator.generate_with_langchain_docs. That function has a `transforms_llm` argument which should help.
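For example, a sketch along those lines, passing the local model to both the generator and the transforms (model names are illustrative; the `embedding_model` and `transforms_embedding_model` arguments are per the linked reference):

```python
from ragas.testset import TestsetGenerator
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings

# Wrap the local models so every component receives a ragas-compatible object
llm = LangchainLLMWrapper(ChatOllama(model="llama3"))
emb = LangchainEmbeddingsWrapper(OllamaEmbeddings(model="nomic-embed-text"))

generator = TestsetGenerator(llm=llm, embedding_model=emb)
dataset = generator.generate_with_langchain_docs(
    chunked_documents,
    testset_size=10,
    transforms_llm=llm,              # LLM used by the default transforms
    transforms_embedding_model=emb,  # embeddings used by the default transforms
)
```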
To use a local model with the `TestsetGenerator` in the Ragas framework without requiring an API key, ensure that your local model is correctly set up and does not depend on any external API services. If you're using the `from_langchain` method, make sure your local model is wrapped appropriately and does not include any OpenAI-specific configurations.

If the error persists, consider using the `from_llama_index` method if your local model is compatible with LlamaIndex, as this method is designed to work with local models without requiring an API key. Here's a basic example:
```python
from ragas.testset import TestsetGenerator
from your_local_model_library import YourLocalModel, YourLocalEmbedding

# Initialize your local model and embedding
local_llm = YourLocalModel()
local_embedding = YourLocalEmbedding()

# Initialize the TestsetGenerator with the local model
generator = TestsetGenerator.from_llama_index(
    llm=local_llm,
    embedding_model=local_embedding,
)

# Generate the test set
testset = generator.generate_with_llamaindex_docs(documents, test_size=10)
```
Ensure that your local model and embedding are compatible with the framework and do not require any API keys [1].
I tried this, and then I get the same error at `testset = generator.generate_with_langchain_docs(documents, test_size=10)`. How can I solve it?