run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: Unable to evaluate RAG Application. #15893

Open thetechinsight opened 1 week ago

thetechinsight commented 1 week ago

Question Validation

Question

Hello everyone,

I've developed two RAG applications (Naive RAG and Advanced RAG) using the following technologies:

  • LLM framework: llama-index
  • LLM model: Llama3.1 (8b) via Ollama
  • Embedding model: OllamaEmbedding

I’m encountering the following error when running the evaluation. I am not using the OpenAI API directly in my code; however, it seems that llama-index might be using OpenAI internally.

 > npx promptfoo@latest eval
[░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░] 0% | ETA: 0s | 0/1 |  ""
Error running Python script: openai.AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: sk-Y8GwA*********}
Stack Trace: Error: openai.AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided: sk-Y8GwA******************}
    at PythonShell.parseError (C:\Users\xyz\AppData\Local\npm-cache\_npx\81bbc6515d992ace\node_modules\python-shell\index.js:303:21)
    at terminateIfNeeded (C:\Users\xyz\AppData\Local\npm-cache\_npx\81bbc6515d992ace\node_modules\python-shell\index.js:193:32)
    at ChildProcess.<anonymous> (C:\Users\xyz\AppData\Local\npm-cache\_npx\81bbc6515d992ace\node_modules\python-shell\index.js:185:13)
    at ChildProcess.emit (node:events:519:28)
    at ChildProcess._handle.onexit (node:internal/child_process:294:12)
    --Python Traceback: --
    File "C:\Users\XYZ\AppData\Local\npm-cache\_npx\81bbc6515d992ace\node_modules\promptfoo\dist\src\python\wrapper.py", line 35, in <modu>
      result = call_method(script_path, method_name, *data)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    File "C:\Users\XYZ\AppData\Local\npm-cache\_npx\81bbc6515d992ace\node_modules\promptfoo\dist\src\python\wrapper.py", line 24, in call_d
      return method_to_call(*args)
             ^^^^^^^^^^^^^^^^^^^^^
    File "retrieve-naive-rag.py", line 18, in call_api
      response = new_query_engine.query(prompt)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In the following code, I create a vector database and store it locally:

from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama

# Configure global defaults so every component uses Ollama instead of OpenAI.
# selected_model and docs are defined earlier in the script.
Settings.llm = Ollama(model=selected_model, request_timeout=300.0)
Settings.embed_model = OllamaEmbedding(
    model_name="gemma2:2b",
    base_url="http://localhost:11434",
    ollama_additional_kwargs={"mirostat": 0},
)
Settings.node_parser = SentenceSplitter(chunk_size=512, chunk_overlap=20)
Settings.num_output = 512
Settings.context_window = 3900

# The global Settings above are picked up automatically, so they do not
# need to be passed to from_documents again
index = VectorStoreIndex.from_documents(docs, show_progress=True)
index.set_index_id("naive-rag")
index.storage_context.persist("./naive-rag-db")

To evaluate the RAG applications, I am using Promptfoo. Here is the retriever.py file that I reference in my promptfooconfig.yaml:

from llama_index.core import StorageContext, load_index_from_storage

def call_api(prompt, options, context):
    storage_context = StorageContext.from_defaults(persist_dir="./naive-rag-db")
    simple_vc_index = load_index_from_storage(storage_context, index_id="naive-rag")
    new_query_engine = simple_vc_index.as_query_engine()
    response = new_query_engine.query(prompt)
    result = {
        # Response objects expose their text via str(response) or
        # response.response; they have no .content attribute
        "output": str(response),
    }
    return result

Could you please help me diagnose why this error is occurring and how I might resolve it?

Thank you! Ankur

dosubot[bot] commented 1 week ago

Hello, @thetechinsight! I'm here to help you with any bugs, questions, or contributions you have. Let's work together to solve this issue.

I found some similar issues and discussions that might help you diagnose and resolve the openai.AuthenticationError you are encountering:

  1. Similar Unsolved Discussion:

  2. Closed Issues with Solutions:

  3. General Issue with Defaulting to OpenAI:

    • The LangchainEmbedding class might default to using OpenAI if not properly configured. Ensure that the OllamaEmbedding class is correctly set up to avoid this fallback.
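
A quick way to see this fallback in your own environment is to inspect the global Settings in the same process that runs the query (a minimal sketch; depending on your llama-index version, this either prints OpenAI-backed classes or raises an OpenAI key error when nothing else has been configured):

from llama_index.core import Settings

# If no llm/embed_model has been assigned, accessing these attributes
# lazily resolves the library defaults, which are backed by OpenAI.
# That fallback is the usual source of the 401 when no valid
# OPENAI_API_KEY is available.
print(type(Settings.llm))
print(type(Settings.embed_model))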

To resolve your issue, ensure that the OllamaEmbedding model is correctly configured and that no part of your code or its dependencies is attempting to use OpenAI's services. Here is an example of how to set up the OllamaEmbedding model:

from llama_index.embeddings.ollama import OllamaEmbedding

# Initialize the OllamaEmbedding model
ollama_embedding = OllamaEmbedding(
    model_name="gemma2:2b",
    base_url="http://localhost:11434",  # Ensure this is the correct URL for your Ollama server
    ollama_additional_kwargs={"mirostat": 0},  # Optional, additional kwargs
)

# Example usage: embed a single text via the public API
text = "Sample text to embed"
embedding = ollama_embedding.get_text_embedding(text)
print(embedding)

# Example usage: embed a list of texts in one batch
texts = ["Sample text 1", "Sample text 2"]
embeddings = ollama_embedding.get_text_embedding_batch(texts)
print(embeddings)

Ensure that base_url points at your running Ollama server so that no calls fall back to OpenAI [3][4].
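
To confirm that the Ollama server is actually reachable at that base_url before running the evaluation, a quick connectivity check helps (a small sketch assuming Ollama's default port; a running Ollama server answers its root endpoint with a short status message):

import requests

# A failure here means the embedding and LLM calls cannot reach Ollama.
resp = requests.get("http://localhost:11434")
print(resp.status_code, resp.text)  # expect 200 and "Ollama is running"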

Additionally, double-check your environment variables and configuration to ensure that no OpenAI API key is being used or required by any part of your code or its dependencies.
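
In particular, promptfoo runs your retriever script in a fresh Python process, so the Ollama configuration you applied when building the index is not carried over, and the query engine falls back to the OpenAI defaults. A minimal sketch of the script with Ollama configured explicitly (reusing the persist directory and model names from your question; "llama3.1:8b" is assumed as the Ollama model tag):

from llama_index.core import Settings, StorageContext, load_index_from_storage
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama

def call_api(prompt, options, context):
    # Settings are process-global and default to OpenAI when unset, so
    # configure Ollama here too, not only in the index-building script.
    Settings.llm = Ollama(model="llama3.1:8b", request_timeout=300.0)  # assumed model tag
    Settings.embed_model = OllamaEmbedding(
        model_name="gemma2:2b",
        base_url="http://localhost:11434",
    )

    storage_context = StorageContext.from_defaults(persist_dir="./naive-rag-db")
    index = load_index_from_storage(storage_context, index_id="naive-rag")
    query_engine = index.as_query_engine()
    response = query_engine.query(prompt)
    return {"output": str(response)}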

To continue talking to Dosu, mention @dosu.