run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: how to use router query engine or retriever with open source llm #15821

Closed JINO-ROHIT closed 2 months ago

JINO-ROHIT commented 2 months ago

Question

When I run the examples for the router query engine or retriever, they ask for an OpenAI LLM. How can I use them with open-source LLMs?

dosubot[bot] commented 2 months ago

I found a similar solved discussion that might help you: Exploring Structured Planning Agent-Need Help [1].

To use the router query engine or retriever with open-source LLMs instead of OpenAI LLMs, you can replace the OpenAI LLM integration with an open-source LLM integration. Here is an example of how you can modify the code to use an open-source LLM, such as one loaded through Hugging Face's transformers library:

from transformers import pipeline

# Initialize the open-source LLM (e.g., using Hugging Face's transformers)
llm = pipeline("text-generation", model="gpt2")

# router_prompt0, choices, and choices_str come from the router tutorial
# (see the stand-in definitions after this snippet)
def get_formatted_prompt(query_str):
    fmt_prompt = router_prompt0.format(
        num_choices=len(choices),
        max_outputs=2,
        context_list=choices_str,
        query_str=query_str,
    )
    return fmt_prompt

query_str = "Can you tell me more about the amount of Vitamin C in apples"
fmt_prompt = get_formatted_prompt(query_str)
response = llm(fmt_prompt, max_length=100)

print(response[0]['generated_text'])
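
The names router_prompt0, choices, and choices_str are not defined in the snippet above; they come from the router-from-scratch tutorial in the docs. A minimal, illustrative stand-in (the exact prompt wording in the docs differs):

from llama_index.core import PromptTemplate

# Illustrative choices; in the tutorial each entry describes one candidate query engine
choices = [
    "Useful for questions related to apples",
    "Useful for questions related to oranges",
]
choices_str = "\n\n".join(f"{i + 1}. {c}" for i, c in enumerate(choices))

# Simplified router prompt using the same placeholders as above
router_prompt0 = PromptTemplate(
    "Some choices are given below, numbered 1 to {num_choices}:\n"
    "{context_list}\n"
    "Using only the choices above, return at most {max_outputs} choice numbers "
    "that are most relevant to the question: '{query_str}'\n"
)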

Additionally, you can leverage LlamaIndex for a more integrated approach. LlamaIndex provides a unified interface for defining LLM modules, whether from OpenAI, Hugging Face, or LangChain, which simplifies the integration process. Here’s an example of how you can use LlamaIndex with an open-source LLM:

from llama_index.llms.huggingface import HuggingFaceLLM
from transformers import AutoTokenizer, AutoModelForCausalLM

# Initialize the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Create the LLM object (max_new_tokens caps the generated output length)
llm = HuggingFaceLLM(model=model, tokenizer=tokenizer, max_new_tokens=100)

# Same helper as above; router_prompt0, choices, and choices_str come from the tutorial
def get_formatted_prompt(query_str):
    fmt_prompt = router_prompt0.format(
        num_choices=len(choices),
        max_outputs=2,
        context_list=choices_str,
        query_str=query_str,
    )
    return fmt_prompt

query_str = "Can you tell me more about the amount of Vitamin C in apples"
fmt_prompt = get_formatted_prompt(query_str)
response = llm.complete(fmt_prompt)

print(response)

This example demonstrates how to use LlamaIndex to integrate an open-source LLM from Hugging Face, providing a more streamlined and flexible approach to using different LLMs in your application [2][3].
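
If you want the rest of the documented examples to pick up the open-source model automatically, you can also set it as the global default. A minimal sketch (model names are only examples, and it assumes the llama-index-llms-huggingface and llama-index-embeddings-huggingface packages are installed):

from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.huggingface import HuggingFaceLLM

# Any locally runnable chat model works here; zephyr-7b-beta is just an example
Settings.llm = HuggingFaceLLM(
    model_name="HuggingFaceH4/zephyr-7b-beta",
    tokenizer_name="HuggingFaceH4/zephyr-7b-beta",
)

# Swap the embedding model too, otherwise index construction still calls OpenAI
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")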

To continue talking to Dosu, mention @dosu.

JINO-ROHIT commented 2 months ago

No, I'm trying to run this code from the docs: https://docs.llamaindex.ai/en/stable/examples/query_engine/RouterQueryEngine/

dosubot[bot] commented 2 months ago

To run the Router Query Engine example from the LlamaIndex documentation with an open-source LLM instead of OpenAI's models, you can modify the code to use a Hugging Face model. Here is how you can do it:

1. Install the necessary packages:

   pip install llama-index llama-index-llms-huggingface transformers

2. Modify the code to use a Hugging Face model:

from transformers import AutoTokenizer, AutoModelForCausalLM
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import PydanticSingleSelector, PydanticMultiSelector
from llama_index.core.tools import QueryEngineTool

# Initialize the Hugging Face model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Create the LLM object
llm = HuggingFaceLLM(model=model, tokenizer=tokenizer)

# Define the query engine tools (list_query_engine, vector_query_engine, and
# keyword_query_engine are the index query engines built earlier in the notebook;
# see the setup sketch below)
list_tool = QueryEngineTool.from_defaults(
    query_engine=list_query_engine,
    description="Useful for summarization questions related to the data source",
)
vector_tool = QueryEngineTool.from_defaults(
    query_engine=vector_query_engine,
    description="Useful for retrieving specific context related to the data source",
)
keyword_tool = QueryEngineTool.from_defaults(
    query_engine=keyword_query_engine,
    description="Useful for keyword-based retrieval related to the data source",
)

# Initialize the RouterQueryEngine with a Pydantic selector
query_engine = RouterQueryEngine(
    selector=PydanticSingleSelector.from_defaults(llm=llm),
    query_engine_tools=[list_tool, vector_tool, keyword_tool],
)

# Example query
response = query_engine.query("Can you tell me more about the amount of Vitamin C in apples?")
print(response)

This code replaces the OpenAI model with a Hugging Face model (gpt2 in this case) and integrates it into the RouterQueryEngine setup. Adjust the model and parameters as needed for your specific use case [1][2].
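
The query engine tools above assume the indexes from the notebook have already been built. A minimal sketch of that setup (the data path is illustrative; it follows the structure of the notebook):

from llama_index.core import (
    Settings,
    SimpleDirectoryReader,
    SimpleKeywordTableIndex,
    SummaryIndex,
    VectorStoreIndex,
)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Use the Hugging Face LLM defined above everywhere, and a local embedding model
# so that index construction does not call OpenAI
Settings.llm = llm
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

documents = SimpleDirectoryReader("data/paul_graham").load_data()

summary_index = SummaryIndex.from_documents(documents)
vector_index = VectorStoreIndex.from_documents(documents)
keyword_index = SimpleKeywordTableIndex.from_documents(documents)

list_query_engine = summary_index.as_query_engine(response_mode="tree_summarize")
vector_query_engine = vector_index.as_query_engine()
keyword_query_engine = keyword_index.as_query_engine()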

To continue talking to Dosu, mention @dosu.

JINO-ROHIT commented 2 months ago

ValueError: model name “” does not support function calling API

dosubot[bot] commented 2 months ago

To resolve the "ValueError: model name “” does not support function calling API" error when using the Router Query Engine with an open-source LLM, ensure that you are specifying a valid model name when instantiating the OpenRouter class. The model name should not be an empty string and should be one of the supported models listed on the OpenRouter website.

Here is an example of how to correctly instantiate the OpenRouter class with a valid model name:

from llama_index.llms.openrouter import OpenRouter

llm = OpenRouter(
    api_key="<your-api-key>",
    max_tokens=256,
    context_window=4096,
    model="gryphe/mythomax-l2-13b",  # Ensure this is a valid model name
)

response = llm.complete("Hello World!")
print(str(response))

Make sure to replace "<your-api-key>" with your actual API key and "gryphe/mythomax-l2-13b" with a valid model name if you are using a different one. The default model is "gryphe/mythomax-l2-13b", and you can find other options at OpenRouter Models [1][2].

Alternatively, if you are using the Neutrino LLM, you can specify the router instead of the model name. Here is an example:

from llama_index.llms.neutrino import Neutrino

llm = Neutrino(
    api_key="<your-neutrino-api-key>",
    router="default"  # or specify your custom router ID
)

response = llm.complete("In short, a Neutrino is")
print(f"Optimal model: {response.raw['model']}")
print(response)

Ensure you have set the NEUTRINO_API_KEY environment variable or passed it directly to the Neutrino class. The router can be the default or a custom one created in the Neutrino dashboard [3].

To continue talking to Dosu, mention @dosu.

JINO-ROHIT commented 2 months ago

ValueError: model name “llama-7b-instruct” does not support function calling API. This is a different error.

dosubot[bot] commented 2 months ago

To resolve the "ValueError: model name 'llama-7b-instruct' does not support function calling API" error when using the Router Query Engine with an open-source LLM, you need a compatible model. The llama-7b-instruct model does not expose a native function calling API, which the Pydantic selector relies on.

You can switch to a model that works with this setup, for example llama2-13b-chat or zephyr-7b-beta.

Here is an example of how you can switch to a compatible model:

from llama_index.llms.huggingface import HuggingFaceLLM

# Use a compatible chat model (give the full Hugging Face Hub id)
model_name = "meta-llama/Llama-2-13b-chat-hf"  # or "HuggingFaceH4/zephyr-7b-beta"
llm = HuggingFaceLLM(model_name=model_name, tokenizer_name=model_name)

# Now you can use the Router Query Engine with the compatible model
resp = llm.complete("Your query here")
print(resp)

Alternatively, you can use the OpenRouter class from the LlamaIndex library, which is designed to work with models listed on OpenRouter:

from llama_index.legacy.llms.openrouter import OpenRouter

# Use a compatible model from OpenRouter
model_name = "gryphe/mythomax-l2-13b"  # Default model or choose another from https://openrouter.ai/models
llm = OpenRouter(model=model_name)

# Now you can use the Router Query Engine with the compatible model
resp = llm.complete("Your query here")
print(resp)

Make sure to install the necessary dependencies for the Hugging Face or OpenRouter integration if you haven't already:

pip install llama-index-llms-huggingface

or

pip install llama-index-legacy

This should resolve the error and allow you to use the Router Query Engine with a compatible model [1].

To continue talking to Dosu, mention @dosu.

JINO-ROHIT commented 2 months ago

Don't all models support function calling?

dosubot[bot] commented 2 months ago

No, not all models support function calling. The FunctionCallingProgram only works with LLMs that natively support function calling, such as certain OpenAI models and some models available through AWS Bedrock Converse. For other LLMs, the LLMTextCompletionProgram should be used to directly prompt the model through text to get back a structured output.

For example, OpenAI models like gpt-3.5-turbo and gpt-4o-mini have native support for function calling. Additionally, models like anthropic.claude-3-haiku-20240307-v1:0 from AWS Bedrock Converse also support native function calling. However, not all models from AWS Bedrock support function calling, so it's important to check the available features of each LLM [1][2][3].
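
As a rough sketch of the text-completion route (it assumes llm is an open-source LLM such as the Hugging Face one from earlier, and the output schema is purely illustrative):

from pydantic import BaseModel
from llama_index.core.program import LLMTextCompletionProgram

# Illustrative output schema
class FruitFacts(BaseModel):
    name: str
    vitamin_c_mg: float

program = LLMTextCompletionProgram.from_defaults(
    output_cls=FruitFacts,
    prompt_template_str="Give the vitamin C content of {fruit} as structured data.",
    llm=llm,  # works with models that lack a native function-calling API
)

facts = program(fruit="apple")
print(facts)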

To continue talking to Dosu, mention @dosu.

logan-markewich commented 2 months ago

@JINO-ROHIT not all models support function calling. It depends on whether the LLM class you are using supports tool calls directly in its API.

In this case, don't use the Pydantic selector; use the LLM selector, which doesn't rely on tool calling and instead directly prompts the LLM:

from llama_index.core.selectors import LLMSingleSelector, LLMMultiSelector

query_engine = RouterQueryEngine(
    selector=LLMSingleSelector.from_defaults(llm=llm),
    query_engine_tools=[
        list_tool,
        vector_tool,
    ],
)
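
Querying then works the same way as in the notebook (usage sketch; query_engine and the tools come from the snippets above):

response = query_engine.query(
    "Can you tell me more about the amount of Vitamin C in apples?"
)
print(str(response))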