run-llama / llama_index

LlamaIndex is a data framework for your LLM applications
https://docs.llamaindex.ai
MIT License

[Question]: nlsql table query engine keep the conversation going ( chat history) #15174

Open marouahamdi opened 2 months ago

marouahamdi commented 2 months ago

Question Validation

Question

hello, I'm working on a text-to-SQL solution with LlamaIndex, llama.cpp, and the DuckDB-NSQL LLM. The issue is that I want to keep the conversation going, and I can't use a chat engine with NLSQLTableQueryEngine or NLSQLRetriever. Please, any help?

dosubot[bot] commented 2 months ago

Hey @marouahamdi! I'm a bot here to help you with bugs, questions, and becoming a contributor, and to assist you while you wait for a human maintainer.

To maintain the conversation history while using a chat engine alongside NLSQLTableQueryEngine and NLSQLRetriever in your text-to-SQL solution (LlamaIndex, llama.cpp, and the DuckDB-NSQL LLM), you can use the ContextChatEngine with a memory module like ChatMemoryBuffer. Here is an example of how to set it up:

from llama_index.legacy.memory import ChatMemoryBuffer
from llama_index.legacy.chat_engine.context import ContextChatEngine

# Initialize memory buffer with a token limit
memory = ChatMemoryBuffer.from_defaults(token_limit=3900)

# Create the chat engine with memory. Note that BaseRetriever and LLM are
# abstract base classes and cannot be instantiated directly; pass your
# concrete retriever and LLM instances here.
chat_engine = ContextChatEngine.from_defaults(
    retriever=your_retriever,  # e.g. an NLSQLRetriever or index retriever
    llm=your_llm,  # e.g. your LlamaCPP instance
    memory=memory,
    system_prompt=(
        "You are a chatbot, able to have normal interactions, as well as talk"
        " about the Kendrick and Drake beef. "
        "Here are the relevant documents for the context:\n"
        "{context_str}"
        "\nInstruction: Use the previous chat history, or the context above, to interact and help the user."
    ),
)

# Example queries
response = chat_engine.chat("Tell me about the songs Drake released in the beef.")
print(str(response))

response = chat_engine.chat("What about Kendrick?")
print(str(response))

This setup ensures that the chat engine maintains the conversation history through the ChatMemoryBuffer, allowing it to provide contextually relevant responses based on the previous interactions. The ContextChatEngine uses a retriever to fetch context and integrates it into the system prompt, ensuring a fluid chat experience.
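
Under the hood, ChatMemoryBuffer is a simple put/get store, so you can check directly that turns are being retained; a minimal sketch, assuming the same legacy namespace as above:

from llama_index.legacy.llms import ChatMessage, MessageRole

# Manually record a turn, then read back whatever fits in the token limit
memory.put(ChatMessage(role=MessageRole.USER, content="Tell me about the beef."))
print(memory.get())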

Additionally, the BaseIndex class in LlamaIndex provides methods to convert an index into different engines, including chat engines. You can use the as_chat_engine method with ChatMode.CONTEXT to create a ContextChatEngine:

from llama_index.legacy import VectorStoreIndex
from llama_index.legacy.chat_engine.types import ChatMode

# Assuming you have an index instance (BaseIndex is abstract, so use a
# concrete subclass such as VectorStoreIndex)
index = VectorStoreIndex.from_documents(documents)

# Create a ContextChatEngine from the index
chat_engine = index.as_chat_engine(chat_mode=ChatMode.CONTEXT, memory=memory)

# Example queries
response = chat_engine.chat("Tell me about the songs Drake released in the beef.")
print(str(response))

response = chat_engine.chat("What about Kendrick?")
print(str(response))

This approach leverages the built-in capabilities of LlamaIndex to create a chat engine that maintains conversation history and provides contextually relevant responses [1][2][3][4][5].

To continue talking to Dosu, mention @dosu.

marouahamdi commented 2 months ago

I have something like this in my code. It's a text-to-SQL app:
sql_database = SQLDatabase(engine, include_tables=[chosen_table])

# Initialize LlamaCPP with a local model path
llm = LlamaCPP(
    model_url=None,
    model_path="C:\\Users\\MarouaHAMDI\\models\\DuckDB-NSQL-7B-v0.1-q8_0.gguf",
    temperature=0,
    max_new_tokens=256,
    context_window=3900,
    #messages_to_prompt=messages_to_prompt,
    completion_to_prompt=completion_to_prompt,
    verbose=False,
)

# Create a ServiceContext
service_context = ServiceContext.from_defaults(llm=llm, embed_model='local:BAAI/bge-small-en-v1.5')

# Set the global service context
set_global_service_context(service_context)

# Initialize the query engine with the chosen table
query_engine = NLSQLTableQueryEngine(sql_database=sql_database, tables=[chosen_table], llm=llm, service_context=None)

context = (
    "use one table of those 'table1' or 'table2' "
)

# Note: the parameter is context_str_prefix (a context_str kwarg is silently ignored)
nl_sql_retriever = NLSQLRetriever(
    sql_database, tables=[chosen_table], return_raw=True, context_str_prefix=context
)

NLSQLRetriever and NLSQLTableQueryEngine don't have methods to use a chat engine. I need to keep the conversation with the LLM going so that, for example, if the model generates a bad SQL answer, the user can give it extra context and the LLM can correct its answer (see the sketch below). Thank you.
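
To illustrate, the loop I'm after is roughly this (pseudonames only; generate_sql is a hypothetical text-to-SQL call, not a LlamaIndex API):

history = []
while True:
    # Generate SQL from the question plus any accumulated corrections
    sql = generate_sql(question, history)  # hypothetical helper
    feedback = input(f"Generated SQL: {sql}\nCorrection (empty to accept): ")
    if not feedback:
        break
    # Feed the user's correction back in so the next attempt can use it
    history.append(f"User correction: {feedback}")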

dosubot[bot] commented 2 months ago

To maintain a conversation with the LLM in your text-to-SQL application, allowing the user to provide additional context and correct the SQL query, you can integrate a chat engine that retains conversation history. Here’s how you can achieve this using the CondensePlusContextChatEngine from LlamaIndex:

  1. Initialize the Chat Engine: Use CondensePlusContextChatEngine to maintain conversation history and condense it into a standalone question.

  2. Refine the Prompt: Add additional context to the text-to-SQL prompt to improve the accuracy of the generated SQL query.

  3. Validate the SQL Query: Implement a SQL validator to check the generated SQL query before execution.

Here’s an example of how you can set this up:

from llama_index.core import ServiceContext, SQLDatabase, set_global_service_context
from llama_index.core.chat_engine import CondensePlusContextChatEngine
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.retrievers import NLSQLRetriever
from llama_index.llms.llama_cpp import LlamaCPP
from llama_index.llms.llama_cpp.llama_utils import completion_to_prompt

# Initialize SQLDatabase
sql_database = SQLDatabase(engine, include_tables=[chosen_table])

# Initialize LlamaCPP with a local model path
llm = LlamaCPP(
    model_url=None,
    model_path="C:\\Users\\MarouaHAMDI\\models\\DuckDB-NSQL-7B-v0.1-q8_0.gguf",
    temperature=0,
    max_new_tokens=256,
    context_window=3900,
    completion_to_prompt=completion_to_prompt,
    verbose=False,
)

# Create a ServiceContext
service_context = ServiceContext.from_defaults(llm=llm, embed_model='local:BAAI/bge-small-en-v1.5')

# Set the global service context
set_global_service_context(service_context)

# Initialize memory buffer with a token limit
memory = ChatMemoryBuffer.from_defaults(token_limit=3900)

# Initialize the chat engine with necessary components
chat_engine = CondensePlusContextChatEngine.from_defaults(
    retriever=nl_sql_retriever,  # Replace with your actual retriever
    llm=llm,
    chat_history=[],
    memory=memory,
    system_prompt="Your system prompt here",
    context_prompt="Your context prompt here",
    condense_prompt="Your condense prompt here"
)

# Function to handle user input and generate refined SQL queries.
# Note: _condense_question and _retrieve_context are private engine
# internals and may change between versions; validate_sql_query and
# execute_sql_query are helpers you implement yourself.
def handle_user_input(user_input, chat_engine):
    # Condense the conversation history and user input into a standalone question
    condensed_question = chat_engine._condense_question(chat_engine.chat_history, user_input)

    # Retrieve context based on the condensed question
    context_str, _ = chat_engine._retrieve_context(condensed_question)

    # Refine the prompt with the retrieved context
    refined_prompt = f"{condensed_question}\n\nAdditional context: {context_str}"

    # Generate the SQL with the LLM directly; LLM objects expose complete(),
    # not a generate_sql() method
    generated_sql_query = llm.complete(refined_prompt).text
    if validate_sql_query(generated_sql_query):
        execute_sql_query(generated_sql_query)
    else:
        print("Invalid SQL query. Please provide additional context or refine the prompt.")

# Example usage
user_input = "Find the highest value record."
handle_user_input(user_input, chat_engine)

This setup ensures that the chat engine maintains the conversation history through the ChatMemoryBuffer, allowing it to provide contextually relevant responses based on previous interactions. The CondensePlusContextChatEngine condenses the conversation history and user input into a standalone question, retrieves the necessary context, and refines the prompt for the LLM to generate accurate SQL queries [1][2].
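
As a quick sanity check that the memory actually carries across turns, you can issue a follow-up that only makes sense with history (hypothetical questions for your tables):

# The second question relies on the engine remembering the first
response1 = chat_engine.chat("Show the ten highest values in the chosen table.")
print(str(response1))

response2 = chat_engine.chat("Now keep only the rows where the value is above 100.")
print(str(response2))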

To continue talking to Dosu, mention @dosu.

marouahamdi commented 2 months ago

hey @dosu, still not working, this combination of NLSQLTableQueryEngine and CondensePlusContextChatEngine. I still can't get the conversation going. This is a part of my code:

engine = create_engine(connection_string)
metadata_obj = MetaData()

# Reflect the existing database tables
metadata_obj.reflect(bind=engine)

# List of available tables
tables = ["A", "B"]

# Initialize chat history
if 'chat_history' not in st.session_state:
    st.session_state.chat_history = []

# Streamlit interface
st.title("TEXT_TO_SQL")

chosen_table = st.selectbox("Choisissez une table", tables)

# Creating SQLDatabase object with the chosen table
sql_database = SQLDatabase(engine, include_tables=[chosen_table])

# Initialize LlamaCPP with a local model path
llm = LlamaCPP(
    model_url=None,
    model_path="DuckDB-NSQL-7B-v0.1-q8_0.gguf",
    temperature=0,
    max_new_tokens=256,
    context_window=3900,
    completion_to_prompt=completion_to_prompt,
    verbose=False,
)

context = (
    "use one table of those 'A' or 'B' "
)

# Create a ServiceContext
service_context = ServiceContext.from_defaults(llm=llm, embed_model='local:BAAI/bge-small-en-v1.5')

# Set the global service context
set_global_service_context(service_context)

# Initialize ChatMemoryBuffer
memory = ChatMemoryBuffer.from_defaults(token_limit=3900)

# Initialize the chat engine with necessary components
chat_engine = CondensePlusContextChatEngine.from_defaults(
    retriever=NLSQLRetriever(sql_database, tables=[chosen_table], return_raw=True, context_str_prefix=context),
    llm=llm,
    chat_history=st.session_state.chat_history,
    memory=memory,
    system_prompt="You are an intelligent assistant that helps users generate SQL queries.",
    context_prompt="Based on the following context, generate a SQL query:",
    condense_prompt="Condense the following conversation into a standalone question:"
)

# Function to validate the SQL query
def validate_sql_query(sql_query):
    try:
        # Attempt to run the SQL query
        engine.execute(sql_query)
        return True
    except Exception as e:
        st.write(f"SQL validation error: {e}")
        return False

# Function to execute the SQL query
def execute_sql_query(sql_query):
    try:
        result = engine.execute(sql_query)
        return result.fetchall()
    except Exception as e:
        st.write(f"SQL execution error: {e}")
        return None

# Function to handle user input and generate refined SQL queries
def handle_user_input(user_input, chat_engine):
    # Condense the conversation history and user input into a standalone question
    condensed_question = chat_engine._condense_question(chat_engine.chat_history, user_input)

    # Retrieve context based on the condensed question
    context_str, _ = chat_engine._retrieve_context(condensed_question)

    # Refine the prompt with the retrieved context
    refined_prompt = f"{condensed_question}\n\nAdditional context: {context_str}"

    # Generate and validate the SQL query
    generated_sql_query = llm.complete(refined_prompt).text
    if validate_sql_query(generated_sql_query):
        results = execute_sql_query(generated_sql_query)
        if results:
            st.write("Résultats :")
            for row in results:
                st.write(row)
    else:
        st.write("Invalid SQL query. Please provide additional context or refine the prompt.")

user_query = st.text_input("Veuillez entrer votre question (ou 'change' pour changer de table, 'exit' pour quitter)")

if st.button("Soumettre"):
    if user_query.lower() == 'exit':
        st.write("Au revoir!")
        st.session_state.chat_history.append("User: exit")
        st.session_state.chat_history.append("System: Au revoir!")
    elif user_query.lower() == 'change':
        st.session_state.chat_history.append("User: change")
        st.experimental_rerun()
    else:
        st.session_state.chat_history.append(f"User: {user_query}")
        handle_user_input(user_query, chat_engine)

# Display chat history
st.subheader("Chat History")
for entry in st.session_state.chat_history:
    st.write(entry)

any help plz!

dosubot[bot] commented 2 months ago

To ensure that the CondensePlusContextChatEngine works seamlessly with your NLSQLTableQueryEngine and maintains the conversation history, you need to ensure that the chat engine is correctly integrated and that the conversation history is properly managed. Here’s a refined approach to your setup:

  1. Ensure Proper Initialization: Make sure all components are correctly initialized and integrated.
  2. Handle User Input and Conversation History: Ensure that the conversation history is updated and used correctly.

Here’s an updated version of your code with these considerations:

import streamlit as st
from sqlalchemy import create_engine, MetaData, text
from llama_index.core import ServiceContext, SQLDatabase, set_global_service_context
from llama_index.core.chat_engine import CondensePlusContextChatEngine
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.retrievers import NLSQLRetriever
from llama_index.llms.llama_cpp import LlamaCPP
from llama_index.llms.llama_cpp.llama_utils import completion_to_prompt

# Initialize SQLAlchemy engine
engine = create_engine("your_connection_string")
metadata_obj = MetaData()
metadata_obj.reflect(bind=engine)

# List of available tables
tables = ["A", "B"]

# Initialize chat history
if 'chat_history' not in st.session_state:
    st.session_state.chat_history = []

# Streamlit interface
st.title("TEXT_TO_SQL")

chosen_table = st.selectbox("Choisissez une table", tables)

# Creating SQLDatabase object with the chosen table
sql_database = SQLDatabase(engine, include_tables=[chosen_table])

# Initialize LlamaCPP with a local model path (the generic LLM class is
# abstract and cannot be instantiated directly)
llm = LlamaCPP(
    model_url=None,
    model_path="DuckDB-NSQL-7B-v0.1-q8_0.gguf",
    temperature=0,
    max_new_tokens=256,
    context_window=3900,
    completion_to_prompt=completion_to_prompt,
    verbose=False,
)

context = "use one table of those 'A' or 'B'"

# Create a ServiceContext
service_context = ServiceContext.from_defaults(llm=llm, embed_model='local:BAAI/bge-small-en-v1.5')

# Set the global service context
set_global_service_context(service_context)

# Initialize ChatMemoryBuffer
memory = ChatMemoryBuffer.from_defaults(token_limit=3900)

# Initialize the chat engine with necessary components
chat_engine = CondensePlusContextChatEngine.from_defaults(
    retriever=NLSQLRetriever(sql_database, tables=[chosen_table], return_raw=True, context_str_prefix=context),
    llm=llm,
    chat_history=st.session_state.chat_history,
    memory=memory,
    system_prompt="You are an intelligent assistant that helps users generate SQL queries.",
    context_prompt="Based on the following context, generate a SQL query:",
    condense_prompt="Condense the following conversation into a standalone question:"
)

# Function to validate the SQL query
def validate_sql_query(sql_query):
    try:
        # Attempt to run the query; Engine.execute was removed in
        # SQLAlchemy 2.0, so use a connection with text()
        with engine.connect() as conn:
            conn.execute(text(sql_query))
        return True
    except Exception as e:
        st.write(f"SQL validation error: {e}")
        return False

# Function to execute the SQL query
def execute_sql_query(sql_query):
    try:
        with engine.connect() as conn:
            result = conn.execute(text(sql_query))
            return result.fetchall()
    except Exception as e:
        st.write(f"SQL execution error: {e}")
        return None

# Function to handle user input and generate refined SQL queries
# (_condense_question and _retrieve_context are private engine internals
# and may change between versions)
def handle_user_input(user_input, chat_engine):
    # Condense the conversation history and user input into a standalone question
    condensed_question = chat_engine._condense_question(chat_engine.chat_history, user_input)

    # Retrieve context based on the condensed question
    context_str, _ = chat_engine._retrieve_context(condensed_question)

    # Refine the prompt with the retrieved context
    refined_prompt = f"{condensed_question}\n\nAdditional context: {context_str}"

    # Generate the SQL with the LLM directly; LLM objects expose complete(),
    # not a generate_sql() method
    generated_sql_query = llm.complete(refined_prompt).text
    if validate_sql_query(generated_sql_query):
        results = execute_sql_query(generated_sql_query)
        if results:
            st.write("Résultats :")
            for row in results:
                st.write(row)
    else:
        st.write("Invalid SQL query. Please provide additional context or refine the prompt.")

user_query = st.text_input("Veuillez entrer votre question (ou 'change' pour changer de table, 'exit' pour quitter)")

if st.button("Soumettre"):
    if user_query.lower() == 'exit':
        st.write("Au revoir!")
        st.session_state.chat_history.append("User: exit")
        st.session_state.chat_history.append("System: Au revoir!")
    elif user_query.lower() == 'change':
        st.session_state.chat_history.append("User: change")
        st.experimental_rerun()
    else:
        st.session_state.chat_history.append(f"User: {user_query}")
        handle_user_input(user_query, chat_engine)

# Display chat history
st.subheader("Chat History")
for entry in st.session_state.chat_history:
    st.write(entry)

This code ensures that the CondensePlusContextChatEngine is properly initialized and integrated with your text-to-SQL application. It maintains the conversation history and uses it to generate contextually relevant SQL queries [1][2].
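
One caveat worth noting: CondensePlusContextChatEngine expects chat_history to be a list of ChatMessage objects, while the Streamlit code above appends plain display strings like "User: ..." to st.session_state.chat_history. A minimal sketch of keeping the two stores separate (the engine_history session key is a hypothetical name):

from llama_index.core.llms import ChatMessage, MessageRole

# Keep a List[ChatMessage] for the engine, separate from the
# display-only strings in st.session_state.chat_history
if 'engine_history' not in st.session_state:
    st.session_state.engine_history = []

st.session_state.chat_history.append(f"User: {user_query}")  # for display
st.session_state.engine_history.append(
    ChatMessage(role=MessageRole.USER, content=user_query)  # for the engine
)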

To continue talking to Dosu, mention @dosu.

marouahamdi commented 2 months ago

hi @dosu, in this code we are not using NLSQLTableQueryEngine, and I think it's an important tool for converting text to SQL. I'm thinking maybe there's no way to save the conversation and ask again, because the LLM and the LlamaIndex tools used here are meant only to convert to SQL, so the LLM can't hold a conversation or use a chat engine. Does that make sense?

dosubot[bot] commented 2 months ago

To integrate NLSQLTableQueryEngine into your text-to-SQL application while maintaining the conversation history, you can use the ChatMemoryBuffer to store and retrieve chat history. This ensures that the conversation context is preserved across interactions with the NLSQLTableQueryEngine.

Here’s how you can achieve this:

  1. Initialize the ChatMemoryBuffer: Create an instance of ChatMemoryBuffer with appropriate parameters such as token_limit and tokenizer_fn.

  2. Store Messages: Use the put method of ChatMemoryBuffer to store each message exchanged during the conversation.

  3. Retrieve Messages: Use the get method to retrieve the chat history when needed, ensuring that the context is maintained.

  4. Integrate with NLSQLTableQueryEngine: Pass the retrieved chat history to the NLSQLTableQueryEngine to provide context for generating SQL queries.

Here is a code example demonstrating this integration:

import streamlit as st
from sqlalchemy import create_engine, MetaData
from llama_index.core import SQLDatabase
from llama_index.core.indices.struct_store import NLSQLTableQueryEngine
from llama_index.core.llms import ChatMessage, MessageRole
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.llms.openai import OpenAI as OpenAiLlm
import tiktoken

# Initialize SQLAlchemy engine
engine = create_engine("your_connection_string")
metadata_obj = MetaData()
metadata_obj.reflect(bind=engine)

# List of available tables
tables = ["A", "B"]

# Initialize chat history
if 'chat_history' not in st.session_state:
    st.session_state.chat_history = []

# Streamlit interface
st.title("TEXT_TO_SQL")

chosen_table = st.selectbox("Choisissez une table", tables)

# Creating SQLDatabase object with the chosen table
sql_database = SQLDatabase(engine, include_tables=[chosen_table])

# Initialize the OpenAI LLM (note: this example swaps in OpenAI for LlamaCPP;
# the constructor takes `model`, not `model_name`)
llm = OpenAiLlm(
    model="gpt-4-0125-preview",
    max_tokens=256
)

# Initialize ChatMemoryBuffer
model = "gpt-4-0125-preview"
tokenizer_fn = tiktoken.encoding_for_model(model).encode
chat_memory_buffer = ChatMemoryBuffer.from_defaults(
    token_limit=3000,
    tokenizer_fn=tokenizer_fn
)

# Initialize the NLSQLTableQueryEngine
query_engine = NLSQLTableQueryEngine(
    sql_database=sql_database,
    tables=[chosen_table],
    llm=llm
)

# Function to handle interaction with NLSQLTableQueryEngine
def handle_query_with_memory(engine, user_input):
    # Store user message
    user_message = ChatMessage(role=MessageRole.USER, content=user_input)
    chat_memory_buffer.put(user_message)

    # Retrieve chat history
    chat_history = chat_memory_buffer.get()

    # query() accepts only a query string (there is no context kwarg), so
    # fold the chat history into the question itself
    context = " ".join([str(msg.content) for msg in chat_history])
    query_with_context = f"Given the prior conversation: {context}\n\nQuestion: {user_input}"
    response = engine.query(query_with_context)

    # Store the assistant's response (convert the Response object to a string)
    assistant_message = ChatMessage(role=MessageRole.ASSISTANT, content=str(response))
    chat_memory_buffer.put(assistant_message)

    return response

# Example usage with NLSQLTableQueryEngine
user_query = st.text_input("Veuillez entrer votre question (ou 'change' pour changer de table, 'exit' pour quitter)")

if st.button("Soumettre"):
    if user_query.lower() == 'exit':
        st.write("Au revoir!")
        st.session_state.chat_history.append("User: exit")
        st.session_state.chat_history.append("System: Au revoir!")
    elif user_query.lower() == 'change':
        st.session_state.chat_history.append("User: change")
        st.experimental_rerun()
    else:
        st.session_state.chat_history.append(f"User: {user_query}")
        response = handle_query_with_memory(query_engine, user_query)
        st.write(str(response))

# Display chat history
st.subheader("Chat History")
for entry in st.session_state.chat_history:
    st.write(entry)

This setup ensures that the NLSQLTableQueryEngine is used to convert text to SQL while maintaining the conversation history through the ChatMemoryBuffer. This allows the LLM to provide contextually relevant responses based on previous interactions [1][2][3][4].
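
Note that ChatMemoryBuffer.get() only returns as many of the most recent messages as fit within token_limit, so older turns silently fall out of the context. A few calls that are useful when debugging this (same chat_memory_buffer as above):

chat_memory_buffer.get()      # recent messages that fit within token_limit
chat_memory_buffer.get_all()  # the full stored history, ignoring the limit
chat_memory_buffer.reset()    # clear the memory, e.g. when switching tables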

To continue talking to Dosu, mention @dosu.